xref: /petsc/src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c (revision d7d8fb0a3e1ba7fc5e1ac806492b28126bd63127)
#include <../src/mat/impls/aij/mpi/mpiaij.h>
#undef __FUNCT__
#define __FUNCT__ "MatCreateMPIAIJMKL"
/*@C
   MatCreateMPIAIJMKL - Creates a sparse parallel matrix whose local
   portions are stored as SEQAIJMKL matrices (a matrix class that inherits
   from SEQAIJ but uses some operations provided by Intel MKL).  The same
   guidelines that apply to MPIAIJ matrices for preallocating the matrix
   storage apply here as well.

      Collective on MPI_Comm

   Input Parameters:
+  comm - MPI communicator
.  m - number of local rows (or PETSC_DECIDE to have it calculated if M is given).
           This value should be the same as the local size used in creating the
           y vector for the matrix-vector product y = Ax.
.  n - number of local columns (or PETSC_DECIDE to have it calculated if N is given).
           This value should be the same as the local size used in creating the
           x vector for the matrix-vector product y = Ax. For square matrices
           n is almost always m.
.  M - number of global rows (or PETSC_DETERMINE to have it calculated if m is given)
.  N - number of global columns (or PETSC_DETERMINE to have it calculated if n is given)
.  d_nz  - number of nonzeros per row in DIAGONAL portion of local submatrix
           (same value is used for all local rows)
.  d_nnz - array containing the number of nonzeros in the various rows of the
           DIAGONAL portion of the local submatrix (possibly different for each row)
           or NULL, if d_nz is used to specify the nonzero structure.
           The size of this array is equal to the number of local rows, i.e., 'm'.
           For matrices you plan to factor you must leave room for the diagonal entry and
           put in the entry even if it is zero.
.  o_nz  - number of nonzeros per row in the OFF-DIAGONAL portion of local
           submatrix (same value is used for all local rows).
-  o_nnz - array containing the number of nonzeros in the various rows of the
           OFF-DIAGONAL portion of the local submatrix (possibly different for
           each row) or NULL, if o_nz is used to specify the nonzero
           structure. The size of this array is equal to the number
           of local rows, i.e., 'm'.

   Output Parameter:
.  A - the matrix

   Notes:
   If the *_nnz parameter is given then the corresponding *_nz parameter is ignored.

   The m,n,M,N parameters specify the size of the matrix and its partitioning across
   processors, while the d_nz,d_nnz,o_nz,o_nnz parameters specify the approximate
   storage requirements for this matrix.

   If PETSC_DECIDE or PETSC_DETERMINE is used for a particular argument on one
   processor then it must be used on all processors that share the object for
   that argument.

   The user MUST specify either the local or global matrix dimensions
   (possibly both).

   The parallel matrix is partitioned such that the first m0 rows belong to
   process 0, the next m1 rows belong to process 1, the next m2 rows belong
   to process 2, etc., where m0,m1,m2,... are the values of the input
   parameter 'm' on each process.

   The DIAGONAL portion of the local submatrix of a processor can be defined
   as the submatrix which is obtained by extracting the part corresponding
   to the rows r1-r2 and columns r1-r2 of the global matrix, where r1 is the
   first row that belongs to the processor, and r2 is the last row belonging
   to this processor. This is a square mxm matrix. The remaining portion
   of the local submatrix (mxN) constitutes the OFF-DIAGONAL portion.

   When calling this routine with a single process communicator, a matrix of
   type SEQAIJMKL is returned.  If a matrix of type MPIAIJMKL is desired
   for this type of communicator, use the construction mechanism:
     MatCreate(...,&A); MatSetType(A,MATMPIAIJMKL); MatMPIAIJSetPreallocation(A,...);

   Options Database Keys:
.  -mat_aijmkl_no_spmv2 - disables use of the SpMV2 inspector-executor routines

   Level: intermediate

.keywords: matrix, MKL, sparse, parallel

.seealso: MatCreate(), MatCreateSeqAIJMKL(), MatSetValues()
@*/
PetscErrorCode  MatCreateMPIAIJMKL(MPI_Comm comm,PetscInt m,PetscInt n,PetscInt M,PetscInt N,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const PetscInt o_nnz[],Mat *A)
{
  PetscErrorCode ierr;
  PetscMPIInt    size;

  PetscFunctionBegin;
  ierr = MatCreate(comm,A);CHKERRQ(ierr);
  ierr = MatSetSizes(*A,m,n,M,N);CHKERRQ(ierr);
  ierr = MPI_Comm_size(comm,&size);CHKERRQ(ierr);
  if (size > 1) {
    ierr = MatSetType(*A,MATMPIAIJMKL);CHKERRQ(ierr);
    ierr = MatMPIAIJSetPreallocation(*A,d_nz,d_nnz,o_nz,o_nnz);CHKERRQ(ierr);
  } else {
    ierr = MatSetType(*A,MATSEQAIJMKL);CHKERRQ(ierr);
    ierr = MatSeqAIJSetPreallocation(*A,d_nz,d_nnz);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}
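
/*
   Illustration only, not part of the PETSc API: a minimal sketch of the row
   partitioning described in the notes for MatCreateMPIAIJMKL() (the first m0
   rows on process 0, the next m1 rows on process 1, and so on).  It assumes an
   even split in which the first (M % size) processes receive one extra row;
   this is believed to match what PETSC_DECIDE produces, but should be checked
   against PetscSplitOwnership().  SketchLocalRows() and SketchFirstRow() are
   hypothetical names, and plain int stands in for PetscInt.
*/
```c
#include <assert.h>

/* Number of rows owned by process `rank` when M global rows are spread over
   `size` processes (assumed even split, extra rows go to the lowest ranks). */
int SketchLocalRows(int M, int size, int rank)
{
  assert(M >= 0 && size > 0 && rank >= 0 && rank < size);
  return M / size + ((M % size) > rank ? 1 : 0);
}

/* Global index of the first row owned by process `rank` (r1 in the notes):
   the sum of the local row counts of all lower-numbered processes. */
int SketchFirstRow(int M, int size, int rank)
{
  int r, first = 0;

  for (r = 0; r < rank; r++) first += SketchLocalRows(M, size, r);
  return first;
}
```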

PETSC_INTERN PetscErrorCode MatConvert_SeqAIJ_SeqAIJMKL(Mat,MatType,MatReuse,Mat*);

#undef __FUNCT__
#define __FUNCT__ "MatMPIAIJSetPreallocation_MPIAIJMKL"
PetscErrorCode  MatMPIAIJSetPreallocation_MPIAIJMKL(Mat B,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const PetscInt o_nnz[])
{
  Mat_MPIAIJ     *b = (Mat_MPIAIJ*)B->data;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatMPIAIJSetPreallocation_MPIAIJ(B,d_nz,d_nnz,o_nz,o_nnz);CHKERRQ(ierr);
  ierr = MatConvert_SeqAIJ_SeqAIJMKL(b->A, MATSEQAIJMKL, MAT_INPLACE_MATRIX, &b->A);CHKERRQ(ierr);
  ierr = MatConvert_SeqAIJ_SeqAIJMKL(b->B, MATSEQAIJMKL, MAT_INPLACE_MATRIX, &b->B);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
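
/*
   Illustration only, not part of the PETSc API: a minimal sketch of how a
   caller might compute one entry of the d_nnz[] and o_nnz[] arrays passed to
   the preallocation routine above.  Following the definition in the notes for
   MatCreateMPIAIJMKL(), a nonzero of a locally owned row counts toward the
   DIAGONAL portion when its global column lies in the owned range, and toward
   the OFF-DIAGONAL portion otherwise.  The sketch assumes a square matrix and
   a half-open owned range [rstart,rend).  SketchCountRowNonzeros() is a
   hypothetical name, and plain int stands in for PetscInt.
*/
```c
#include <assert.h>

/* Split the ncols nonzeros of one locally owned row, given by their global
   column indices cols[], into a DIAGONAL count (*d) and an OFF-DIAGONAL
   count (*o) for a process that owns global rows/columns [rstart,rend). */
void SketchCountRowNonzeros(const int cols[], int ncols, int rstart, int rend, int *d, int *o)
{
  int j;

  assert(d && o && ncols >= 0 && rstart <= rend);
  *d = 0;
  *o = 0;
  for (j = 0; j < ncols; j++) {
    if (cols[j] >= rstart && cols[j] < rend) (*d)++; /* inside the owned mxm block */
    else                                     (*o)++; /* everything else in the row */
  }
}
```
/* For a tridiagonal matrix on a process owning rows 4..6, row 4 has nonzeros in
   columns 3,4,5: two fall in the DIAGONAL portion and one in the OFF-DIAGONAL
   portion, so the caller would set d_nnz[0] = 2 and o_nnz[0] = 1. */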

#undef __FUNCT__
#define __FUNCT__ "MatConvert_MPIAIJ_MPIAIJMKL"
PETSC_INTERN PetscErrorCode MatConvert_MPIAIJ_MPIAIJMKL(Mat A,MatType type,MatReuse reuse,Mat *newmat)
{
  PetscErrorCode ierr;
  Mat            B = *newmat;

  PetscFunctionBegin;
  if (reuse == MAT_INITIAL_MATRIX) {
    ierr = MatDuplicate(A,MAT_COPY_VALUES,&B);CHKERRQ(ierr);
  }

  ierr = PetscObjectChangeTypeName((PetscObject) B, MATMPIAIJMKL);CHKERRQ(ierr);
  ierr = PetscObjectComposeFunction((PetscObject)B,"MatMPIAIJSetPreallocation_C",MatMPIAIJSetPreallocation_MPIAIJMKL);CHKERRQ(ierr);
  *newmat = B;
  PetscFunctionReturn(0);
}

#undef __FUNCT__
#define __FUNCT__ "MatCreate_MPIAIJMKL"
PETSC_EXTERN PetscErrorCode MatCreate_MPIAIJMKL(Mat A)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatSetType(A,MATMPIAIJ);CHKERRQ(ierr);
  ierr = MatConvert_MPIAIJ_MPIAIJMKL(A,MATMPIAIJMKL,MAT_INPLACE_MATRIX,&A);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/*MC
   MATAIJMKL - MATAIJMKL = "aijmkl" - A matrix type to be used for sparse matrices.

   This matrix type is identical to MATSEQAIJMKL when constructed with a single process communicator,
   and MATMPIAIJMKL otherwise.  As a result, for single process communicators,
   MatSeqAIJSetPreallocation() is supported, and similarly MatMPIAIJSetPreallocation() is supported
   for communicators controlling multiple processes.  It is recommended that you call both of
   the above preallocation routines for simplicity.

   Options Database Keys:
.  -mat_type aijmkl - sets the matrix type to "aijmkl" during a call to MatSetFromOptions()

   Level: beginner

.seealso: MatCreateMPIAIJMKL(), MATSEQAIJMKL, MATMPIAIJMKL
M*/
165