xref: /petsc/src/mat/impls/aij/mpi/aijperm/mpiaijperm.c (revision 938d9b047269c8ce7cdcee3b5e5291f4da012931)
1*938d9b04SRichard Tran Mills 
2*938d9b04SRichard Tran Mills #include <../src/mat/impls/aij/mpi/mpiaij.h>
3*938d9b04SRichard Tran Mills /*@C
4*938d9b04SRichard Tran Mills    MatCreateMPIAIJPERM - Creates a sparse parallel matrix whose local
5*938d9b04SRichard Tran Mills    portions are stored as SEQAIJPERM matrices (a matrix class that inherits
6*938d9b04SRichard Tran Mills    from SEQAIJ but includes some optimizations to allow more effective
7*938d9b04SRichard Tran Mills    vectorization).  The same guidelines that apply to MPIAIJ matrices for
8*938d9b04SRichard Tran Mills    preallocating the matrix storage apply here as well.
9*938d9b04SRichard Tran Mills 
10*938d9b04SRichard Tran Mills       Collective on MPI_Comm
11*938d9b04SRichard Tran Mills 
12*938d9b04SRichard Tran Mills    Input Parameters:
13*938d9b04SRichard Tran Mills +  comm - MPI communicator
14*938d9b04SRichard Tran Mills .  m - number of local rows (or PETSC_DECIDE to have calculated if M is given)
15*938d9b04SRichard Tran Mills            This value should be the same as the local size used in creating the
16*938d9b04SRichard Tran Mills            y vector for the matrix-vector product y = Ax.
17*938d9b04SRichard Tran Mills .  n - This value should be the same as the local size used in creating the
18*938d9b04SRichard Tran Mills        x vector for the matrix-vector product y = Ax. (or PETSC_DECIDE to have
19*938d9b04SRichard Tran Mills        calculated if N is given) For square matrices n is almost always m.
20*938d9b04SRichard Tran Mills .  M - number of global rows (or PETSC_DETERMINE to have calculated if m is given)
21*938d9b04SRichard Tran Mills .  N - number of global columns (or PETSC_DETERMINE to have calculated if n is given)
22*938d9b04SRichard Tran Mills .  d_nz  - number of nonzeros per row in DIAGONAL portion of local submatrix
23*938d9b04SRichard Tran Mills            (same value is used for all local rows)
24*938d9b04SRichard Tran Mills .  d_nnz - array containing the number of nonzeros in the various rows of the
25*938d9b04SRichard Tran Mills            DIAGONAL portion of the local submatrix (possibly different for each row)
26*938d9b04SRichard Tran Mills            or NULL, if d_nz is used to specify the nonzero structure.
27*938d9b04SRichard Tran Mills            The size of this array is equal to the number of local rows, i.e 'm'.
28*938d9b04SRichard Tran Mills            For matrices you plan to factor you must leave room for the diagonal entry and
29*938d9b04SRichard Tran Mills            put in the entry even if it is zero.
30*938d9b04SRichard Tran Mills .  o_nz  - number of nonzeros per row in the OFF-DIAGONAL portion of local
31*938d9b04SRichard Tran Mills            submatrix (same value is used for all local rows).
32*938d9b04SRichard Tran Mills -  o_nnz - array containing the number of nonzeros in the various rows of the
33*938d9b04SRichard Tran Mills            OFF-DIAGONAL portion of the local submatrix (possibly different for
34*938d9b04SRichard Tran Mills            each row) or NULL, if o_nz is used to specify the nonzero
35*938d9b04SRichard Tran Mills            structure. The size of this array is equal to the number
36*938d9b04SRichard Tran Mills            of local rows, i.e 'm'.
37*938d9b04SRichard Tran Mills 
38*938d9b04SRichard Tran Mills    Output Parameter:
39*938d9b04SRichard Tran Mills .  A - the matrix
40*938d9b04SRichard Tran Mills 
41*938d9b04SRichard Tran Mills    Notes:
42*938d9b04SRichard Tran Mills    If the *_nnz parameter is given then the *_nz parameter is ignored
43*938d9b04SRichard Tran Mills 
44*938d9b04SRichard Tran Mills    m,n,M,N parameters specify the size of the matrix, and its partitioning across
45*938d9b04SRichard Tran Mills    processors, while d_nz,d_nnz,o_nz,o_nnz parameters specify the approximate
46*938d9b04SRichard Tran Mills    storage requirements for this matrix.
47*938d9b04SRichard Tran Mills 
48*938d9b04SRichard Tran Mills    If PETSC_DECIDE or  PETSC_DETERMINE is used for a particular argument on one
49*938d9b04SRichard Tran Mills    processor than it must be used on all processors that share the object for
50*938d9b04SRichard Tran Mills    that argument.
51*938d9b04SRichard Tran Mills 
52*938d9b04SRichard Tran Mills    The user MUST specify either the local or global matrix dimensions
53*938d9b04SRichard Tran Mills    (possibly both).
54*938d9b04SRichard Tran Mills 
55*938d9b04SRichard Tran Mills    The parallel matrix is partitioned such that the first m0 rows belong to
56*938d9b04SRichard Tran Mills    process 0, the next m1 rows belong to process 1, the next m2 rows belong
57*938d9b04SRichard Tran Mills    to process 2 etc.. where m0,m1,m2... are the input parameter 'm'.
58*938d9b04SRichard Tran Mills 
59*938d9b04SRichard Tran Mills    The DIAGONAL portion of the local submatrix of a processor can be defined
60*938d9b04SRichard Tran Mills    as the submatrix which is obtained by extraction the part corresponding
61*938d9b04SRichard Tran Mills    to the rows r1-r2 and columns r1-r2 of the global matrix, where r1 is the
62*938d9b04SRichard Tran Mills    first row that belongs to the processor, and r2 is the last row belonging
63*938d9b04SRichard Tran Mills    to the this processor. This is a square mxm matrix. The remaining portion
64*938d9b04SRichard Tran Mills    of the local submatrix (mxN) constitute the OFF-DIAGONAL portion.
65*938d9b04SRichard Tran Mills 
66*938d9b04SRichard Tran Mills    If o_nnz, d_nnz are specified, then o_nz, and d_nz are ignored.
67*938d9b04SRichard Tran Mills 
68*938d9b04SRichard Tran Mills    When calling this routine with a single process communicator, a matrix of
69*938d9b04SRichard Tran Mills    type SEQAIJPERM is returned.  If a matrix of type MPIAIJPERM is desired
70*938d9b04SRichard Tran Mills    for this type of communicator, use the construction mechanism:
71*938d9b04SRichard Tran Mills      MatCreate(...,&A); MatSetType(A,MPIAIJ); MatMPIAIJSetPreallocation(A,...);
72*938d9b04SRichard Tran Mills 
73*938d9b04SRichard Tran Mills    By default, this format uses inodes (identical nodes) when possible.
74*938d9b04SRichard Tran Mills    We search for consecutive rows with the same nonzero structure, thereby
75*938d9b04SRichard Tran Mills    reusing matrix information to achieve increased efficiency.
76*938d9b04SRichard Tran Mills 
77*938d9b04SRichard Tran Mills    Options Database Keys:
78*938d9b04SRichard Tran Mills +  -mat_no_inode  - Do not use inodes
79*938d9b04SRichard Tran Mills -  -mat_inode_limit <limit> - Sets inode limit (max limit=5)
80*938d9b04SRichard Tran Mills 
81*938d9b04SRichard Tran Mills    Level: intermediate
82*938d9b04SRichard Tran Mills 
83*938d9b04SRichard Tran Mills .keywords: matrix, cray, sparse, parallel
84*938d9b04SRichard Tran Mills 
85*938d9b04SRichard Tran Mills .seealso: MatCreate(), MatCreateSeqAIJPERM(), MatSetValues()
86*938d9b04SRichard Tran Mills @*/
87*938d9b04SRichard Tran Mills PetscErrorCode  MatCreateMPIAIJPERM(MPI_Comm comm,PetscInt m,PetscInt n,PetscInt M,PetscInt N,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const PetscInt o_nnz[],Mat *A)
88*938d9b04SRichard Tran Mills {
89*938d9b04SRichard Tran Mills   PetscErrorCode ierr;
90*938d9b04SRichard Tran Mills   PetscMPIInt    size;
91*938d9b04SRichard Tran Mills 
92*938d9b04SRichard Tran Mills   PetscFunctionBegin;
93*938d9b04SRichard Tran Mills   ierr = MatCreate(comm,A);CHKERRQ(ierr);
94*938d9b04SRichard Tran Mills   ierr = MatSetSizes(*A,m,n,M,N);CHKERRQ(ierr);
95*938d9b04SRichard Tran Mills   ierr = MPI_Comm_size(comm,&size);CHKERRQ(ierr);
96*938d9b04SRichard Tran Mills   if (size > 1) {
97*938d9b04SRichard Tran Mills     ierr = MatSetType(*A,MATMPIAIJPERM);CHKERRQ(ierr);
98*938d9b04SRichard Tran Mills     ierr = MatMPIAIJSetPreallocation(*A,d_nz,d_nnz,o_nz,o_nnz);CHKERRQ(ierr);
99*938d9b04SRichard Tran Mills   } else {
100*938d9b04SRichard Tran Mills     ierr = MatSetType(*A,MATSEQAIJPERM);CHKERRQ(ierr);
101*938d9b04SRichard Tran Mills     ierr = MatSeqAIJSetPreallocation(*A,d_nz,d_nnz);CHKERRQ(ierr);
102*938d9b04SRichard Tran Mills   }
103*938d9b04SRichard Tran Mills   PetscFunctionReturn(0);
104*938d9b04SRichard Tran Mills }
105*938d9b04SRichard Tran Mills 
106*938d9b04SRichard Tran Mills PetscErrorCode  MatMPIAIJSetPreallocation_MPIAIJPERM(Mat B,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const PetscInt o_nnz[])
107*938d9b04SRichard Tran Mills {
108*938d9b04SRichard Tran Mills   Mat_MPIAIJ     *b = (Mat_MPIAIJ*)B->data;
109*938d9b04SRichard Tran Mills   PetscErrorCode ierr;
110*938d9b04SRichard Tran Mills 
111*938d9b04SRichard Tran Mills   PetscFunctionBegin;
112*938d9b04SRichard Tran Mills   ierr = MatMPIAIJSetPreallocation_MPIAIJ(B,d_nz,d_nnz,o_nz,o_nnz);CHKERRQ(ierr);
113*938d9b04SRichard Tran Mills   ierr = MatConvert_SeqAIJ_SeqAIJPERM(b->A, MATSEQAIJPERM, MAT_INPLACE_MATRIX, &b->A);CHKERRQ(ierr);
114*938d9b04SRichard Tran Mills   ierr = MatConvert_SeqAIJ_SeqAIJPERM(b->B, MATSEQAIJPERM, MAT_INPLACE_MATRIX, &b->B);CHKERRQ(ierr);
115*938d9b04SRichard Tran Mills   PetscFunctionReturn(0);
116*938d9b04SRichard Tran Mills }
117*938d9b04SRichard Tran Mills 
118*938d9b04SRichard Tran Mills PETSC_INTERN PetscErrorCode MatConvert_MPIAIJ_MPIAIJPERM(Mat A,MatType type,MatReuse reuse,Mat *newmat)
119*938d9b04SRichard Tran Mills {
120*938d9b04SRichard Tran Mills   PetscErrorCode ierr;
121*938d9b04SRichard Tran Mills   Mat            B = *newmat;
122*938d9b04SRichard Tran Mills 
123*938d9b04SRichard Tran Mills   PetscFunctionBegin;
124*938d9b04SRichard Tran Mills   if (reuse == MAT_INITIAL_MATRIX) {
125*938d9b04SRichard Tran Mills     ierr = MatDuplicate(A,MAT_COPY_VALUES,&B);CHKERRQ(ierr);
126*938d9b04SRichard Tran Mills   }
127*938d9b04SRichard Tran Mills 
128*938d9b04SRichard Tran Mills   ierr = PetscObjectChangeTypeName((PetscObject) B, MATMPIAIJPERM);CHKERRQ(ierr);
129*938d9b04SRichard Tran Mills   ierr = PetscObjectComposeFunction((PetscObject)B,"MatMPIAIJSetPreallocation_C",MatMPIAIJSetPreallocation_MPIAIJPERM);CHKERRQ(ierr);
130*938d9b04SRichard Tran Mills   *newmat = B;
131*938d9b04SRichard Tran Mills   PetscFunctionReturn(0);
132*938d9b04SRichard Tran Mills }
133*938d9b04SRichard Tran Mills 
134*938d9b04SRichard Tran Mills PETSC_EXTERN PetscErrorCode MatCreate_MPIAIJPERM(Mat A)
135*938d9b04SRichard Tran Mills {
136*938d9b04SRichard Tran Mills   PetscErrorCode ierr;
137*938d9b04SRichard Tran Mills 
138*938d9b04SRichard Tran Mills   PetscFunctionBegin;
139*938d9b04SRichard Tran Mills   ierr = MatSetType(A,MATMPIAIJ);CHKERRQ(ierr);
140*938d9b04SRichard Tran Mills   ierr = MatConvert_MPIAIJ_MPIAIJPERM(A,MATMPIAIJPERM,MAT_INPLACE_MATRIX,&A);CHKERRQ(ierr);
141*938d9b04SRichard Tran Mills   PetscFunctionReturn(0);
142*938d9b04SRichard Tran Mills }
143*938d9b04SRichard Tran Mills 
144*938d9b04SRichard Tran Mills /*MC
145*938d9b04SRichard Tran Mills    MATAIJPERM - MATAIJPERM = "AIJPERM" - A matrix type to be used for sparse matrices.
146*938d9b04SRichard Tran Mills 
147*938d9b04SRichard Tran Mills    This matrix type is identical to MATSEQAIJPERM when constructed with a single process communicator,
148*938d9b04SRichard Tran Mills    and MATMPIAIJPERM otherwise.  As a result, for single process communicators,
149*938d9b04SRichard Tran Mills   MatSeqAIJSetPreallocation() is supported, and similarly MatMPIAIJSetPreallocation() is supported
150*938d9b04SRichard Tran Mills   for communicators controlling multiple processes.  It is recommended that you call both of
151*938d9b04SRichard Tran Mills   the above preallocation routines for simplicity.
152*938d9b04SRichard Tran Mills 
153*938d9b04SRichard Tran Mills    Options Database Keys:
154*938d9b04SRichard Tran Mills . -mat_type aijperm - sets the matrix type to "AIJPERM" during a call to MatSetFromOptions()
155*938d9b04SRichard Tran Mills 
156*938d9b04SRichard Tran Mills   Level: beginner
157*938d9b04SRichard Tran Mills 
158*938d9b04SRichard Tran Mills .seealso: MatCreateMPIAIJPERM(), MATSEQAIJPERM, MATMPIAIJPERM
159*938d9b04SRichard Tran Mills M*/
160*938d9b04SRichard Tran Mills 
161