xref: /petsc/doc/faq/index.md (revision 9b92b1d3a7bbc5a081edc9021bfd15a36804dd1c)
1*9b92b1d3SBarry Smith(doc_faq)=
2*9b92b1d3SBarry Smith
3*9b92b1d3SBarry Smith# FAQ
4*9b92b1d3SBarry Smith
5*9b92b1d3SBarry Smith```{contents} Table Of Contents
6*9b92b1d3SBarry Smith:backlinks: top
7*9b92b1d3SBarry Smith:local: true
8*9b92b1d3SBarry Smith```
9*9b92b1d3SBarry Smith
10*9b92b1d3SBarry Smith______________________________________________________________________
11*9b92b1d3SBarry Smith
12*9b92b1d3SBarry Smith## General
13*9b92b1d3SBarry Smith
14*9b92b1d3SBarry Smith### How can I subscribe to the PETSc mailing lists?
15*9b92b1d3SBarry Smith
16*9b92b1d3SBarry SmithSee mailing list {ref}`documentation <doc_mail>`
17*9b92b1d3SBarry Smith
18*9b92b1d3SBarry Smith### Any useful books on numerical computing?
19*9b92b1d3SBarry Smith
20*9b92b1d3SBarry Smith[Bueler, PETSc for Partial Differential Equations: Numerical Solutions in C and Python](https://my.siam.org/Store/Product/viewproduct/?ProductId=32850137)
21*9b92b1d3SBarry Smith
22*9b92b1d3SBarry Smith[Oliveira and Stewart, Writing Scientific Software: A Guide to Good Style](https://www.cambridge.org/core/books/writing-scientific-software/23206704175AF868E43FE3FB399C2F53)
23*9b92b1d3SBarry Smith
24*9b92b1d3SBarry Smith(doc_faq_general_parallel)=
25*9b92b1d3SBarry Smith
26*9b92b1d3SBarry Smith### What kind of parallel computers or clusters are needed to use PETSc? Or why do I get little speedup?
27*9b92b1d3SBarry Smith
28*9b92b1d3SBarry Smith:::{important}
29*9b92b1d3SBarry SmithPETSc can be used with any kind of parallel system that supports MPI BUT for any decent
30*9b92b1d3SBarry Smithperformance one needs:
31*9b92b1d3SBarry Smith
32*9b92b1d3SBarry Smith- Fast, **low-latency** interconnect; any ethernet (even 10 GigE) simply cannot provide
33*9b92b1d3SBarry Smith  the needed performance.
34*9b92b1d3SBarry Smith- High per-core **memory** performance. Each core needs to
35*9b92b1d3SBarry Smith  have its **own** memory bandwidth of at least 2 or more gigabytes/second. Most modern
36*9b92b1d3SBarry Smith  computers are not bottlenecked by how fast they can perform
37*9b92b1d3SBarry Smith  calculations; rather, they are usually restricted by how quickly they can get their
38*9b92b1d3SBarry Smith  data.
39*9b92b1d3SBarry Smith:::
40*9b92b1d3SBarry Smith
41*9b92b1d3SBarry SmithTo obtain good performance it is important that you know your machine! I.e. how many
42*9b92b1d3SBarry Smithcompute nodes (generally, how many motherboards), how many memory sockets per node and how
43*9b92b1d3SBarry Smithmany cores per memory socket and how much memory bandwidth for each.
44*9b92b1d3SBarry Smith
45*9b92b1d3SBarry SmithIf you do not know this and can run MPI programs with mpiexec (that is, you don't have
46*9b92b1d3SBarry Smithbatch system), run the following from `$PETSC_DIR`:
47*9b92b1d3SBarry Smith
48*9b92b1d3SBarry Smith```console
49*9b92b1d3SBarry Smith$ make streams [NPMAX=maximum_number_of_mpi_processes_you_plan_to_use]
50*9b92b1d3SBarry Smith```
51*9b92b1d3SBarry Smith
52*9b92b1d3SBarry SmithThis will provide a summary of the bandwidth with different number of MPI
53*9b92b1d3SBarry Smithprocesses and potential speedups. See {any}`ch_streams` for a detailed discussion.
54*9b92b1d3SBarry Smith
55*9b92b1d3SBarry SmithIf you have a batch system:
56*9b92b1d3SBarry Smith
57*9b92b1d3SBarry Smith```console
58*9b92b1d3SBarry Smith$ cd $PETSC_DIR/src/benchmarks/streams
59*9b92b1d3SBarry Smith$ make MPIVersion
60*9b92b1d3SBarry Smithsubmit MPIVersion to the batch system a number of times with 1, 2, 3, etc. MPI processes
61*9b92b1d3SBarry Smithcollecting all of the output from the runs into the single file scaling.log. Copy
62*9b92b1d3SBarry Smithscaling.log into the src/benchmarks/streams directory.
63*9b92b1d3SBarry Smith$ ./process.py createfile ; process.py
64*9b92b1d3SBarry Smith```
65*9b92b1d3SBarry Smith
66*9b92b1d3SBarry SmithEven if you have enough memory bandwidth if the OS switches processes between cores
67*9b92b1d3SBarry Smithperformance can degrade. Smart process to core/socket binding (this just means locking a
68*9b92b1d3SBarry Smithprocess to a particular core or memory socket) may help you. For example, consider using
69*9b92b1d3SBarry Smithfewer processes than cores and binding processes to separate sockets so that each process
70*9b92b1d3SBarry Smithuses a different memory bus:
71*9b92b1d3SBarry Smith
72*9b92b1d3SBarry Smith- [MPICH2 binding with the Hydra process manager](https://github.com/pmodels/mpich/blob/main/doc/wiki/how_to/Using_the_Hydra_Process_Manager.md#process-core-binding]
73*9b92b1d3SBarry Smith
74*9b92b1d3SBarry Smith  ```console
75*9b92b1d3SBarry Smith  $ mpiexec.hydra -n 4 --binding cpu:sockets
76*9b92b1d3SBarry Smith  ```
77*9b92b1d3SBarry Smith
78*9b92b1d3SBarry Smith- [Open MPI binding](https://www.open-mpi.org/faq/?category=tuning#using-paffinity)
79*9b92b1d3SBarry Smith
80*9b92b1d3SBarry Smith  ```console
81*9b92b1d3SBarry Smith  $ mpiexec -n 4 --map-by socket --bind-to socket --report-bindings
82*9b92b1d3SBarry Smith  ```
83*9b92b1d3SBarry Smith
84*9b92b1d3SBarry Smith- `taskset`, part of the [util-linux](https://github.com/karelzak/util-linux) package
85*9b92b1d3SBarry Smith
86*9b92b1d3SBarry Smith  Check `man taskset` for details. Make sure to set affinity for **your** program,
87*9b92b1d3SBarry Smith  **not** for the `mpiexec` program.
88*9b92b1d3SBarry Smith
89*9b92b1d3SBarry Smith- `numactl`
90*9b92b1d3SBarry Smith
91*9b92b1d3SBarry Smith  In addition to task affinity, this tool also allows changing the default memory affinity
92*9b92b1d3SBarry Smith  policy. On Linux, the default policy is to attempt to find memory on the same memory bus
93*9b92b1d3SBarry Smith  that serves the core that a thread is running on when the memory is faulted
94*9b92b1d3SBarry Smith  (not when `malloc()` is called). If local memory is not available, it is found
95*9b92b1d3SBarry Smith  elsewhere, possibly leading to serious memory imbalances.
96*9b92b1d3SBarry Smith
97*9b92b1d3SBarry Smith  The option `--localalloc` allocates memory on the local NUMA node, similar to the
98*9b92b1d3SBarry Smith  `numa_alloc_local()` function in the `libnuma` library. The option
99*9b92b1d3SBarry Smith  `--cpunodebind=nodes` binds the process to a given NUMA node (note that this can be
100*9b92b1d3SBarry Smith  larger or smaller than a CPU (socket); a NUMA node usually has multiple cores).
101*9b92b1d3SBarry Smith
102*9b92b1d3SBarry Smith  The option `--physcpubind=cpus` binds the process to a given processor core (numbered
103*9b92b1d3SBarry Smith  according to `/proc/cpuinfo`, therefore including logical cores if Hyper-threading is
104*9b92b1d3SBarry Smith  enabled).
105*9b92b1d3SBarry Smith
106*9b92b1d3SBarry Smith  With Open MPI, you can use knowledge of the NUMA hierarchy and core numbering on your
107*9b92b1d3SBarry Smith  machine to calculate the correct NUMA node or processor number given the environment
108*9b92b1d3SBarry Smith  variable `OMPI_COMM_WORLD_LOCAL_RANK`. In most cases, it is easier to make mpiexec or
109*9b92b1d3SBarry Smith  a resource manager set affinities.
110*9b92b1d3SBarry Smith
111*9b92b1d3SBarry SmithThe software [Open-MX](http://open-mx.gforge.inria.fr) provides faster speed for
112*9b92b1d3SBarry Smithethernet systems, we have not tried it but it claims it can dramatically reduce latency
113*9b92b1d3SBarry Smithand increase bandwidth on Linux system. You must first install this software and then
114*9b92b1d3SBarry Smithinstall MPICH or Open MPI to use it.
115*9b92b1d3SBarry Smith
116*9b92b1d3SBarry Smith### What kind of license is PETSc released under?
117*9b92b1d3SBarry Smith
118*9b92b1d3SBarry SmithSee licensing {ref}`documentation <doc_license>`
119*9b92b1d3SBarry Smith
120*9b92b1d3SBarry Smith### Why is PETSc written in C, instead of Fortran or C++?
121*9b92b1d3SBarry Smith
122*9b92b1d3SBarry SmithWhen this decision was made, in the early 1990s, C enabled us to build data structures
123*9b92b1d3SBarry Smithfor storing sparse matrices, solver information,
124*9b92b1d3SBarry Smithetc. in ways that Fortran simply did not allow. ANSI C was a complete standard that all
125*9b92b1d3SBarry Smithmodern C compilers supported. The language was identical on all machines. C++ was still
126*9b92b1d3SBarry Smithevolving and compilers on different machines were not identical. Using C function pointers
127*9b92b1d3SBarry Smithto provide data encapsulation and polymorphism allowed us to get many of the advantages of
128*9b92b1d3SBarry SmithC++ without using such a large and more complicated language. It would have been natural and
129*9b92b1d3SBarry Smithreasonable to have coded PETSc in C++; we opted to use C instead.
130*9b92b1d3SBarry Smith
131*9b92b1d3SBarry Smith### Does all the PETSc error checking and logging reduce PETSc's efficiency?
132*9b92b1d3SBarry Smith
133*9b92b1d3SBarry SmithNo
134*9b92b1d3SBarry Smith
135*9b92b1d3SBarry Smith(doc_faq_maintenance_strats)=
136*9b92b1d3SBarry Smith
137*9b92b1d3SBarry Smith### How do such a small group of people manage to write and maintain such a large and marvelous package as PETSc?
138*9b92b1d3SBarry Smith
139*9b92b1d3SBarry Smith1. **We work very efficiently**.
140*9b92b1d3SBarry Smith
141*9b92b1d3SBarry Smith   - We use powerful editors and programming environments.
142*9b92b1d3SBarry Smith   - Our manual pages are generated automatically from formatted comments in the code,
143*9b92b1d3SBarry Smith     thus alleviating the need for creating and maintaining manual pages.
144*9b92b1d3SBarry Smith   - We employ continuous integration testing of the entire PETSc library on many different
145*9b92b1d3SBarry Smith     machine architectures. This process **significantly** protects (no bug-catching
146*9b92b1d3SBarry Smith     process is perfect) against inadvertently introducing bugs with new additions. Every
147*9b92b1d3SBarry Smith     new feature **must** pass our suite of thousands of tests as well as formal code
148*9b92b1d3SBarry Smith     review before it may be included.
149*9b92b1d3SBarry Smith
150*9b92b1d3SBarry Smith2. **We are very careful in our design (and are constantly revising our design)**
151*9b92b1d3SBarry Smith
152*9b92b1d3SBarry Smith   - PETSc as a package should be easy to use, write, and maintain. Our mantra is to write
153*9b92b1d3SBarry Smith     code like everyone is using it.
154*9b92b1d3SBarry Smith
155*9b92b1d3SBarry Smith3. **We are willing to do the grunt work**
156*9b92b1d3SBarry Smith
157*9b92b1d3SBarry Smith   - PETSc is regularly checked to make sure that all code conforms to our interface
158*9b92b1d3SBarry Smith     design. We will never keep in a bad design decision simply because changing it will
159*9b92b1d3SBarry Smith     require a lot of editing; we do a lot of editing.
160*9b92b1d3SBarry Smith
161*9b92b1d3SBarry Smith4. **We constantly seek out and experiment with new design ideas**
162*9b92b1d3SBarry Smith
163*9b92b1d3SBarry Smith   - We retain the useful ones and discard the rest. All of these decisions are based not
164*9b92b1d3SBarry Smith     just on performance, but also on **practicality**.
165*9b92b1d3SBarry Smith
166*9b92b1d3SBarry Smith5. **Function and variable names must adhere to strict guidelines**
167*9b92b1d3SBarry Smith
168*9b92b1d3SBarry Smith   - Even the rules about capitalization are designed to make it easy to figure out the
169*9b92b1d3SBarry Smith     name of a particular object or routine. Our memories are terrible, so careful
170*9b92b1d3SBarry Smith     consistent naming puts less stress on our limited human RAM.
171*9b92b1d3SBarry Smith
172*9b92b1d3SBarry Smith6. **The PETSc directory tree is designed to make it easy to move throughout the
173*9b92b1d3SBarry Smith   entire package**
174*9b92b1d3SBarry Smith
175*9b92b1d3SBarry Smith7. **We have a rich, robust, and fast bug reporting system**
176*9b92b1d3SBarry Smith
177*9b92b1d3SBarry Smith   - <mailto:petsc-maint@mcs.anl.gov> is always checked, and we pride ourselves on responding
178*9b92b1d3SBarry Smith     quickly and accurately. Email is very lightweight, and so bug reports system retains
179*9b92b1d3SBarry Smith     an archive of all reported problems and fixes, so it is easy to re-find fixes to
180*9b92b1d3SBarry Smith     previously discovered problems.
181*9b92b1d3SBarry Smith
182*9b92b1d3SBarry Smith8. **We contain the complexity of PETSc by using powerful object-oriented programming
183*9b92b1d3SBarry Smith   techniques**
184*9b92b1d3SBarry Smith
185*9b92b1d3SBarry Smith   - Data encapsulation serves to abstract complex data formats or movement to
186*9b92b1d3SBarry Smith     human-readable format. This is why your program cannot, for example, look directly
187*9b92b1d3SBarry Smith     at what is inside the object `Mat`.
188*9b92b1d3SBarry Smith   - Polymorphism makes changing program behavior as easy as possible, and further
189*9b92b1d3SBarry Smith     abstracts the *intent* of your program from what is *written* in code. You call
190*9b92b1d3SBarry Smith     `MatMult()` regardless of whether your matrix is dense, sparse, parallel or
191*9b92b1d3SBarry Smith     sequential; you don't call a different routine for each format.
192*9b92b1d3SBarry Smith
193*9b92b1d3SBarry Smith9. **We try to provide the functionality requested by our users**
194*9b92b1d3SBarry Smith
195*9b92b1d3SBarry Smith### For complex numbers will I get better performance with C++?
196*9b92b1d3SBarry Smith
197*9b92b1d3SBarry SmithTo use PETSc with complex numbers you may use the following `configure` options;
198*9b92b1d3SBarry Smith`--with-scalar-type=complex` and either `--with-clanguage=c++` or (the default)
199*9b92b1d3SBarry Smith`--with-clanguage=c`. In our experience they will deliver very similar performance
200*9b92b1d3SBarry Smith(speed), but if one is concerned they should just try both and see if one is faster.
201*9b92b1d3SBarry Smith
202*9b92b1d3SBarry Smith### How come when I run the same program on the same number of processes I get a "different" answer?
203*9b92b1d3SBarry Smith
204*9b92b1d3SBarry SmithInner products and norms in PETSc are computed using the `MPI_Allreduce()` command. For
205*9b92b1d3SBarry Smithdifferent runs the order at which values arrive at a given process (via MPI) can be in a
206*9b92b1d3SBarry Smithdifferent order, thus the order in which some floating point arithmetic operations are
207*9b92b1d3SBarry Smithperformed will be different. Since floating point arithmetic is not
208*9b92b1d3SBarry Smithassociative, the computed quantity may be slightly different.
209*9b92b1d3SBarry Smith
210*9b92b1d3SBarry SmithOver a run the many slight differences in the inner products and norms will effect all the
211*9b92b1d3SBarry Smithcomputed results. It is important to realize that none of the computed answers are any
212*9b92b1d3SBarry Smithless right or wrong (in fact the sequential computation is no more right then the parallel
213*9b92b1d3SBarry Smithones). All answers are equal, but some are more equal than others.
214*9b92b1d3SBarry Smith
215*9b92b1d3SBarry SmithThe discussion above assumes that the exact same algorithm is being used on the different
216*9b92b1d3SBarry Smithnumber of processes. When the algorithm is different for the different number of processes
217*9b92b1d3SBarry Smith(almost all preconditioner algorithms except Jacobi are different for different number of
218*9b92b1d3SBarry Smithprocesses) then one expects to see (and does) a greater difference in results for
219*9b92b1d3SBarry Smithdifferent numbers of processes. In some cases (for example block Jacobi preconditioner) it
220*9b92b1d3SBarry Smithmay be that the algorithm works for some number of processes and does not work for others.
221*9b92b1d3SBarry Smith
222*9b92b1d3SBarry Smith### How come when I run the same linear solver on a different number of processes it takes a different number of iterations?
223*9b92b1d3SBarry Smith
224*9b92b1d3SBarry SmithThe convergence of many of the preconditioners in PETSc including the default parallel
225*9b92b1d3SBarry Smithpreconditioner block Jacobi depends on the number of processes. The more processes the
226*9b92b1d3SBarry Smith(slightly) slower convergence it has. This is the nature of iterative solvers, the more
227*9b92b1d3SBarry Smithparallelism means the more "older" information is used in the solution process hence
228*9b92b1d3SBarry Smithslower convergence.
229*9b92b1d3SBarry Smith
230*9b92b1d3SBarry Smith(doc_faq_gpuhowto)=
231*9b92b1d3SBarry Smith
232*9b92b1d3SBarry Smith### Can PETSc use GPUs to speed up computations?
233*9b92b1d3SBarry Smith
234*9b92b1d3SBarry Smith:::{seealso}
235*9b92b1d3SBarry SmithSee GPU development {ref}`roadmap <doc_gpu_roadmap>` for the latest information
236*9b92b1d3SBarry Smithregarding the state of PETSc GPU integration.
237*9b92b1d3SBarry Smith
238*9b92b1d3SBarry SmithSee GPU install {ref}`documentation <doc_config_accel>` for up-to-date information on
239*9b92b1d3SBarry Smithinstalling PETSc to use GPU's.
240*9b92b1d3SBarry Smith:::
241*9b92b1d3SBarry Smith
242*9b92b1d3SBarry SmithQuick summary of usage with CUDA:
243*9b92b1d3SBarry Smith
244*9b92b1d3SBarry Smith- The `VecType` `VECSEQCUDA`, `VECMPICUDA`, or `VECCUDA` may be used with
245*9b92b1d3SBarry Smith  `VecSetType()` or `-vec_type seqcuda`, `mpicuda`, or `cuda` when
246*9b92b1d3SBarry Smith  `VecSetFromOptions()` is used.
247*9b92b1d3SBarry Smith- The `MatType` `MATSEQAIJCUSPARSE`, `MATMPIAIJCUSPARSE`, or `MATAIJCUSPARSE`
248*9b92b1d3SBarry Smith  may be used with `MatSetType()` or `-mat_type seqaijcusparse`, `mpiaijcusparse`, or
249*9b92b1d3SBarry Smith  `aijcusparse` when `MatSetFromOptions()` is used.
250*9b92b1d3SBarry Smith- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type
251*9b92b1d3SBarry Smith  cuda` and `-dm_mat_type aijcusparse`.
252*9b92b1d3SBarry Smith
253*9b92b1d3SBarry SmithQuick summary of usage with OpenCL (provided by the ViennaCL library):
254*9b92b1d3SBarry Smith
255*9b92b1d3SBarry Smith- The `VecType` `VECSEQVIENNACL`, `VECMPIVIENNACL`, or `VECVIENNACL` may be used
256*9b92b1d3SBarry Smith  with `VecSetType()` or `-vec_type seqviennacl`, `mpiviennacl`, or `viennacl`
257*9b92b1d3SBarry Smith  when `VecSetFromOptions()` is used.
258*9b92b1d3SBarry Smith- The `MatType` `MATSEQAIJVIENNACL`, `MATMPIAIJVIENNACL`, or `MATAIJVIENNACL`
259*9b92b1d3SBarry Smith  may be used with `MatSetType()` or `-mat_type seqaijviennacl`, `mpiaijviennacl`, or
260*9b92b1d3SBarry Smith  `aijviennacl` when `MatSetFromOptions()` is used.
261*9b92b1d3SBarry Smith- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type
262*9b92b1d3SBarry Smith  viennacl` and `-dm_mat_type aijviennacl`.
263*9b92b1d3SBarry Smith
264*9b92b1d3SBarry SmithGeneral hints:
265*9b92b1d3SBarry Smith
266*9b92b1d3SBarry Smith- It is useful to develop your code with the default vectors and then run production runs
267*9b92b1d3SBarry Smith  with the command line options to use the GPU since debugging on GPUs is difficult.
268*9b92b1d3SBarry Smith- All of the Krylov methods except `KSPIBCGS` run on the GPU.
269*9b92b1d3SBarry Smith- Parts of most preconditioners run directly on the GPU. After setup, `PCGAMG` runs
270*9b92b1d3SBarry Smith  fully on GPUs, without any memory copies between the CPU and GPU.
271*9b92b1d3SBarry Smith
272*9b92b1d3SBarry SmithSome GPU systems (for example many laptops) only run with single precision; thus, PETSc
273*9b92b1d3SBarry Smithmust be built with the `configure` option `--with-precision=single`.
274*9b92b1d3SBarry Smith
275*9b92b1d3SBarry Smith(doc_faq_extendedprecision)=
276*9b92b1d3SBarry Smith
277*9b92b1d3SBarry Smith### Can I run PETSc with extended precision?
278*9b92b1d3SBarry Smith
279*9b92b1d3SBarry SmithYes, with gcc and gfortran. `configure` PETSc using the
280*9b92b1d3SBarry Smithoptions `--with-precision=__float128` and `--download-f2cblaslapack`.
281*9b92b1d3SBarry Smith
282*9b92b1d3SBarry Smith:::{admonition} Warning
283*9b92b1d3SBarry Smith:class: yellow
284*9b92b1d3SBarry Smith
285*9b92b1d3SBarry SmithExternal packages are not guaranteed to work in this mode!
286*9b92b1d3SBarry Smith:::
287*9b92b1d3SBarry Smith
288*9b92b1d3SBarry Smith### Why doesn't PETSc use Qd to implement support for extended precision?
289*9b92b1d3SBarry Smith
290*9b92b1d3SBarry SmithWe tried really hard but could not. The problem is that the QD c++ classes, though they
291*9b92b1d3SBarry Smithtry to, implement the built-in data types of `double` are not native types and cannot
292*9b92b1d3SBarry Smith"just be used" in a general piece of numerical source code. Ratherm the code has to
293*9b92b1d3SBarry Smithrewritten to live within the limitations of QD classes. However PETSc can be built to use
294*9b92b1d3SBarry Smithquad precision, as detailed {ref}`here <doc_faq_extendedprecision>`.
295*9b92b1d3SBarry Smith
296*9b92b1d3SBarry Smith### How do I cite PETSc?
297*9b92b1d3SBarry Smith
298*9b92b1d3SBarry SmithUse {any}`these citations <doc_index_citing_petsc>`.
299*9b92b1d3SBarry Smith
300*9b92b1d3SBarry Smith______________________________________________________________________
301*9b92b1d3SBarry Smith
302*9b92b1d3SBarry Smith## Installation
303*9b92b1d3SBarry Smith
304*9b92b1d3SBarry Smith### How do I begin using PETSc if the software has already been completely built and installed by someone else?
305*9b92b1d3SBarry Smith
306*9b92b1d3SBarry SmithAssuming that the PETSc libraries have been successfully built for a particular
307*9b92b1d3SBarry Smitharchitecture and level of optimization, a new user must merely:
308*9b92b1d3SBarry Smith
309*9b92b1d3SBarry Smith1. Set `PETSC_DIR` to the full path of the PETSc home
310*9b92b1d3SBarry Smith   directory. This will be the location of the `configure` script, and usually called
311*9b92b1d3SBarry Smith   "petsc" or some variation of that (for example, /home/username/petsc).
312*9b92b1d3SBarry Smith2. Set `PETSC_ARCH`, which indicates the configuration on which PETSc will be
313*9b92b1d3SBarry Smith   used. Note that `$PETSC_ARCH` is simply a name the installer used when installing
314*9b92b1d3SBarry Smith   the libraries. There will exist a directory within `$PETSC_DIR` that is named after
315*9b92b1d3SBarry Smith   its corresponding `$PETSC_ARCH`. There many be several on a single system, for
316*9b92b1d3SBarry Smith   example "linux-c-debug" for the debug versions compiled by a C compiler or
317*9b92b1d3SBarry Smith   "linux-c-opt" for the optimized version.
318*9b92b1d3SBarry Smith
319*9b92b1d3SBarry Smith:::{admonition} Still Stuck?
320*9b92b1d3SBarry SmithSee the {ref}`quick-start tutorial <tut_install>` for a step-by-step guide on
321*9b92b1d3SBarry Smithinstalling PETSc, in case you have missed a step.
322*9b92b1d3SBarry Smith
323*9b92b1d3SBarry SmithSee the users manual section on {ref}`getting started <sec-getting-started>`.
324*9b92b1d3SBarry Smith:::
325*9b92b1d3SBarry Smith
326*9b92b1d3SBarry Smith### The PETSc distribution is SO Large. How can I reduce my disk space usage?
327*9b92b1d3SBarry Smith
328*9b92b1d3SBarry Smith1. The PETSc users manual is provided in PDF format at `$PETSC_DIR/manual.pdf`. You
329*9b92b1d3SBarry Smith   can delete this.
330*9b92b1d3SBarry Smith2. The PETSc test suite contains sample output for many of the examples. These are
331*9b92b1d3SBarry Smith   contained in the PETSc directories `$PETSC_DIR/src/*/tutorials/output` and
332*9b92b1d3SBarry Smith   `$PETSC_DIR/src/*/tests/output`. Once you have run the test examples, you may remove
333*9b92b1d3SBarry Smith   all of these directories to save some disk space. You can locate the largest with
334*9b92b1d3SBarry Smith   e.g. `find . -name output -type d | xargs du -sh | sort -hr` on a Unix-based system.
335*9b92b1d3SBarry Smith3. The debugging versions of the libraries are larger than the optimized versions. In a
336*9b92b1d3SBarry Smith   pinch you can work with the optimized version, although we bid you good luck in
337*9b92b1d3SBarry Smith   finding bugs as it is much easier with the debug version.
338*9b92b1d3SBarry Smith
339*9b92b1d3SBarry Smith### I want to use PETSc only for uniprocessor programs. Must I still install and use a version of MPI?
340*9b92b1d3SBarry Smith
341*9b92b1d3SBarry SmithNo, run `configure` with the option `--with-mpi=0`
342*9b92b1d3SBarry Smith
343*9b92b1d3SBarry Smith### Can I install PETSc to not use X windows (either under Unix or Microsoft Windows with GCC)?
344*9b92b1d3SBarry Smith
345*9b92b1d3SBarry SmithYes. Run `configure` with the additional flag `--with-x=0`
346*9b92b1d3SBarry Smith
347*9b92b1d3SBarry Smith### Why do you use MPI?
348*9b92b1d3SBarry Smith
349*9b92b1d3SBarry SmithMPI is the message-passing standard. Because it is a standard, it will not frequently change over
350*9b92b1d3SBarry Smithtime; thus, we do not have to change PETSc every time the provider of the message-passing
351*9b92b1d3SBarry Smithsystem decides to make an interface change. MPI was carefully designed by experts from
352*9b92b1d3SBarry Smithindustry, academia, and government labs to provide the highest quality performance and
353*9b92b1d3SBarry Smithcapability.
354*9b92b1d3SBarry Smith
355*9b92b1d3SBarry SmithFor example, the careful design of communicators in MPI allows the easy nesting of
356*9b92b1d3SBarry Smithdifferent libraries; no other message-passing system provides this support. All of the
357*9b92b1d3SBarry Smithmajor parallel computer vendors were involved in the design of MPI and have committed to
358*9b92b1d3SBarry Smithproviding quality implementations.
359*9b92b1d3SBarry Smith
360*9b92b1d3SBarry SmithIn addition, since MPI is a standard, several different groups have already provided
361*9b92b1d3SBarry Smithcomplete free implementations. Thus, one does not have to rely on the technical skills of
362*9b92b1d3SBarry Smithone particular group to provide the message-passing libraries. Today, MPI is the only
363*9b92b1d3SBarry Smithpractical, portable approach to writing efficient parallel numerical software.
364*9b92b1d3SBarry Smith
365*9b92b1d3SBarry Smith(invalid_mpi_compilers)=
366*9b92b1d3SBarry Smith
367*9b92b1d3SBarry Smith### What do I do if my MPI compiler wrappers are invalid?
368*9b92b1d3SBarry Smith
369*9b92b1d3SBarry SmithMost MPI implementations provide compiler wrappers (such as `mpicc`) which give the
370*9b92b1d3SBarry Smithinclude and link options necessary to use that version of MPI to the underlying compilers
371*9b92b1d3SBarry Smith. Configuration will fail if these wrappers are either absent or broken in the MPI pointed to by
372*9b92b1d3SBarry Smith`--with-mpi-dir`. You can rerun the configure with the additional option
373*9b92b1d3SBarry Smith`--with-mpi-compilers=0`, which will try to auto-detect working compilers; however,
374*9b92b1d3SBarry Smiththese compilers may be incompatible with the particular MPI build. If this fix does not
375*9b92b1d3SBarry Smithwork, run with `--with-cc=[your_c_compiler]` where you know `your_c_compiler` works
376*9b92b1d3SBarry Smithwith this particular MPI, and likewise for C++ (`--with-cxx=[your_cxx_compiler]`) and Fortran
377*9b92b1d3SBarry Smith(`--with-fc=[your_fortran_compiler]`).
378*9b92b1d3SBarry Smith
379*9b92b1d3SBarry Smith(bit_indices)=
380*9b92b1d3SBarry Smith
381*9b92b1d3SBarry Smith### When should/can I use the `configure` option `--with-64-bit-indices`?
382*9b92b1d3SBarry Smith
383*9b92b1d3SBarry SmithBy default the type that PETSc uses to index into arrays and keep sizes of arrays is a
384*9b92b1d3SBarry Smith`PetscInt` defined to be a 32-bit `int`. If your problem:
385*9b92b1d3SBarry Smith
386*9b92b1d3SBarry Smith- Involves more than 2^31 - 1 unknowns (around 2 billion).
387*9b92b1d3SBarry Smith- Your matrix might contain more than 2^31 - 1 nonzeros on a single process.
388*9b92b1d3SBarry Smith
389*9b92b1d3SBarry SmithThen you need to use this option. Otherwise you will get strange crashes.
390*9b92b1d3SBarry Smith
391*9b92b1d3SBarry SmithThis option can be used when you are using either 32 or 64-bit pointers. You do not
392*9b92b1d3SBarry Smithneed to use this option if you are using 64-bit pointers unless the two conditions above
393*9b92b1d3SBarry Smithhold.
394*9b92b1d3SBarry Smith
395*9b92b1d3SBarry Smith### What if I get an internal compiler error?
396*9b92b1d3SBarry Smith
397*9b92b1d3SBarry SmithYou can rebuild the offending file individually with a lower optimization level. **Then
398*9b92b1d3SBarry Smithmake sure to complain to the compiler vendor and file a bug report**. For example, if the
399*9b92b1d3SBarry Smithcompiler chokes on `src/mat/impls/baij/seq/baijsolvtrannat.c` you can run the following
400*9b92b1d3SBarry Smithfrom `$PETSC_DIR`:
401*9b92b1d3SBarry Smith
402*9b92b1d3SBarry Smith```console
403*9b92b1d3SBarry Smith$ make -f gmakefile PCC_FLAGS="-O1" $PETSC_ARCH/obj/src/mat/impls/baij/seq/baijsolvtrannat.o
404*9b92b1d3SBarry Smith$ make all
405*9b92b1d3SBarry Smith```
406*9b92b1d3SBarry Smith
407*9b92b1d3SBarry Smith### How do I enable Python bindings (petsc4py) with PETSc?
408*9b92b1d3SBarry Smith
409*9b92b1d3SBarry Smith1. Install [Cython](https://cython.org/).
410*9b92b1d3SBarry Smith2. `configure` PETSc with the `--with-petsc4py=1` option.
411*9b92b1d3SBarry Smith3. set `PYTHONPATH=$PETSC_DIR/$PETSC_ARCH/lib`
412*9b92b1d3SBarry Smith
413*9b92b1d3SBarry Smith(macos_gfortran)=
414*9b92b1d3SBarry Smith
415*9b92b1d3SBarry Smith### What Fortran compiler do you recommend on macOS?
416*9b92b1d3SBarry Smith
417*9b92b1d3SBarry SmithWe recommend using [homebrew](https://brew.sh/) to install [gfortran](https://gcc.gnu.org/wiki/GFortran), see {any}`doc_macos_install`
418*9b92b1d3SBarry Smith
419*9b92b1d3SBarry Smith### How can I find the URL locations of the packages you install using `--download-PACKAGE`?
420*9b92b1d3SBarry Smith
421*9b92b1d3SBarry Smith```console
422*9b92b1d3SBarry Smith$ grep "self.download " $PETSC_DIR/config/BuildSystem/config/packages/*.py
423*9b92b1d3SBarry Smith```
424*9b92b1d3SBarry Smith
425*9b92b1d3SBarry Smith### How to fix the problem: PETSc was configured with one MPICH (or Open MPI) `mpi.h` version but now appears to be compiling using a different MPICH (or Open MPI) `mpi.h` version
426*9b92b1d3SBarry Smith
427*9b92b1d3SBarry SmithThis happens for generally one of two reasons:
428*9b92b1d3SBarry Smith
429*9b92b1d3SBarry Smith- You previously ran `configure` with the option `--download-mpich` (or `--download-openmpi`)
430*9b92b1d3SBarry Smith  but later ran `configure` to use a version of MPI already installed on the
431*9b92b1d3SBarry Smith  machine. Solution:
432*9b92b1d3SBarry Smith
433*9b92b1d3SBarry Smith  ```console
434*9b92b1d3SBarry Smith  $ rm -rf $PETSC_DIR/$PETSC_ARCH
435*9b92b1d3SBarry Smith  $ ./configure --your-args
436*9b92b1d3SBarry Smith  ```
437*9b92b1d3SBarry Smith
438*9b92b1d3SBarry Smith(mpi_network_misconfigure)=
439*9b92b1d3SBarry Smith
440*9b92b1d3SBarry Smith### What does it mean when `make check` hangs or errors on `PetscOptionsInsertFile()`?
441*9b92b1d3SBarry Smith
442*9b92b1d3SBarry SmithFor example:
443*9b92b1d3SBarry Smith
444*9b92b1d3SBarry Smith```none
445*9b92b1d3SBarry SmithPossible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes
446*9b92b1d3SBarry SmithSee https://petsc.org/release/faq/
447*9b92b1d3SBarry Smith[0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
448*9b92b1d3SBarry Smith[0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
449*9b92b1d3SBarry Smith[0]PETSC ERROR: #3 PetscInitialize() line 828 in /Users/barrysmith/Src/PETSc/src/sys/objects/pinit.c
450*9b92b1d3SBarry Smith```
451*9b92b1d3SBarry Smith
452*9b92b1d3SBarry Smithor
453*9b92b1d3SBarry Smith
454*9b92b1d3SBarry Smith```none
455*9b92b1d3SBarry Smith$ make check
456*9b92b1d3SBarry SmithRunning check examples to verify correct installation
457*9b92b1d3SBarry SmithUsing PETSC_DIR=/Users/barrysmith/Src/petsc and PETSC_ARCH=arch-fix-mpiexec-hang-2-ranks
458*9b92b1d3SBarry SmithC/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
459*9b92b1d3SBarry SmithPROGRAM SEEMS TO BE HANGING HERE
460*9b92b1d3SBarry Smith```
461*9b92b1d3SBarry Smith
462*9b92b1d3SBarry SmithThis usually occurs when network settings are misconfigured (perhaps due to VPN) resulting in a failure or hang in system call `gethostbyname()`.
463*9b92b1d3SBarry Smith
464*9b92b1d3SBarry Smith- Verify you are using the correct `mpiexec` for the MPI you have linked PETSc with.
465*9b92b1d3SBarry Smith
466*9b92b1d3SBarry Smith- If you have a VPN enabled on your machine, try turning it off and then running `make check` to
467*9b92b1d3SBarry Smith  verify that it is not the VPN playing poorly with MPI.
468*9b92b1d3SBarry Smith
469*9b92b1d3SBarry Smith- If ``` ping `hostname` `` ( ```/sbin/ping\`\` on macOS) fails or hangs do:
470*9b92b1d3SBarry Smith
471*9b92b1d3SBarry Smith  ```none
472*9b92b1d3SBarry Smith  echo 127.0.0.1 `hostname` | sudo tee -a /etc/hosts
473*9b92b1d3SBarry Smith  ```
474*9b92b1d3SBarry Smith
475*9b92b1d3SBarry Smith  and try `make check` again.
476*9b92b1d3SBarry Smith
477*9b92b1d3SBarry Smith- Try completely disconnecting your machine from the network and see if `make check` then works
478*9b92b1d3SBarry Smith
479*9b92b1d3SBarry Smith- Try the PETSc `configure` option `--download-mpich-device=ch3:nemesis` with `--download-mpich`.
480*9b92b1d3SBarry Smith
481*9b92b1d3SBarry Smith______________________________________________________________________
482*9b92b1d3SBarry Smith
483*9b92b1d3SBarry Smith## Usage
484*9b92b1d3SBarry Smith
485*9b92b1d3SBarry Smith### How can I redirect PETSc's `stdout` and `stderr` when programming with a GUI interface in Windows Developer Studio or to C++ streams?
486*9b92b1d3SBarry Smith
487*9b92b1d3SBarry SmithTo overload just the error messages write your own `MyPrintError()` function that does
488*9b92b1d3SBarry Smithwhatever you want (including pop up windows etc) and use it like below.
489*9b92b1d3SBarry Smith
490*9b92b1d3SBarry Smith```c
491*9b92b1d3SBarry Smithextern "C" {
492*9b92b1d3SBarry Smith  int PASCAL WinMain(HINSTANCE,HINSTANCE,LPSTR,int);
493*9b92b1d3SBarry Smith};
494*9b92b1d3SBarry Smith
495*9b92b1d3SBarry Smith#include <petscsys.h>
496*9b92b1d3SBarry Smith#include <mpi.h>
497*9b92b1d3SBarry Smith
498*9b92b1d3SBarry Smithconst char help[] = "Set up from main";
499*9b92b1d3SBarry Smith
500*9b92b1d3SBarry Smithint MyPrintError(const char error[], ...)
501*9b92b1d3SBarry Smith{
502*9b92b1d3SBarry Smith  printf("%s", error);
503*9b92b1d3SBarry Smith  return 0;
504*9b92b1d3SBarry Smith}
505*9b92b1d3SBarry Smith
506*9b92b1d3SBarry Smithint main(int ac, char *av[])
507*9b92b1d3SBarry Smith{
508*9b92b1d3SBarry Smith  char           buf[256];
509*9b92b1d3SBarry Smith  HINSTANCE      inst;
510*9b92b1d3SBarry Smith  PetscErrorCode ierr;
511*9b92b1d3SBarry Smith
512*9b92b1d3SBarry Smith  inst = (HINSTANCE)GetModuleHandle(NULL);
513*9b92b1d3SBarry Smith  PetscErrorPrintf = MyPrintError;
514*9b92b1d3SBarry Smith
515*9b92b1d3SBarry Smith  buf[0] = 0;
516*9b92b1d3SBarry Smith  for (int i = 1; i < ac; ++i) {
517*9b92b1d3SBarry Smith    strcat(buf, av[i]);
518*9b92b1d3SBarry Smith    strcat(buf, " ");
519*9b92b1d3SBarry Smith  }
520*9b92b1d3SBarry Smith
521*9b92b1d3SBarry Smith  PetscCall(PetscInitialize(&ac, &av, NULL, help));
522*9b92b1d3SBarry Smith
523*9b92b1d3SBarry Smith  return WinMain(inst, NULL, buf, SW_SHOWNORMAL);
524*9b92b1d3SBarry Smith}
525*9b92b1d3SBarry Smith```
526*9b92b1d3SBarry Smith
527*9b92b1d3SBarry SmithPlace this file in the project and compile with this preprocessor definitions:
528*9b92b1d3SBarry Smith
529*9b92b1d3SBarry Smith```
530*9b92b1d3SBarry SmithWIN32
531*9b92b1d3SBarry Smith_DEBUG
532*9b92b1d3SBarry Smith_CONSOLE
533*9b92b1d3SBarry Smith_MBCS
534*9b92b1d3SBarry SmithUSE_PETSC_LOG
535*9b92b1d3SBarry SmithUSE_PETSC_BOPT_g
536*9b92b1d3SBarry SmithUSE_PETSC_STACK
537*9b92b1d3SBarry Smith_AFXDLL
538*9b92b1d3SBarry Smith```
539*9b92b1d3SBarry Smith
540*9b92b1d3SBarry SmithAnd these link options:
541*9b92b1d3SBarry Smith
542*9b92b1d3SBarry Smith```
543*9b92b1d3SBarry Smith/nologo
544*9b92b1d3SBarry Smith/subsystem:console
545*9b92b1d3SBarry Smith/incremental:yes
546*9b92b1d3SBarry Smith/debug
547*9b92b1d3SBarry Smith/machine:I386
548*9b92b1d3SBarry Smith/nodefaultlib:"libcmtd.lib"
549*9b92b1d3SBarry Smith/nodefaultlib:"libcd.lib"
550*9b92b1d3SBarry Smith/nodefaultlib:"mvcrt.lib"
551*9b92b1d3SBarry Smith/pdbtype:sept
552*9b92b1d3SBarry Smith```
553*9b92b1d3SBarry Smith
554*9b92b1d3SBarry Smith:::{note}
555*9b92b1d3SBarry SmithThe above is compiled and linked as if it was a console program. The linker will search
556*9b92b1d3SBarry Smithfor a main, and then from it the `WinMain` will start. This works with MFC templates and
557*9b92b1d3SBarry Smithderived classes too.
558*9b92b1d3SBarry Smith
559*9b92b1d3SBarry SmithWhen writing a Window's console application you do not need to do anything, the `stdout`
560*9b92b1d3SBarry Smithand `stderr` is automatically output to the console window.
561*9b92b1d3SBarry Smith:::
562*9b92b1d3SBarry Smith
563*9b92b1d3SBarry SmithTo change where all PETSc `stdout` and `stderr` go, (you can also reassign
564*9b92b1d3SBarry Smith`PetscVFPrintf()` to handle `stdout` and `stderr` any way you like) write the
565*9b92b1d3SBarry Smithfollowing function:
566*9b92b1d3SBarry Smith
567*9b92b1d3SBarry Smith```
568*9b92b1d3SBarry SmithPetscErrorCode mypetscvfprintf(FILE *fd, const char format[], va_list Argp)
569*9b92b1d3SBarry Smith{
570*9b92b1d3SBarry Smith  PetscFunctionBegin;
571*9b92b1d3SBarry Smith  if (fd != stdout && fd != stderr) { /* handle regular files */
572*9b92b1d3SBarry Smith    PetscCall(PetscVFPrintfDefault(fd, format, Argp));
573*9b92b1d3SBarry Smith  } else {
574*9b92b1d3SBarry Smith    char buff[1024]; /* Make sure to assign a large enough buffer */
575*9b92b1d3SBarry Smith    int  length;
576*9b92b1d3SBarry Smith
577*9b92b1d3SBarry Smith    PetscCall(PetscVSNPrintf(buff, 1024, format, &length, Argp));
578*9b92b1d3SBarry Smith
579*9b92b1d3SBarry Smith    /* now send buff to whatever stream or whatever you want */
580*9b92b1d3SBarry Smith  }
581*9b92b1d3SBarry Smith  PetscFunctionReturn(PETSC_SUCCESS);
582*9b92b1d3SBarry Smith}
583*9b92b1d3SBarry Smith```
584*9b92b1d3SBarry Smith
585*9b92b1d3SBarry SmithThen assign `PetscVFPrintf = mypetscprintf` before `PetscInitialize()` in your main
586*9b92b1d3SBarry Smithprogram.
587*9b92b1d3SBarry Smith
588*9b92b1d3SBarry Smith### I want to use Hypre boomerAMG without GMRES but when I run `-pc_type hypre -pc_hypre_type boomeramg -ksp_type preonly` I don't get a very accurate answer!
589*9b92b1d3SBarry Smith
590*9b92b1d3SBarry SmithYou should run with `-ksp_type richardson` to have PETSc run several V or W
591*9b92b1d3SBarry Smithcycles. `-ksp_type preonly` causes boomerAMG to use only one V/W cycle. You can control
592*9b92b1d3SBarry Smithhow many cycles are used in a single application of the boomerAMG preconditioner with
593*9b92b1d3SBarry Smith`-pc_hypre_boomeramg_max_iter <it>` (the default is 1). You can also control the
594*9b92b1d3SBarry Smithtolerance boomerAMG uses to decide if to stop before `max_iter` with
595*9b92b1d3SBarry Smith`-pc_hypre_boomeramg_tol <tol>` (the default is 1.e-7). Run with `-ksp_view` to see
596*9b92b1d3SBarry Smithall the hypre options used and `-help | grep boomeramg` to see all the command line
597*9b92b1d3SBarry Smithoptions.
598*9b92b1d3SBarry Smith
599*9b92b1d3SBarry Smith### How do I use PETSc for Domain Decomposition?
600*9b92b1d3SBarry Smith
601*9b92b1d3SBarry SmithPETSc includes Additive Schwarz methods in the suite of preconditioners under the umbrella
602*9b92b1d3SBarry Smithof `PCASM`. These may be activated with the runtime option `-pc_type asm`. Various
603*9b92b1d3SBarry Smithother options may be set, including the degree of overlap `-pc_asm_overlap <number>` the
604*9b92b1d3SBarry Smithtype of restriction/extension `-pc_asm_type [basic,restrict,interpolate,none]` sets ASM
605*9b92b1d3SBarry Smithtype and several others. You may see the available ASM options by using `-pc_type asm
606*9b92b1d3SBarry Smith-help`. See the procedural interfaces in the manual pages, for example `PCASMType()`
607*9b92b1d3SBarry Smithand check the index of the users manual for `PCASMCreateSubdomains()`.
608*9b92b1d3SBarry Smith
609*9b92b1d3SBarry SmithPETSc also contains a domain decomposition inspired wirebasket or face based two level
610*9b92b1d3SBarry Smithmethod where the coarse mesh to fine mesh interpolation is defined by solving specific
611*9b92b1d3SBarry Smithlocal subdomain problems. It currently only works for 3D scalar problems on structured
612*9b92b1d3SBarry Smithgrids created with PETSc `DMDA`. See the manual page for `PCEXOTIC` and
613*9b92b1d3SBarry Smith`src/ksp/ksp/tutorials/ex45.c` for an example.
614*9b92b1d3SBarry Smith
615*9b92b1d3SBarry SmithPETSc also contains a balancing Neumann-Neumann type preconditioner, see the manual page
616*9b92b1d3SBarry Smithfor `PCBDDC`. This requires matrices be constructed with `MatCreateIS()` via the finite
617*9b92b1d3SBarry Smithelement method. See `src/ksp/ksp/tests/ex59.c` for an example.
618*9b92b1d3SBarry Smith
619*9b92b1d3SBarry Smith### You have `MATAIJ` and `MATBAIJ` matrix formats, and `MATSBAIJ` for symmetric storage, how come no `MATSAIJ`?
620*9b92b1d3SBarry Smith
621*9b92b1d3SBarry SmithJust for historical reasons; the `MATSBAIJ` format with blocksize one is just as efficient as
622*9b92b1d3SBarry Smitha `MATSAIJ` would be.
623*9b92b1d3SBarry Smith
624*9b92b1d3SBarry Smith### Can I Create BAIJ matrices with different size blocks for different block rows?
625*9b92b1d3SBarry Smith
626*9b92b1d3SBarry SmithNo. The `MATBAIJ` format only supports a single fixed block size on the entire
627*9b92b1d3SBarry Smithmatrix. But the `MATAIJ` format automatically searches for matching rows and thus still
628*9b92b1d3SBarry Smithtakes advantage of the natural blocks in your matrix to obtain good performance.
629*9b92b1d3SBarry Smith
630*9b92b1d3SBarry Smith:::{note}
631*9b92b1d3SBarry SmithIf you use `MATAIJ` you cannot use the `MatSetValuesBlocked()`.
632*9b92b1d3SBarry Smith:::
633*9b92b1d3SBarry Smith
634*9b92b1d3SBarry Smith### How do I access the values of a remote parallel PETSc Vec?
635*9b92b1d3SBarry Smith
636*9b92b1d3SBarry Smith1. On each process create a local `Vec` large enough to hold all the values it wishes to
637*9b92b1d3SBarry Smith   access.
638*9b92b1d3SBarry Smith2. Create a `VecScatter` that scatters from the parallel `Vec` into the local `Vec`.
639*9b92b1d3SBarry Smith3. Use `VecGetArray()` to access the values in the local `Vec`.
640*9b92b1d3SBarry Smith
641*9b92b1d3SBarry SmithFor example, assuming we have distributed a vector `vecGlobal` of size $N$ to
642*9b92b1d3SBarry Smith$R$ ranks and each remote rank holds $N/R = m$ values (similarly assume that
643*9b92b1d3SBarry Smith$N$ is cleanly divisible by $R$). We want each rank $r$ to gather the
644*9b92b1d3SBarry Smithfirst $n$ (also assume $n \leq m$) values from its immediately superior neighbor
645*9b92b1d3SBarry Smith$r+1$ (final rank will retrieve from rank 0).
646*9b92b1d3SBarry Smith
647*9b92b1d3SBarry Smith```
648*9b92b1d3SBarry SmithVec            vecLocal;
649*9b92b1d3SBarry SmithIS             isLocal, isGlobal;
650*9b92b1d3SBarry SmithVecScatter     ctx;
651*9b92b1d3SBarry SmithPetscScalar    *arr;
652*9b92b1d3SBarry SmithPetscInt       N, firstGlobalIndex;
653*9b92b1d3SBarry SmithMPI_Comm       comm;
654*9b92b1d3SBarry SmithPetscMPIInt    r, R;
655*9b92b1d3SBarry Smith
656*9b92b1d3SBarry Smith/* Create sequential local vector, big enough to hold local portion */
657*9b92b1d3SBarry SmithPetscCall(VecCreateSeq(PETSC_COMM_SELF, n, &vecLocal));
658*9b92b1d3SBarry Smith
659*9b92b1d3SBarry Smith/* Create IS to describe where we want to scatter to */
660*9b92b1d3SBarry SmithPetscCall(ISCreateStride(PETSC_COMM_SELF, n, 0, 1, &isLocal));
661*9b92b1d3SBarry Smith
662*9b92b1d3SBarry Smith/* Compute the global indices */
663*9b92b1d3SBarry SmithPetscCall(VecGetSize(vecGlobal, &N));
664*9b92b1d3SBarry SmithPetscCall(PetscObjectGetComm((PetscObject) vecGlobal, &comm));
665*9b92b1d3SBarry SmithPetscCallMPI(MPI_Comm_rank(comm, &r));
666*9b92b1d3SBarry SmithPetscCallMPI(MPI_Comm_size(comm, &R));
667*9b92b1d3SBarry SmithfirstGlobalIndex = r == R-1 ? 0 : (N/R)*(r+1);
668*9b92b1d3SBarry Smith
669*9b92b1d3SBarry Smith/* Create IS that describes where we want to scatter from */
670*9b92b1d3SBarry SmithPetscCall(ISCreateStride(comm, n, firstGlobalIndex, 1, &isGlobal));
671*9b92b1d3SBarry Smith
672*9b92b1d3SBarry Smith/* Create the VecScatter context */
673*9b92b1d3SBarry SmithPetscCall(VecScatterCreate(vecGlobal, isGlobal, vecLocal, isLocal, &ctx));
674*9b92b1d3SBarry Smith
675*9b92b1d3SBarry Smith/* Gather the values */
676*9b92b1d3SBarry SmithPetscCall(VecScatterBegin(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD));
677*9b92b1d3SBarry SmithPetscCall(VecScatterEnd(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD));
678*9b92b1d3SBarry Smith
679*9b92b1d3SBarry Smith/* Retrieve and do work */
680*9b92b1d3SBarry SmithPetscCall(VecGetArray(vecLocal, &arr));
681*9b92b1d3SBarry Smith/* Work */
682*9b92b1d3SBarry SmithPetscCall(VecRestoreArray(vecLocal, &arr));
683*9b92b1d3SBarry Smith
684*9b92b1d3SBarry Smith/* Don't forget to clean up */
685*9b92b1d3SBarry SmithPetscCall(ISDestroy(&isLocal));
686*9b92b1d3SBarry SmithPetscCall(ISDestroy(&isGlobal));
687*9b92b1d3SBarry SmithPetscCall(VecScatterDestroy(&ctx));
688*9b92b1d3SBarry SmithPetscCall(VecDestroy(&vecLocal));
689*9b92b1d3SBarry Smith```
690*9b92b1d3SBarry Smith
691*9b92b1d3SBarry Smith(doc_faq_usage_alltoone)=
692*9b92b1d3SBarry Smith
693*9b92b1d3SBarry Smith### How do I collect to a single processor all the values from a parallel PETSc Vec?
694*9b92b1d3SBarry Smith
695*9b92b1d3SBarry Smith1. Create the `VecScatter` context that will do the communication:
696*9b92b1d3SBarry Smith
697*9b92b1d3SBarry Smith   ```
698*9b92b1d3SBarry Smith   Vec        in_par,out_seq;
699*9b92b1d3SBarry Smith   VecScatter ctx;
700*9b92b1d3SBarry Smith
701*9b92b1d3SBarry Smith   PetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq));
702*9b92b1d3SBarry Smith   ```
703*9b92b1d3SBarry Smith
704*9b92b1d3SBarry Smith2. Initiate the communication (this may be repeated if you wish):
705*9b92b1d3SBarry Smith
706*9b92b1d3SBarry Smith   ```
707*9b92b1d3SBarry Smith   PetscCall(VecScatterBegin(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD));
708*9b92b1d3SBarry Smith   PetscCall(VecScatterEnd(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD));
709*9b92b1d3SBarry Smith   /* May destroy context now if no additional scatters are needed, otherwise reuse it */
710*9b92b1d3SBarry Smith   PetscCall(VecScatterDestroy(&ctx));
711*9b92b1d3SBarry Smith   ```
712*9b92b1d3SBarry Smith
713*9b92b1d3SBarry SmithNote that this simply concatenates in the parallel ordering of the vector (computed by the
714*9b92b1d3SBarry Smith`MPI_Comm_rank` of the owning process). If you are using a `Vec` from
715*9b92b1d3SBarry Smith`DMCreateGlobalVector()` you likely want to first call `DMDAGlobalToNaturalBegin()`
716*9b92b1d3SBarry Smithfollowed by `DMDAGlobalToNaturalEnd()` to scatter the original `Vec` into the natural
717*9b92b1d3SBarry Smithordering in a new global `Vec` before calling `VecScatterBegin()`/`VecScatterEnd()`
718*9b92b1d3SBarry Smithto scatter the natural `Vec` onto all processes.
719*9b92b1d3SBarry Smith
720*9b92b1d3SBarry Smith### How do I collect all the values from a parallel PETSc Vec on the 0th rank?
721*9b92b1d3SBarry Smith
722*9b92b1d3SBarry SmithSee FAQ entry on collecting to {ref}`an arbitrary processor <doc_faq_usage_alltoone>`, but
723*9b92b1d3SBarry Smithreplace
724*9b92b1d3SBarry Smith
725*9b92b1d3SBarry Smith```
726*9b92b1d3SBarry SmithPetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq));
727*9b92b1d3SBarry Smith```
728*9b92b1d3SBarry Smith
729*9b92b1d3SBarry Smithwith
730*9b92b1d3SBarry Smith
731*9b92b1d3SBarry Smith```
732*9b92b1d3SBarry SmithPetscCall(VecScatterCreateToZero(in_par, &ctx, &out_seq));
733*9b92b1d3SBarry Smith```
734*9b92b1d3SBarry Smith
735*9b92b1d3SBarry Smith:::{note}
736*9b92b1d3SBarry SmithThe same ordering considerations as discussed in the aforementioned entry also apply
737*9b92b1d3SBarry Smithhere.
738*9b92b1d3SBarry Smith:::
739*9b92b1d3SBarry Smith
740*9b92b1d3SBarry Smith### How can I read in or write out a sparse matrix in Matrix Market, Harwell-Boeing, Slapc or other ASCII format?
741*9b92b1d3SBarry Smith
742*9b92b1d3SBarry SmithIf you can read or write your matrix using Python or MATLAB/Octave, `PetscBinaryIO`
743*9b92b1d3SBarry Smithmodules are provided at `$PETSC_DIR/lib/petsc/bin` for each language that can assist
744*9b92b1d3SBarry Smithwith reading and writing. If you just want to convert `MatrixMarket`, you can use:
745*9b92b1d3SBarry Smith
746*9b92b1d3SBarry Smith```console
747*9b92b1d3SBarry Smith$ python -m $PETSC_DIR/lib/petsc/bin/PetscBinaryIO convert matrix.mtx
748*9b92b1d3SBarry Smith```
749*9b92b1d3SBarry Smith
750*9b92b1d3SBarry SmithTo produce `matrix.petsc`.
751*9b92b1d3SBarry Smith
752*9b92b1d3SBarry SmithYou can also call the script directly or import it from your Python code. There is also a
753*9b92b1d3SBarry Smith`PETScBinaryIO.jl` Julia package.
754*9b92b1d3SBarry Smith
755*9b92b1d3SBarry SmithFor other formats, either adapt one of the above libraries or see the examples in
756*9b92b1d3SBarry Smith`$PETSC_DIR/src/mat/tests`, specifically `ex72.c` or `ex78.c`. You will likely need
757*9b92b1d3SBarry Smithto modify the code slightly to match your required ASCII format.
758*9b92b1d3SBarry Smith
759*9b92b1d3SBarry Smith:::{note}
760*9b92b1d3SBarry SmithNever read or write in parallel an ASCII matrix file.
761*9b92b1d3SBarry Smith
762*9b92b1d3SBarry SmithInstead read in sequentially with a standalone code based on `ex72.c` or `ex78.c`
763*9b92b1d3SBarry Smiththen save the matrix with the binary viewer `PetscViewerBinaryOpen()` and load the
764*9b92b1d3SBarry Smithmatrix in parallel in your "real" PETSc program with `MatLoad()`.
765*9b92b1d3SBarry Smith
766*9b92b1d3SBarry SmithFor writing save with the binary viewer and then load with the sequential code to store
767*9b92b1d3SBarry Smithit as ASCII.
768*9b92b1d3SBarry Smith:::
769*9b92b1d3SBarry Smith
770*9b92b1d3SBarry Smith### Does TSSetFromOptions(), SNESSetFromOptions(), or KSPSetFromOptions() reset all the parameters I previously set or how come do they not seem to work?
771*9b92b1d3SBarry Smith
772*9b92b1d3SBarry SmithIf `XXSetFromOptions()` is used (with `-xxx_type aaaa`) to change the type of the
773*9b92b1d3SBarry Smithobject then all parameters associated with the previous type are removed. Otherwise it
774*9b92b1d3SBarry Smithdoes not reset parameters.
775*9b92b1d3SBarry Smith
776*9b92b1d3SBarry Smith`TS/SNES/KSPSetXXX()` commands that set properties for a particular type of object (such
777*9b92b1d3SBarry Smithas `KSPGMRESSetRestart()`) ONLY work if the object is ALREADY of that type. For example,
778*9b92b1d3SBarry Smithwith
779*9b92b1d3SBarry Smith
780*9b92b1d3SBarry Smith```
781*9b92b1d3SBarry SmithKSP ksp;
782*9b92b1d3SBarry Smith
783*9b92b1d3SBarry SmithPetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
784*9b92b1d3SBarry SmithPetscCall(KSPGMRESSetRestart(ksp, 10));
785*9b92b1d3SBarry Smith```
786*9b92b1d3SBarry Smith
787*9b92b1d3SBarry Smiththe restart will be ignored since the type has not yet been set to `KSPGMRES`. To have
788*9b92b1d3SBarry Smiththose values take effect you should do one of the following:
789*9b92b1d3SBarry Smith
790*9b92b1d3SBarry Smith- Allow setting the type from the command line, if it is not on the command line then the
791*9b92b1d3SBarry Smith  default type is automatically set.
792*9b92b1d3SBarry Smith
793*9b92b1d3SBarry Smith```
794*9b92b1d3SBarry Smith/* Create generic object */
795*9b92b1d3SBarry SmithXXXCreate(..,&obj);
796*9b92b1d3SBarry Smith/* Must set all settings here, or default */
797*9b92b1d3SBarry SmithXXXSetFromOptions(obj);
798*9b92b1d3SBarry Smith```
799*9b92b1d3SBarry Smith
800*9b92b1d3SBarry Smith- Hardwire the type in the code, but allow the user to override it via a subsequent
801*9b92b1d3SBarry Smith  `XXXSetFromOptions()` call. This essentially allows the user to customize what the
802*9b92b1d3SBarry Smith  "default" type to of the object.
803*9b92b1d3SBarry Smith
804*9b92b1d3SBarry Smith```
805*9b92b1d3SBarry Smith/* Create generic object */
806*9b92b1d3SBarry SmithXXXCreate(..,&obj);
807*9b92b1d3SBarry Smith/* Set type directly */
808*9b92b1d3SBarry SmithXXXSetYYYYY(obj,...);
809*9b92b1d3SBarry Smith/* Can always change to different type */
810*9b92b1d3SBarry SmithXXXSetFromOptions(obj);
811*9b92b1d3SBarry Smith```
812*9b92b1d3SBarry Smith
813*9b92b1d3SBarry Smith### How do I compile and link my own PETSc application codes and can I use my own `makefile` or rules for compiling code, rather than PETSc's?
814*9b92b1d3SBarry Smith
815*9b92b1d3SBarry SmithSee the {ref}`section <sec_writing_application_codes>` of the users manual on writing
816*9b92b1d3SBarry Smithapplication codes with PETSc.
817*9b92b1d3SBarry Smith
818*9b92b1d3SBarry Smith### Can I use CMake to build my own project that depends on PETSc?
819*9b92b1d3SBarry Smith
820*9b92b1d3SBarry SmithSee the {ref}`section <sec_writing_application_codes>` of the users manual on writing
821*9b92b1d3SBarry Smithapplication codes with PETSc.
822*9b92b1d3SBarry Smith
823*9b92b1d3SBarry Smith### How can I put carriage returns in `PetscPrintf()` statements from Fortran?
824*9b92b1d3SBarry Smith
825*9b92b1d3SBarry SmithYou can use the same notation as in C, just put a `\n` in the string. Note that no other C
826*9b92b1d3SBarry Smithformat instruction is supported.
827*9b92b1d3SBarry Smith
828*9b92b1d3SBarry SmithOr you can use the Fortran concatenation `//` and `char(10)`; for example `'some
829*9b92b1d3SBarry Smithstring'//char(10)//'another string` on the next line.
830*9b92b1d3SBarry Smith
831*9b92b1d3SBarry Smith### How can I implement callbacks using C++ class methods?
832*9b92b1d3SBarry Smith
833*9b92b1d3SBarry SmithDeclare the class method static. Static methods do not have a `this` pointer, but the
834*9b92b1d3SBarry Smith`void*` context parameter will usually be cast to a pointer to the class where it can
835*9b92b1d3SBarry Smithserve the same function.
836*9b92b1d3SBarry Smith
837*9b92b1d3SBarry Smith:::{admonition} Remember
838*9b92b1d3SBarry SmithAll PETSc callbacks return `PetscErrorCode`.
839*9b92b1d3SBarry Smith:::
840*9b92b1d3SBarry Smith
841*9b92b1d3SBarry Smith### Everyone knows that when you code Newton's Method you should compute the function and its Jacobian at the same time. How can one do this in PETSc?
842*9b92b1d3SBarry Smith
843*9b92b1d3SBarry SmithThe update in Newton's method is computed as
844*9b92b1d3SBarry Smith
845*9b92b1d3SBarry Smith$$
846*9b92b1d3SBarry Smithu^{n+1} = u^n - \lambda * \left[J(u^n)] * F(u^n) \right]^{\dagger}
847*9b92b1d3SBarry Smith$$
848*9b92b1d3SBarry Smith
849*9b92b1d3SBarry SmithThe reason PETSc doesn't default to computing both the function and Jacobian at the same
850*9b92b1d3SBarry Smithtime is:
851*9b92b1d3SBarry Smith
852*9b92b1d3SBarry Smith- In order to do the line search $F \left(u^n - \lambda * \text{step} \right)$ may
853*9b92b1d3SBarry Smith  need to be computed for several $\lambda$. The Jacobian is not needed for each of
854*9b92b1d3SBarry Smith  those and one does not know in advance which will be the final $\lambda$ until
855*9b92b1d3SBarry Smith  after the function value is computed, so many extra Jacobians may be computed.
856*9b92b1d3SBarry Smith- In the final step if $|| F(u^p)||$ satisfies the convergence criteria then a
857*9b92b1d3SBarry Smith  Jacobian need not be computed.
858*9b92b1d3SBarry Smith
859*9b92b1d3SBarry SmithYou are free to have your `FormFunction()` compute as much of the Jacobian at that point
860*9b92b1d3SBarry Smithas you like, keep the information in the user context (the final argument to
861*9b92b1d3SBarry Smith`FormFunction()` and `FormJacobian()`) and then retrieve the information in your
862*9b92b1d3SBarry Smith`FormJacobian()` function.
863*9b92b1d3SBarry Smith
864*9b92b1d3SBarry Smith### Computing the Jacobian or preconditioner is time consuming. Is there any way to compute it less often?
865*9b92b1d3SBarry Smith
866*9b92b1d3SBarry SmithPETSc has a variety of ways of lagging the computation of the Jacobian or the
867*9b92b1d3SBarry Smithpreconditioner. They are documented in the manual page for `SNESComputeJacobian()`
868*9b92b1d3SBarry Smithand in the {ref}`users manual <ch_snes>`:
869*9b92b1d3SBarry Smith
870*9b92b1d3SBarry Smith-s
871*9b92b1d3SBarry Smith
872*9b92b1d3SBarry Smithnes_lag_jacobian
873*9b92b1d3SBarry Smith
874*9b92b1d3SBarry Smith(`SNESSetLagJacobian()`) How often Jacobian is rebuilt (use -1 to
875*9b92b1d3SBarry Smithnever rebuild, use -2 to rebuild the next time requested and then
876*9b92b1d3SBarry Smithnever again).
877*9b92b1d3SBarry Smith
878*9b92b1d3SBarry Smith-s
879*9b92b1d3SBarry Smith
880*9b92b1d3SBarry Smithnes_lag_jacobian_persists
881*9b92b1d3SBarry Smith
882*9b92b1d3SBarry Smith(`SNESSetLagJacobianPersists()`) Forces lagging of Jacobian
883*9b92b1d3SBarry Smiththrough multiple `SNES` solves , same as passing -2 to
884*9b92b1d3SBarry Smith`-snes_lag_jacobian`. By default, each new `SNES` solve
885*9b92b1d3SBarry Smithnormally triggers a recomputation of the Jacobian.
886*9b92b1d3SBarry Smith
887*9b92b1d3SBarry Smith-s
888*9b92b1d3SBarry Smith
889*9b92b1d3SBarry Smithnes_lag_preconditioner
890*9b92b1d3SBarry Smith
891*9b92b1d3SBarry Smith(`SNESSetLagPreconditioner()`) how often the preconditioner is
892*9b92b1d3SBarry Smithrebuilt. Note: if you are lagging the Jacobian the system will
893*9b92b1d3SBarry Smithknow that the matrix has not changed and will not recompute the
894*9b92b1d3SBarry Smith(same) preconditioner.
895*9b92b1d3SBarry Smith
896*9b92b1d3SBarry Smith-s
897*9b92b1d3SBarry Smith
898*9b92b1d3SBarry Smithnes_lag_preconditioner_persists
899*9b92b1d3SBarry Smith
900*9b92b1d3SBarry Smith(`SNESSetLagPreconditionerPersists()`) Preconditioner
901*9b92b1d3SBarry Smithlags through multiple `SNES` solves
902*9b92b1d3SBarry Smith
903*9b92b1d3SBarry Smith:::{note}
904*9b92b1d3SBarry SmithThese are often (but does not need to be) used in combination with
905*9b92b1d3SBarry Smith`-snes_mf_operator` which applies the fresh Jacobian matrix-free for every
906*9b92b1d3SBarry Smithmatrix-vector product. Otherwise the out-of-date matrix vector product, computed with
907*9b92b1d3SBarry Smiththe lagged Jacobian will be used.
908*9b92b1d3SBarry Smith:::
909*9b92b1d3SBarry Smith
910*9b92b1d3SBarry SmithBy using `KSPMonitorSet()` and/or `SNESMonitorSet()` one can provide code that monitors the
911*9b92b1d3SBarry Smithconvergence rate and automatically triggers an update of the Jacobian or preconditioner
912*9b92b1d3SBarry Smithbased on decreasing convergence of the iterative method. For example if the number of `SNES`
913*9b92b1d3SBarry Smithiterations doubles one might trigger a new computation of the Jacobian. Experimentation is
914*9b92b1d3SBarry Smiththe only general purpose way to determine which approach is best for your problem.
915*9b92b1d3SBarry Smith
916*9b92b1d3SBarry Smith:::{important}
917*9b92b1d3SBarry SmithIt is also vital to experiment on your true problem at the scale you will be solving
918*9b92b1d3SBarry Smiththe problem since the performance benefits depend on the exact problem and the problem
919*9b92b1d3SBarry Smithsize!
920*9b92b1d3SBarry Smith:::
921*9b92b1d3SBarry Smith
922*9b92b1d3SBarry Smith### How can I use Newton's Method Jacobian free? Can I difference a different function than provided with `SNESSetFunction()`?
923*9b92b1d3SBarry Smith
924*9b92b1d3SBarry SmithThe simplest way is with the option `-snes_mf`, this will use finite differencing of the
925*9b92b1d3SBarry Smithfunction provided to `SNESComputeFunction()` to approximate the action of Jacobian.
926*9b92b1d3SBarry Smith
927*9b92b1d3SBarry Smith:::{important}
928*9b92b1d3SBarry SmithSince no matrix-representation of the Jacobian is provided the `-pc_type` used with
929*9b92b1d3SBarry Smiththis option must be `-pc_type none`. You may provide a custom preconditioner with
930*9b92b1d3SBarry Smith`SNESGetKSP()`, `KSPGetPC()`, and `PCSetType()` and use `PCSHELL`.
931*9b92b1d3SBarry Smith:::
932*9b92b1d3SBarry Smith
933*9b92b1d3SBarry SmithThe option `-snes_mf_operator` will use Jacobian free to apply the Jacobian (in the
934*9b92b1d3SBarry SmithKrylov solvers) but will use whatever matrix you provided with `SNESSetJacobian()`
935*9b92b1d3SBarry Smith(assuming you set one) to compute the preconditioner.
936*9b92b1d3SBarry Smith
937*9b92b1d3SBarry SmithTo write the code (rather than use the options above) use `MatCreateSNESMF()` and pass
938*9b92b1d3SBarry Smiththe resulting matrix object to `SNESSetJacobian()`.
939*9b92b1d3SBarry Smith
940*9b92b1d3SBarry SmithFor purely matrix-free (like `-snes_mf`) pass the matrix object for both matrix
941*9b92b1d3SBarry Smitharguments and pass the function `MatMFFDComputeJacobian()`.
942*9b92b1d3SBarry Smith
943*9b92b1d3SBarry SmithTo provide your own approximate Jacobian matrix to compute the preconditioner (like
944*9b92b1d3SBarry Smith`-snes_mf_operator`), pass this other matrix as the second matrix argument to
945*9b92b1d3SBarry Smith`SNESSetJacobian()`. Make sure your provided `computejacobian()` function calls
946*9b92b1d3SBarry Smith`MatAssemblyBegin()` and `MatAssemblyEnd()` separately on **BOTH** matrix arguments
947*9b92b1d3SBarry Smithto this function. See `src/snes/tests/ex7.c`.
948*9b92b1d3SBarry Smith
949*9b92b1d3SBarry SmithTo difference a different function than that passed to `SNESSetJacobian()` to compute the
950*9b92b1d3SBarry Smithmatrix-free Jacobian multiply call `MatMFFDSetFunction()` to set that other function. See
951*9b92b1d3SBarry Smith`src/snes/tests/ex7.c.h`.
952*9b92b1d3SBarry Smith
953*9b92b1d3SBarry Smith(doc_faq_usage_condnum)=
954*9b92b1d3SBarry Smith
955*9b92b1d3SBarry Smith### How can I determine the condition number of a matrix?
956*9b92b1d3SBarry Smith
957*9b92b1d3SBarry SmithFor small matrices, the condition number can be reliably computed using
958*9b92b1d3SBarry Smith
959*9b92b1d3SBarry Smith```text
960*9b92b1d3SBarry Smith-pc_type svd -pc_svd_monitor
961*9b92b1d3SBarry Smith```
962*9b92b1d3SBarry Smith
963*9b92b1d3SBarry SmithFor larger matrices, you can run with
964*9b92b1d3SBarry Smith
965*9b92b1d3SBarry Smith```text
966*9b92b1d3SBarry Smith-pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000
967*9b92b1d3SBarry Smith```
968*9b92b1d3SBarry Smith
969*9b92b1d3SBarry Smithto get approximations to the condition number of the operator. This will generally be
970*9b92b1d3SBarry Smithaccurate for the largest singular values, but may overestimate the smallest singular value
971*9b92b1d3SBarry Smithunless the method has converged. Make sure to avoid restarts. To estimate the condition
972*9b92b1d3SBarry Smithnumber of the preconditioned operator, use `-pc_type somepc` in the last command.
973*9b92b1d3SBarry Smith
974*9b92b1d3SBarry SmithYou can use [SLEPc](https://slepc.upv.es) for highly scalable, efficient, and quality eigenvalue computations.
975*9b92b1d3SBarry Smith
976*9b92b1d3SBarry Smith### How can I compute the inverse of a matrix in PETSc?
977*9b92b1d3SBarry Smith
978*9b92b1d3SBarry Smith:::{admonition} Are you sure?
979*9b92b1d3SBarry Smith:class: yellow
980*9b92b1d3SBarry Smith
981*9b92b1d3SBarry SmithIt is very expensive to compute the inverse of a matrix and very rarely needed in
982*9b92b1d3SBarry Smithpractice. We highly recommend avoiding algorithms that need it.
983*9b92b1d3SBarry Smith:::
984*9b92b1d3SBarry Smith
985*9b92b1d3SBarry SmithThe inverse of a matrix (dense or sparse) is essentially always dense, so begin by
986*9b92b1d3SBarry Smithcreating a dense matrix B and fill it with the identity matrix (ones along the diagonal),
987*9b92b1d3SBarry Smithalso create a dense matrix X of the same size that will hold the solution. Then factor the
988*9b92b1d3SBarry Smithmatrix you wish to invert with `MatLUFactor()` or `MatCholeskyFactor()`, call the
989*9b92b1d3SBarry Smithresult A. Then call `MatMatSolve(A,B,X)` to compute the inverse into X. See also section
990*9b92b1d3SBarry Smithon {any}`Schur's complement <how_can_i_compute_the_schur_complement>`.
991*9b92b1d3SBarry Smith
992*9b92b1d3SBarry Smith(how_can_i_compute_the_schur_complement)=
993*9b92b1d3SBarry Smith
994*9b92b1d3SBarry Smith### How can I compute the Schur complement in PETSc?
995*9b92b1d3SBarry Smith
996*9b92b1d3SBarry Smith:::{admonition} Are you sure?
997*9b92b1d3SBarry Smith:class: yellow
998*9b92b1d3SBarry Smith
999*9b92b1d3SBarry SmithIt is very expensive to compute the Schur complement of a matrix and very rarely needed
1000*9b92b1d3SBarry Smithin practice. We highly recommend avoiding algorithms that need it.
1001*9b92b1d3SBarry Smith:::
1002*9b92b1d3SBarry Smith
1003*9b92b1d3SBarry SmithThe Schur complement of the matrix $M \in \mathbb{R}^{\left(p+q \right) \times
1004*9b92b1d3SBarry Smith\left(p + q \right)}$
1005*9b92b1d3SBarry Smith
1006*9b92b1d3SBarry Smith$$
1007*9b92b1d3SBarry SmithM = \begin{bmatrix}
1008*9b92b1d3SBarry SmithA & B \\
1009*9b92b1d3SBarry SmithC & D
1010*9b92b1d3SBarry Smith\end{bmatrix}
1011*9b92b1d3SBarry Smith$$
1012*9b92b1d3SBarry Smith
1013*9b92b1d3SBarry Smithwhere
1014*9b92b1d3SBarry Smith
1015*9b92b1d3SBarry Smith$$
1016*9b92b1d3SBarry SmithA \in \mathbb{R}^{p \times p}, \quad B \in \mathbb{R}^{p \times q}, \quad C \in \mathbb{R}^{q \times p}, \quad D \in \mathbb{R}^{q \times q}
1017*9b92b1d3SBarry Smith$$
1018*9b92b1d3SBarry Smith
1019*9b92b1d3SBarry Smithis given by
1020*9b92b1d3SBarry Smith
1021*9b92b1d3SBarry Smith$$
1022*9b92b1d3SBarry SmithS_D := A - BD^{-1}C \\
1023*9b92b1d3SBarry SmithS_A := D - CA^{-1}B
1024*9b92b1d3SBarry Smith$$
1025*9b92b1d3SBarry Smith
1026*9b92b1d3SBarry SmithLike the inverse, the Schur complement of a matrix (dense or sparse) is essentially always
1027*9b92b1d3SBarry Smithdense, so assuming you wish to calculate $S_A = D - C \underbrace{
1028*9b92b1d3SBarry Smith\overbrace{(A^{-1})}^{U} B}_{V}$ begin by:
1029*9b92b1d3SBarry Smith
1030*9b92b1d3SBarry Smith1. Forming a dense matrix $B$
1031*9b92b1d3SBarry Smith2. Also create another dense matrix $V$ of the same size.
1032*9b92b1d3SBarry Smith3. Then either factor the matrix $A$ directly with `MatLUFactor()` or
1033*9b92b1d3SBarry Smith   `MatCholeskyFactor()`, or use `MatGetFactor()` followed by
1034*9b92b1d3SBarry Smith   `MatLUFactorSymbolic()` followed by `MatLUFactorNumeric()` if you wish to use and
1035*9b92b1d3SBarry Smith   external solver package like SuperLU_Dist. Call the result $U$.
1036*9b92b1d3SBarry Smith4. Then call `MatMatSolve(U,B,V)`.
1037*9b92b1d3SBarry Smith5. Then call `MatMatMult(C,V,MAT_INITIAL_MATRIX,1.0,&S)`.
1038*9b92b1d3SBarry Smith6. Now call `MatAXPY(S,-1.0,D,MAT_SUBSET_NONZERO)`.
1039*9b92b1d3SBarry Smith7. Followed by `MatScale(S,-1.0)`.
1040*9b92b1d3SBarry Smith
1041*9b92b1d3SBarry SmithFor computing Schur complements like this it does not make sense to use the `KSP`
1042*9b92b1d3SBarry Smithiterative solvers since for solving many moderate size problems using a direct
1043*9b92b1d3SBarry Smithfactorization is much faster than iterative solvers. As you can see, this requires a great
1044*9b92b1d3SBarry Smithdeal of work space and computation so is best avoided.
1045*9b92b1d3SBarry Smith
1046*9b92b1d3SBarry SmithHowever, it is not necessary to assemble the Schur complement $S$ in order to solve
1047*9b92b1d3SBarry Smithsystems with it. Use `MatCreateSchurComplement(A,A_pre,B,C,D,&S)` to create a
1048*9b92b1d3SBarry Smithmatrix that applies the action of $S$ (using `A_pre` to solve with `A`), but
1049*9b92b1d3SBarry Smithdoes not assemble.
1050*9b92b1d3SBarry Smith
1051*9b92b1d3SBarry SmithAlternatively, if you already have a block matrix `M = [A, B; C, D]` (in some
1052*9b92b1d3SBarry Smithordering), then you can create index sets (`IS`) `isa` and `isb` to address each
1053*9b92b1d3SBarry Smithblock, then use `MatGetSchurComplement()` to create the Schur complement and/or an
1054*9b92b1d3SBarry Smithapproximation suitable for preconditioning.
1055*9b92b1d3SBarry Smith
1056*9b92b1d3SBarry SmithSince $S$ is generally dense, standard preconditioning methods cannot typically be
1057*9b92b1d3SBarry Smithapplied directly to Schur complements. There are many approaches to preconditioning Schur
1058*9b92b1d3SBarry Smithcomplements including using the `SIMPLE` approximation
1059*9b92b1d3SBarry Smith
1060*9b92b1d3SBarry Smith$$
1061*9b92b1d3SBarry SmithD - C \text{diag}(A)^{-1} B
1062*9b92b1d3SBarry Smith$$
1063*9b92b1d3SBarry Smith
1064*9b92b1d3SBarry Smithto create a sparse matrix that approximates the Schur complement (this is returned by
1065*9b92b1d3SBarry Smithdefault for the optional "preconditioning" matrix in `MatGetSchurComplement()`).
1066*9b92b1d3SBarry Smith
1067*9b92b1d3SBarry SmithAn alternative is to interpret the matrices as differential operators and apply
1068*9b92b1d3SBarry Smithapproximate commutator arguments to find a spectrally equivalent operation that can be
1069*9b92b1d3SBarry Smithapplied efficiently (see the "PCD" preconditioners {cite}`elman_silvester_wathen_2014`). A
1070*9b92b1d3SBarry Smithvariant of this is the least squares commutator, which is closely related to the
1071*9b92b1d3SBarry SmithMoore-Penrose pseudoinverse, and is available in `PCLSC` which operates on matrices of
1072*9b92b1d3SBarry Smithtype `MATSCHURCOMPLEMENT`.
1073*9b92b1d3SBarry Smith
1074*9b92b1d3SBarry Smith### Do you have examples of doing unstructured grid Finite Element Method (FEM) with PETSc?
1075*9b92b1d3SBarry Smith
1076*9b92b1d3SBarry SmithThere are at least three ways to write finite element codes using PETSc:
1077*9b92b1d3SBarry Smith
1078*9b92b1d3SBarry Smith1. Use `DMPLEX`, which is a high level approach to manage your mesh and
1079*9b92b1d3SBarry Smith   discretization. See the {ref}`tutorials sections <tut_stokes>` for further information,
1080*9b92b1d3SBarry Smith   or see `src/snes/tutorial/ex62.c`.
1081*9b92b1d3SBarry Smith2. Use packages such as [deal.ii](https://www.dealii.org), [libMesh](https://libmesh.github.io), or
1082*9b92b1d3SBarry Smith   [Firedrake](https://www.firedrakeproject.org), which use PETSc for the solvers.
1083*9b92b1d3SBarry Smith3. Manage the grid data structure yourself and use PETSc `PetscSF`, `IS`, and `VecScatter` to
1084*9b92b1d3SBarry Smith   communicate the required ghost point communication. See
1085*9b92b1d3SBarry Smith   `src/snes/tutorials/ex10d/ex10.c`.
1086*9b92b1d3SBarry Smith
1087*9b92b1d3SBarry Smith### DMDA decomposes the domain differently than the MPI_Cart_create() command. How can one use them together?
1088*9b92b1d3SBarry Smith
1089*9b92b1d3SBarry SmithThe `MPI_Cart_create()` first divides the mesh along the z direction, then the y, then
1090*9b92b1d3SBarry Smiththe x. `DMDA` divides along the x, then y, then z. Thus, for example, rank 1 of the
1091*9b92b1d3SBarry Smithprocesses will be in a different part of the mesh for the two schemes. To resolve this you
1092*9b92b1d3SBarry Smithcan create a new MPI communicator that you pass to `DMDACreate()` that renumbers the
1093*9b92b1d3SBarry Smithprocess ranks so that each physical process shares the same part of the mesh with both the
1094*9b92b1d3SBarry Smith`DMDA` and the `MPI_Cart_create()`. The code to determine the new numbering was
1095*9b92b1d3SBarry Smithprovided by Rolf Kuiper:
1096*9b92b1d3SBarry Smith
1097*9b92b1d3SBarry Smith```
1098*9b92b1d3SBarry Smith// the numbers of processors per direction are (int) x_procs, y_procs, z_procs respectively
1099*9b92b1d3SBarry Smith// (no parallelization in direction 'dir' means dir_procs = 1)
1100*9b92b1d3SBarry Smith
1101*9b92b1d3SBarry SmithMPI_Comm    NewComm;
1102*9b92b1d3SBarry Smithint         x, y, z;
1103*9b92b1d3SBarry SmithPetscMPIInt MPI_Rank, NewRank;
1104*9b92b1d3SBarry Smith
1105*9b92b1d3SBarry Smith// get rank from MPI ordering:
1106*9b92b1d3SBarry SmithPetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &MPI_Rank));
1107*9b92b1d3SBarry Smith
1108*9b92b1d3SBarry Smith// calculate coordinates of cpus in MPI ordering:
1109*9b92b1d3SBarry Smithx = MPI_rank / (z_procs*y_procs);
1110*9b92b1d3SBarry Smithy = (MPI_rank % (z_procs*y_procs)) / z_procs;
1111*9b92b1d3SBarry Smithz = (MPI_rank % (z_procs*y_procs)) % z_procs;
1112*9b92b1d3SBarry Smith
1113*9b92b1d3SBarry Smith// set new rank according to PETSc ordering:
1114*9b92b1d3SBarry SmithNewRank = z*y_procs*x_procs + y*x_procs + x;
1115*9b92b1d3SBarry Smith
1116*9b92b1d3SBarry Smith// create communicator with new ranks according to PETSc ordering
1117*9b92b1d3SBarry SmithPetscCallMPI(MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &NewComm));
1118*9b92b1d3SBarry Smith
1119*9b92b1d3SBarry Smith// override the default communicator (was MPI_COMM_WORLD as default)
1120*9b92b1d3SBarry SmithPETSC_COMM_WORLD = NewComm;
1121*9b92b1d3SBarry Smith```
1122*9b92b1d3SBarry Smith
1123*9b92b1d3SBarry Smith### When solving a system with Dirichlet boundary conditions I can use MatZeroRows() to eliminate the Dirichlet rows but this results in a non-Symmetric system. How can I apply Dirichlet boundary conditions but keep the matrix symmetric?
1124*9b92b1d3SBarry Smith
1125*9b92b1d3SBarry Smith- For nonsymmetric systems put the appropriate boundary solutions in the x vector and use
1126*9b92b1d3SBarry Smith  `MatZeroRows()` followed by `KSPSetOperators()`.
1127*9b92b1d3SBarry Smith- For symmetric problems use `MatZeroRowsColumns()`.
1128*9b92b1d3SBarry Smith- If you have many Dirichlet locations you can use `MatZeroRows()` (**not**
1129*9b92b1d3SBarry Smith  `MatZeroRowsColumns()`) and `-ksp_type preonly -pc_type redistribute` (see
1130*9b92b1d3SBarry Smith  `PCREDISTRIBUTE`) and PETSc will repartition the parallel matrix for load
1131*9b92b1d3SBarry Smith  balancing. In this case the new matrix solved remains symmetric even though
1132*9b92b1d3SBarry Smith  `MatZeroRows()` is used.
1133*9b92b1d3SBarry Smith
1134*9b92b1d3SBarry SmithAn alternative approach is, when assembling the matrix (generating values and passing
1135*9b92b1d3SBarry Smiththem to the matrix), never to include locations for the Dirichlet grid points in the
1136*9b92b1d3SBarry Smithvector and matrix, instead taking them into account as you put the other values into the
1137*9b92b1d3SBarry Smithload.
1138*9b92b1d3SBarry Smith
1139*9b92b1d3SBarry Smith### How can I get PETSc vectors and matrices to MATLAB or vice versa?
1140*9b92b1d3SBarry Smith
1141*9b92b1d3SBarry SmithThere are numerous ways to work with PETSc and MATLAB. All but the first approach
1142*9b92b1d3SBarry Smithrequire PETSc to be configured with --with-matlab.
1143*9b92b1d3SBarry Smith
1144*9b92b1d3SBarry Smith1. To save PETSc `Mat` and `Vec` to files that can be read from MATLAB use
1145*9b92b1d3SBarry Smith   `PetscViewerBinaryOpen()` viewer and `VecView()` or `MatView()` to save objects
1146*9b92b1d3SBarry Smith   for MATLAB and `VecLoad()` and `MatLoad()` to get the objects that MATLAB has
1147*9b92b1d3SBarry Smith   saved. See `share/petsc/matlab/PetscBinaryRead.m` and
1148*9b92b1d3SBarry Smith   `share/petsc/matlab/PetscBinaryWrite.m` for loading and saving the objects in MATLAB.
1149*9b92b1d3SBarry Smith2. Using the [MATLAB Engine](https://www.mathworks.com/help/matlab/calling-matlab-engine-from-c-programs-1.html),
1150*9b92b1d3SBarry Smith   allows PETSc to automatically call MATLAB to perform some specific computations. This
1151*9b92b1d3SBarry Smith   does not allow MATLAB to be used interactively by the user. See the
1152*9b92b1d3SBarry Smith   `PetscMatlabEngine`.
1153*9b92b1d3SBarry Smith3. You can open a socket connection between MATLAB and PETSc to allow sending objects back
1154*9b92b1d3SBarry Smith   and forth between an interactive MATLAB session and a running PETSc program. See
1155*9b92b1d3SBarry Smith   `PetscViewerSocketOpen()` for access from the PETSc side and
1156*9b92b1d3SBarry Smith   `share/petsc/matlab/PetscReadBinary.m` for access from the MATLAB side.
1157*9b92b1d3SBarry Smith4. You can save PETSc `Vec` (**not** `Mat`) with the `PetscViewerMatlabOpen()`
1158*9b92b1d3SBarry Smith   viewer that saves `.mat` files can then be loaded into MATLAB using the `load()` command
1159*9b92b1d3SBarry Smith
1160*9b92b1d3SBarry Smith### How do I get started with Cython so that I can extend petsc4py?
1161*9b92b1d3SBarry Smith
1162*9b92b1d3SBarry Smith1. Learn how to [build a Cython module](http://docs.cython.org/src/quickstart/build.html).
1163*9b92b1d3SBarry Smith2. Go through the simple [example](https://stackoverflow.com/questions/3046305/simple-wrapping-of-c-code-with-cython). Note
1164*9b92b1d3SBarry Smith   also the next comment that shows how to create numpy arrays in the Cython and pass them
1165*9b92b1d3SBarry Smith   back.
1166*9b92b1d3SBarry Smith3. Check out [this page](http://docs.cython.org/src/tutorial/numpy.html) which tells
1167*9b92b1d3SBarry Smith   you how to get fast indexing.
1168*9b92b1d3SBarry Smith4. Have a look at the petsc4py [array source](http://code.google.com/p/petsc4py/source/browse/src/PETSc/arraynpy.pxi).
1169*9b92b1d3SBarry Smith
1170*9b92b1d3SBarry Smith### How do I compute a custom norm for KSP to use as a convergence test or for monitoring?
1171*9b92b1d3SBarry Smith
1172*9b92b1d3SBarry SmithYou need to call `KSPBuildResidual()` on your `KSP` object and then compute the
1173*9b92b1d3SBarry Smithappropriate norm on the resulting residual. Note that depending on the
1174*9b92b1d3SBarry Smith`KSPSetNormType()` of the method you may not return the same norm as provided by the
1175*9b92b1d3SBarry Smithmethod. See also `KSPSetPCSide()`.
1176*9b92b1d3SBarry Smith
1177*9b92b1d3SBarry Smith### If I have a sequential program can I use a PETSc parallel solver?
1178*9b92b1d3SBarry Smith
1179*9b92b1d3SBarry Smith:::{important}
1180*9b92b1d3SBarry SmithDo not expect to get great speedups! Much of the speedup gained by using parallel
1181*9b92b1d3SBarry Smithsolvers comes from building the underlying matrices and vectors in parallel to begin
1182*9b92b1d3SBarry Smithwith. You should see some reduction in the time for the linear solvers.
1183*9b92b1d3SBarry Smith:::
1184*9b92b1d3SBarry Smith
1185*9b92b1d3SBarry SmithYes, you must set up PETSc with MPI (even though you will not use MPI) with at least the
1186*9b92b1d3SBarry Smithfollowing options:
1187*9b92b1d3SBarry Smith
1188*9b92b1d3SBarry Smith```console
1189*9b92b1d3SBarry Smith$ ./configure --download-superlu_dist --download-parmetis --download-metis --with-openmp
1190*9b92b1d3SBarry Smith```
1191*9b92b1d3SBarry Smith
1192*9b92b1d3SBarry SmithYour compiler must support OpenMP. To have the linear solver run in parallel run your
1193*9b92b1d3SBarry Smithprogram with
1194*9b92b1d3SBarry Smith
1195*9b92b1d3SBarry Smith```console
1196*9b92b1d3SBarry Smith$ OMP_NUM_THREADS=n ./myprog -pc_type lu -pc_factor_mat_solver superlu_dist
1197*9b92b1d3SBarry Smith```
1198*9b92b1d3SBarry Smith
1199*9b92b1d3SBarry Smithwhere `n` is the number of threads and should be less than or equal to the number of cores
1200*9b92b1d3SBarry Smithavailable.
1201*9b92b1d3SBarry Smith
1202*9b92b1d3SBarry Smith:::{note}
1203*9b92b1d3SBarry SmithIf your code is MPI parallel you can also use these same options to have SuperLU_dist
1204*9b92b1d3SBarry Smithutilize multiple threads per MPI process for the direct solver. Make sure that the
1205*9b92b1d3SBarry Smith`$OMP_NUM_THREADS` you use per MPI process is less than or equal to the number of
1206*9b92b1d3SBarry Smithcores available for each MPI process. For example if your compute nodes have 6 cores
1207*9b92b1d3SBarry Smithand you use 2 MPI processes per node then set `$OMP_NUM_THREADS` to 2 or 3.
1208*9b92b1d3SBarry Smith:::
1209*9b92b1d3SBarry Smith
1210*9b92b1d3SBarry SmithAnother approach that allows using a PETSc parallel solver is to use `PCMPI`.
1211*9b92b1d3SBarry Smith
1212*9b92b1d3SBarry Smith### TS or SNES produces infeasible (out of domain) solutions or states. How can I prevent this?
1213*9b92b1d3SBarry Smith
1214*9b92b1d3SBarry SmithFor `TS` call `TSSetFunctionDomainError()`. For both `TS` and `SNES` call
1215*9b92b1d3SBarry Smith`SNESSetFunctionDomainError()` when the solver passes an infeasible (out of domain)
1216*9b92b1d3SBarry Smithsolution or state to your routines.
1217*9b92b1d3SBarry Smith
1218*9b92b1d3SBarry SmithIf it occurs for DAEs, it is important to insure the algebraic constraints are well
1219*9b92b1d3SBarry Smithsatisfied, which can prevent "breakdown" later. Thus, one can try using a tight tolerance
1220*9b92b1d3SBarry Smithfor `SNES`, using a direct linear solver (`PCType` of `PCLU`) when possible, and reducing the timestep (or
1221*9b92b1d3SBarry Smithtightening `TS` tolerances for adaptive time stepping).
1222*9b92b1d3SBarry Smith
1223*9b92b1d3SBarry Smith### Can PETSc work with Hermitian matrices?
1224*9b92b1d3SBarry Smith
1225*9b92b1d3SBarry SmithPETSc's support of Hermitian matrices is limited. Many operations and solvers work
1226*9b92b1d3SBarry Smithfor symmetric (`MATSBAIJ`) matrices and operations on transpose matrices but there is
1227*9b92b1d3SBarry Smithlittle direct support for Hermitian matrices and Hermitian transpose (complex conjugate
1228*9b92b1d3SBarry Smithtranspose) operations. There is `KSPSolveTranspose()` for solving the transpose of a
1229*9b92b1d3SBarry Smithlinear system but no `KSPSolveHermitian()`.
1230*9b92b1d3SBarry Smith
1231*9b92b1d3SBarry SmithFor creating known Hermitian matrices:
1232*9b92b1d3SBarry Smith
1233*9b92b1d3SBarry Smith- `MatCreateNormalHermitian()`
1234*9b92b1d3SBarry Smith- `MatCreateHermitianTranspose()`
1235*9b92b1d3SBarry Smith
1236*9b92b1d3SBarry SmithFor determining or setting Hermitian status on existing matrices:
1237*9b92b1d3SBarry Smith
1238*9b92b1d3SBarry Smith- `MatIsHermitian()`
1239*9b92b1d3SBarry Smith- `MatIsHermitianKnown()`
1240*9b92b1d3SBarry Smith- `MatIsStructurallySymmetric()`
1241*9b92b1d3SBarry Smith- `MatIsSymmetricKnown()`
1242*9b92b1d3SBarry Smith- `MatIsSymmetric()`
1243*9b92b1d3SBarry Smith- `MatSetOption()` (use with `MAT_SYMMETRIC` or `MAT_HERMITIAN` to assert to PETSc
1244*9b92b1d3SBarry Smith  that either is the case).
1245*9b92b1d3SBarry Smith
1246*9b92b1d3SBarry SmithFor performing matrix operations on known Hermitian matrices (note that regular `Mat`
1247*9b92b1d3SBarry Smithfunctions such as `MatMult()` will of course also work on Hermitian matrices):
1248*9b92b1d3SBarry Smith
1249*9b92b1d3SBarry Smith- `MatMultHermitianTranspose()`
1250*9b92b1d3SBarry Smith- `MatMultHermitianTransposeAdd()` (very limited support)
1251*9b92b1d3SBarry Smith
1252*9b92b1d3SBarry Smith### How can I assemble a bunch of similar matrices?
1253*9b92b1d3SBarry Smith
1254*9b92b1d3SBarry SmithYou can first add the values common to all the matrices, then use `MatStoreValues()` to
1255*9b92b1d3SBarry Smithstash the common values. Each iteration you call `MatRetrieveValues()`, then set the
1256*9b92b1d3SBarry Smithunique values in a matrix and assemble.
1257*9b92b1d3SBarry Smith
1258*9b92b1d3SBarry Smith### Can one resize or change the size of PETSc matrices or vectors?
1259*9b92b1d3SBarry Smith
1260*9b92b1d3SBarry SmithNo, once the vector or matrices sizes have been set and the matrices or vectors are fully
1261*9b92b1d3SBarry Smithusable one cannot change the size of the matrices or vectors or number of processors they
1262*9b92b1d3SBarry Smithlive on. One may create new vectors and copy, for example using `VecScatterCreate()`,
1263*9b92b1d3SBarry Smiththe old values from the previous vector.
1264*9b92b1d3SBarry Smith
1265*9b92b1d3SBarry Smith### How can one compute the nullspace of a sparse matrix with MUMPS?
1266*9b92b1d3SBarry Smith
1267*9b92b1d3SBarry SmithAssuming you have an existing matrix $A$ whose nullspace $V$ you want to find:
1268*9b92b1d3SBarry Smith
1269*9b92b1d3SBarry Smith```
1270*9b92b1d3SBarry SmithMat      F, work, V;
1271*9b92b1d3SBarry SmithPetscInt N, rows;
1272*9b92b1d3SBarry Smith
1273*9b92b1d3SBarry Smith/* Determine factorability */
1274*9b92b1d3SBarry SmithPetscCall(MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F));
1275*9b92b1d3SBarry SmithPetscCall(MatGetLocalSize(A, &rows, NULL));
1276*9b92b1d3SBarry Smith
1277*9b92b1d3SBarry Smith/* Set MUMPS options, see MUMPS documentation for more information */
1278*9b92b1d3SBarry SmithPetscCall(MatMumpsSetIcntl(F, 24, 1));
1279*9b92b1d3SBarry SmithPetscCall(MatMumpsSetIcntl(F, 25, 1));
1280*9b92b1d3SBarry Smith
1281*9b92b1d3SBarry Smith/* Perform factorization */
1282*9b92b1d3SBarry SmithPetscCall(MatLUFactorSymbolic(F, A, NULL, NULL, NULL));
1283*9b92b1d3SBarry SmithPetscCall(MatLUFactorNumeric(F, A, NULL));
1284*9b92b1d3SBarry Smith
1285*9b92b1d3SBarry Smith/* This is the dimension of the null space */
1286*9b92b1d3SBarry SmithPetscCall(MatMumpsGetInfog(F, 28, &N));
1287*9b92b1d3SBarry Smith
1288*9b92b1d3SBarry Smith/* This will contain the null space in the columns */
1289*9b92b1d3SBarry SmithPetscCall(MatCreateDense(comm, rows, N, PETSC_DETERMINE, PETSC_DETERMINE, NULL, &V));
1290*9b92b1d3SBarry Smith
1291*9b92b1d3SBarry SmithPetscCall(MatDuplicate(V, MAT_DO_NOT_COPY_VALUES, &work));
1292*9b92b1d3SBarry SmithPetscCall(MatMatSolve(F, work, V));
1293*9b92b1d3SBarry Smith```
1294*9b92b1d3SBarry Smith
1295*9b92b1d3SBarry Smith______________________________________________________________________
1296*9b92b1d3SBarry Smith
1297*9b92b1d3SBarry Smith## Execution
1298*9b92b1d3SBarry Smith
1299*9b92b1d3SBarry Smith### PETSc executables are SO big and take SO long to link
1300*9b92b1d3SBarry Smith
1301*9b92b1d3SBarry Smith:::{note}
1302*9b92b1d3SBarry SmithSee {ref}`shared libraries section <doc_faq_sharedlibs>` for more information.
1303*9b92b1d3SBarry Smith:::
1304*9b92b1d3SBarry Smith
1305*9b92b1d3SBarry SmithWe find this annoying as well. On most machines PETSc can use shared libraries, so
1306*9b92b1d3SBarry Smithexecutables should be much smaller, run `configure` with the additional option
1307*9b92b1d3SBarry Smith`--with-shared-libraries` (this is the default). Also, if you have room, compiling and
1308*9b92b1d3SBarry Smithlinking PETSc on your machine's `/tmp` disk or similar local disk, rather than over the
1309*9b92b1d3SBarry Smithnetwork will be much faster.
1310*9b92b1d3SBarry Smith
1311*9b92b1d3SBarry Smith### How does PETSc's `-help` option work? Why is it different for different programs?
1312*9b92b1d3SBarry Smith
1313*9b92b1d3SBarry SmithThere are 2 ways in which one interacts with the options database:
1314*9b92b1d3SBarry Smith
1315*9b92b1d3SBarry Smith- `PetscOptionsGetXXX()` where `XXX` is some type or data structure (for example
1316*9b92b1d3SBarry Smith  `PetscOptionsGetBool()` or `PetscOptionsGetScalarArray()`). This is a classic
1317*9b92b1d3SBarry Smith  "getter" function, which queries the command line options for a matching option name,
1318*9b92b1d3SBarry Smith  and returns the specified value.
1319*9b92b1d3SBarry Smith- `PetscOptionsXXX()` where `XXX` is some type or data structure (for example
1320*9b92b1d3SBarry Smith  `PetscOptionsBool()` or `PetscOptionsScalarArray()`). This is a so-called "provider"
1321*9b92b1d3SBarry Smith  function. It first records the option name in an internal list of previously encountered
1322*9b92b1d3SBarry Smith  options before calling `PetscOptionsGetXXX()` to query the status of said option.
1323*9b92b1d3SBarry Smith
1324*9b92b1d3SBarry SmithWhile users generally use the first option, developers will *always* use the second
1325*9b92b1d3SBarry Smith(provider) variant of functions. Thus, as the program runs, it will build up a list of
1326*9b92b1d3SBarry Smithencountered option names which are then printed **in the order of their appearance on the
1327*9b92b1d3SBarry Smithroot rank**. Different programs may take different paths through PETSc source code, so
1328*9b92b1d3SBarry Smiththey will encounter different providers, and therefore have different `-help` output.
1329*9b92b1d3SBarry Smith
1330*9b92b1d3SBarry Smith### PETSc has so many options for my program that it is hard to keep them straight
1331*9b92b1d3SBarry Smith
1332*9b92b1d3SBarry SmithRunning the PETSc program with the option `-help` will print out many of the options. To
1333*9b92b1d3SBarry Smithprint the options that have been specified within a program, employ `-options_left` to
1334*9b92b1d3SBarry Smithprint any options that the user specified but were not actually used by the program and
1335*9b92b1d3SBarry Smithall options used; this is helpful for detecting typo errors. The PETSc website has a search option,
1336*9b92b1d3SBarry Smithin the upper right hand corner, that quickly finds answers to most PETSc questions.
1337*9b92b1d3SBarry Smith
1338*9b92b1d3SBarry Smith### PETSc automatically handles many of the details in parallel PDE solvers. How can I understand what is really happening within my program?
1339*9b92b1d3SBarry Smith
1340*9b92b1d3SBarry SmithYou can use the option `-info` to get more details about the solution process. The
1341*9b92b1d3SBarry Smithoption `-log_view` provides details about the distribution of time spent in the various
1342*9b92b1d3SBarry Smithphases of the solution process. You can run with `-ts_view` or `-snes_view` or
1343*9b92b1d3SBarry Smith`-ksp_view` to see what solver options are being used. Run with `-ts_monitor`,
1344*9b92b1d3SBarry Smith`-snes_monitor`, or `-ksp_monitor` to watch convergence of the
1345*9b92b1d3SBarry Smithmethods. `-snes_converged_reason` and `-ksp_converged_reason` will indicate why and if
1346*9b92b1d3SBarry Smiththe solvers have converged.
1347*9b92b1d3SBarry Smith
1348*9b92b1d3SBarry Smith### Assembling large sparse matrices takes a long time. What can I do to make this process faster? Or MatSetValues() is so slow; what can I do to speed it up?
1349*9b92b1d3SBarry Smith
1350*9b92b1d3SBarry SmithYou probably need to do preallocation, as explained in {any}`sec_matsparse`.
1351*9b92b1d3SBarry SmithSee also the {ref}`performance chapter <ch_performance>` of the users manual.
1352*9b92b1d3SBarry Smith
1353*9b92b1d3SBarry SmithFor GPUs (and even CPUs) you can use `MatSetPreallocationCOO()` and `MatSetValuesCOO()` for more rapid assembly.
1354*9b92b1d3SBarry Smith
1355*9b92b1d3SBarry Smith### How can I generate performance summaries with PETSc?
1356*9b92b1d3SBarry Smith
1357*9b92b1d3SBarry SmithUse this option at runtime:
1358*9b92b1d3SBarry Smith
1359*9b92b1d3SBarry Smith-l
1360*9b92b1d3SBarry Smith
1361*9b92b1d3SBarry Smithog_view
1362*9b92b1d3SBarry Smith
1363*9b92b1d3SBarry SmithOutputs a comprehensive timing, memory consumption, and communications digest
1364*9b92b1d3SBarry Smithfor your program. See the {ref}`profiling chapter <ch_profiling>` of the users
1365*9b92b1d3SBarry Smithmanual for information on interpreting the summary data.
1366*9b92b1d3SBarry Smith
1367*9b92b1d3SBarry Smith### How do I know the amount of time spent on each level of the multigrid solver/preconditioner?
1368*9b92b1d3SBarry Smith
1369*9b92b1d3SBarry SmithRun with `-log_view` and `-pc_mg_log`
1370*9b92b1d3SBarry Smith
1371*9b92b1d3SBarry Smith### Where do I get the input matrices for the examples?
1372*9b92b1d3SBarry Smith
1373*9b92b1d3SBarry SmithSome examples use `$DATAFILESPATH/matrices/medium` and other files. These test matrices
1374*9b92b1d3SBarry Smithin PETSc binary format can be found in the [datafiles repository](https://gitlab.com/petsc/datafiles).
1375*9b92b1d3SBarry Smith
1376*9b92b1d3SBarry Smith### When I dump some matrices and vectors to binary, I seem to be generating some empty files with `.info` extensions. What's the deal with these?
1377*9b92b1d3SBarry Smith
1378*9b92b1d3SBarry SmithPETSc binary viewers put some additional information into `.info` files like matrix
1379*9b92b1d3SBarry Smithblock size. It is harmless but if you *really* don't like it you can use
1380*9b92b1d3SBarry Smith`-viewer_binary_skip_info` or `PetscViewerBinarySkipInfo()`.
1381*9b92b1d3SBarry Smith
1382*9b92b1d3SBarry Smith:::{note}
1383*9b92b1d3SBarry SmithYou need to call `PetscViewerBinarySkipInfo()` before
1384*9b92b1d3SBarry Smith`PetscViewerFileSetName()`. In other words you **cannot** use
1385*9b92b1d3SBarry Smith`PetscViewerBinaryOpen()` directly.
1386*9b92b1d3SBarry Smith:::
1387*9b92b1d3SBarry Smith
1388*9b92b1d3SBarry Smith### Why is my parallel solver slower than my sequential solver, or I have poor speed-up?
1389*9b92b1d3SBarry Smith
1390*9b92b1d3SBarry SmithThis can happen for many reasons:
1391*9b92b1d3SBarry Smith
1392*9b92b1d3SBarry Smith1. Make sure it is truly the time in `KSPSolve()` that is slower (by running the code
1393*9b92b1d3SBarry Smith   with `-log_view`). Often the slower time is in generating the matrix or some other
1394*9b92b1d3SBarry Smith   operation.
1395*9b92b1d3SBarry Smith2. There must be enough work for each process to outweigh the communication time. We
1396*9b92b1d3SBarry Smith   recommend an absolute minimum of about 10,000 unknowns per process, better is 20,000 or
1397*9b92b1d3SBarry Smith   more. This is even more true when using multiple GPUs, where you need to have millions
1398*9b92b1d3SBarry Smith   of unknowns per GPU.
1399*9b92b1d3SBarry Smith3. Make sure the {ref}`communication speed of the parallel computer
1400*9b92b1d3SBarry Smith   <doc_faq_general_parallel>` is good enough for parallel solvers.
1401*9b92b1d3SBarry Smith4. Check the number of solver iterates with the parallel solver against the sequential
1402*9b92b1d3SBarry Smith   solver. Most preconditioners require more iterations when used on more processes, this
1403*9b92b1d3SBarry Smith   is particularly true for block Jacobi (the default parallel preconditioner). You can
1404*9b92b1d3SBarry Smith   try `-pc_type asm` (`PCASM`) its iterations scale a bit better for more
1405*9b92b1d3SBarry Smith   processes. You may also consider multigrid preconditioners like `PCMG` or BoomerAMG
1406*9b92b1d3SBarry Smith   in `PCHYPRE`.
1407*9b92b1d3SBarry Smith
1408*9b92b1d3SBarry Smith(doc_faq_pipelined)=
1409*9b92b1d3SBarry Smith
1410*9b92b1d3SBarry Smith### What steps are necessary to make the pipelined solvers execute efficiently?
1411*9b92b1d3SBarry Smith
1412*9b92b1d3SBarry SmithPipelined solvers like `KSPPGMRES`, `KSPPIPECG`, `KSPPIPECR`, and `KSPGROPPCG` may
1413*9b92b1d3SBarry Smithrequire special MPI configuration to effectively overlap reductions with computation. In
1414*9b92b1d3SBarry Smithgeneral, this requires an MPI-3 implementation, an implementation that supports multiple
1415*9b92b1d3SBarry Smiththreads, and use of a "progress thread". Consult the documentation from your vendor or
1416*9b92b1d3SBarry Smithcomputing facility for more details.
1417*9b92b1d3SBarry Smith
1418*9b92b1d3SBarry Smith- Cray MPI MPT-5.6 MPI-3, by setting `$MPICH_MAX_THREAD_SAFETY` to "multiple"
1419*9b92b1d3SBarry Smith  for threads, plus either `$MPICH_ASYNC_PROGRESS` or
1420*9b92b1d3SBarry Smith  `$MPICH_NEMESIS_ASYNC_PROGRESS`. E.g.
1421*9b92b1d3SBarry Smith  ```console
1422*9b92b1d3SBarry Smith  $ export MPICH_MAX_THREAD_SAFETY=multiple
1423*9b92b1d3SBarry Smith  $ export MPICH_ASYNC_PROGRESS=1
1424*9b92b1d3SBarry Smith  $ export MPICH_NEMESIS_ASYNC_PROGRESS=1
1425*9b92b1d3SBarry Smith  ```
1426*9b92b1d3SBarry Smith
1427*9b92b1d3SBarry Smith- MPICH version 3.0 and later implements the MPI-3 standard and the default
1428*9b92b1d3SBarry Smith  configuration supports use of threads. Use of a progress thread is configured by
1429*9b92b1d3SBarry Smith  setting the environment variable `$MPICH_ASYNC_PROGRESS`. E.g.
1430*9b92b1d3SBarry Smith  ```console
1431*9b92b1d3SBarry Smith  $ export MPICH_ASYNC_PROGRESS=1
1432*9b92b1d3SBarry Smith  ```
1433*9b92b1d3SBarry Smith
1434*9b92b1d3SBarry Smith### When using PETSc in single precision mode (`--with-precision=single` when running `configure`) are the operations done in single or double precision?
1435*9b92b1d3SBarry Smith
1436*9b92b1d3SBarry SmithPETSc does **NOT** do any explicit conversion of single precision to double before
1437*9b92b1d3SBarry Smithperforming computations; it depends on the hardware and compiler for what happens. For
1438*9b92b1d3SBarry Smithexample, the compiler could choose to put the single precision numbers into the usual
1439*9b92b1d3SBarry Smithdouble precision registers and then use the usual double precision floating point unit. Or
1440*9b92b1d3SBarry Smithit could use SSE2 instructions that work directly on the single precision numbers. It is a
1441*9b92b1d3SBarry Smithbit of a mystery what decisions get made sometimes. There may be compiler flags in some
1442*9b92b1d3SBarry Smithcircumstances that can affect this.
1443*9b92b1d3SBarry Smith
1444*9b92b1d3SBarry Smith### Why is Newton's method (SNES) not converging, or converges slowly?
1445*9b92b1d3SBarry Smith
1446*9b92b1d3SBarry SmithNewton's method may not converge for many reasons, here are some of the most common:
1447*9b92b1d3SBarry Smith
1448*9b92b1d3SBarry Smith- The Jacobian is wrong (or correct in sequential but not in parallel).
1449*9b92b1d3SBarry Smith- The linear system is {ref}`not solved <doc_faq_execution_kspconv>` or is not solved
1450*9b92b1d3SBarry Smith  accurately enough.
1451*9b92b1d3SBarry Smith- The Jacobian system has a singularity that the linear solver is not handling.
1452*9b92b1d3SBarry Smith- There is a bug in the function evaluation routine.
1453*9b92b1d3SBarry Smith- The function is not continuous or does not have continuous first derivatives (e.g. phase
1454*9b92b1d3SBarry Smith  change or TVD limiters).
1455*9b92b1d3SBarry Smith- The equations may not have a solution (e.g. limit cycle instead of a steady state) or
1456*9b92b1d3SBarry Smith  there may be a "hill" between the initial guess and the steady state (e.g. reactants
1457*9b92b1d3SBarry Smith  must ignite and burn before reaching a steady state, but the steady-state residual will
1458*9b92b1d3SBarry Smith  be larger during combustion).
1459*9b92b1d3SBarry Smith
1460*9b92b1d3SBarry SmithHere are some of the ways to help debug lack of convergence of Newton:
1461*9b92b1d3SBarry Smith
1462*9b92b1d3SBarry Smith- Run on one processor to see if the problem is only in parallel.
1463*9b92b1d3SBarry Smith
1464*9b92b1d3SBarry Smith- Run with `-info` to get more detailed information on the solution process.
1465*9b92b1d3SBarry Smith
1466*9b92b1d3SBarry Smith- Run with the options
1467*9b92b1d3SBarry Smith
1468*9b92b1d3SBarry Smith  ```text
1469*9b92b1d3SBarry Smith  -snes_monitor -ksp_monitor_true_residual -snes_converged_reason -ksp_converged_reason
1470*9b92b1d3SBarry Smith  ```
1471*9b92b1d3SBarry Smith
1472*9b92b1d3SBarry Smith  - If the linear solve does not converge, check if the Jacobian is correct, then see
1473*9b92b1d3SBarry Smith    {ref}`this question <doc_faq_execution_kspconv>`.
1474*9b92b1d3SBarry Smith  - If the preconditioned residual converges, but the true residual does not, the
1475*9b92b1d3SBarry Smith    preconditioner may be singular.
1476*9b92b1d3SBarry Smith  - If the linear solve converges well, but the line search fails, the Jacobian may be
1477*9b92b1d3SBarry Smith    incorrect.
1478*9b92b1d3SBarry Smith
1479*9b92b1d3SBarry Smith- Run with `-pc_type lu` or `-pc_type svd` to see if the problem is a poor linear
1480*9b92b1d3SBarry Smith  solver.
1481*9b92b1d3SBarry Smith
1482*9b92b1d3SBarry Smith- Run with `-mat_view` or `-mat_view draw` to see if the Jacobian looks reasonable.
1483*9b92b1d3SBarry Smith
1484*9b92b1d3SBarry Smith- Run with `-snes_test_jacobian -snes_test_jacobian_view` to see if the Jacobian you are
1485*9b92b1d3SBarry Smith  using is wrong. Compare the output when you add `-mat_fd_type ds` to see if the result
1486*9b92b1d3SBarry Smith  is sensitive to the choice of differencing parameter.
1487*9b92b1d3SBarry Smith
1488*9b92b1d3SBarry Smith- Run with `-snes_mf_operator -pc_type lu` to see if the Jacobian you are using is
1489*9b92b1d3SBarry Smith  wrong. If the problem is too large for a direct solve, try
1490*9b92b1d3SBarry Smith
1491*9b92b1d3SBarry Smith  ```text
1492*9b92b1d3SBarry Smith  -snes_mf_operator -pc_type ksp -ksp_ksp_rtol 1e-12.
1493*9b92b1d3SBarry Smith  ```
1494*9b92b1d3SBarry Smith
1495*9b92b1d3SBarry Smith  Compare the output when you add `-mat_mffd_type ds` to see if the result is sensitive
1496*9b92b1d3SBarry Smith  to choice of differencing parameter.
1497*9b92b1d3SBarry Smith
1498*9b92b1d3SBarry Smith- Run with `-snes_linesearch_monitor` to see if the line search is failing (this is
1499*9b92b1d3SBarry Smith  usually a sign of a bad Jacobian). Use `-info` in PETSc 3.1 and older versions,
1500*9b92b1d3SBarry Smith  `-snes_ls_monitor` in PETSc 3.2 and `-snes_linesearch_monitor` in PETSc 3.3 and
1501*9b92b1d3SBarry Smith  later.
1502*9b92b1d3SBarry Smith
1503*9b92b1d3SBarry SmithHere are some ways to help the Newton process if everything above checks out:
1504*9b92b1d3SBarry Smith
1505*9b92b1d3SBarry Smith- Run with grid sequencing (`-snes_grid_sequence` if working with a `DM` is all you
1506*9b92b1d3SBarry Smith  need) to generate better initial guess on your finer mesh.
1507*9b92b1d3SBarry Smith
1508*9b92b1d3SBarry Smith- Run with quad precision, i.e.
1509*9b92b1d3SBarry Smith
1510*9b92b1d3SBarry Smith  ```console
1511*9b92b1d3SBarry Smith  $ ./configure --with-precision=__float128 --download-f2cblaslapack
1512*9b92b1d3SBarry Smith  ```
1513*9b92b1d3SBarry Smith
1514*9b92b1d3SBarry Smith  :::{note}
1515*9b92b1d3SBarry Smith  quad precision requires PETSc 3.2 and later and recent versions of the GNU compilers.
1516*9b92b1d3SBarry Smith  :::
1517*9b92b1d3SBarry Smith
1518*9b92b1d3SBarry Smith- Change the units (nondimensionalization), boundary condition scaling, or formulation so
1519*9b92b1d3SBarry Smith  that the Jacobian is better conditioned. See [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and
1520*9b92b1d3SBarry Smith  Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127).
1521*9b92b1d3SBarry Smith
1522*9b92b1d3SBarry Smith- Mollify features in the function that do not have continuous first derivatives (often
1523*9b92b1d3SBarry Smith  occurs when there are "if" statements in the residual evaluation, e.g. phase change or
1524*9b92b1d3SBarry Smith  TVD limiters). Use a variational inequality solver (`SNESVINEWTONRSLS`) if the
1525*9b92b1d3SBarry Smith  discontinuities are of fundamental importance.
1526*9b92b1d3SBarry Smith
1527*9b92b1d3SBarry Smith- Try a trust region method (`-ts_type tr`, may have to adjust parameters).
1528*9b92b1d3SBarry Smith
1529*9b92b1d3SBarry Smith- Run with some continuation parameter from a point where you know the solution, see
1530*9b92b1d3SBarry Smith  `TSPSEUDO` for steady-states.
1531*9b92b1d3SBarry Smith
1532*9b92b1d3SBarry Smith- There are homotopy solver packages like PHCpack that can get you all possible solutions
1533*9b92b1d3SBarry Smith  (and tell you that it has found them all) but those are not scalable and cannot solve
1534*9b92b1d3SBarry Smith  anything but small problems.
1535*9b92b1d3SBarry Smith
1536*9b92b1d3SBarry Smith(doc_faq_execution_kspconv)=
1537*9b92b1d3SBarry Smith
1538*9b92b1d3SBarry Smith### Why is the linear solver (KSP) not converging, or converges slowly?
1539*9b92b1d3SBarry Smith
1540*9b92b1d3SBarry Smith:::{tip}
1541*9b92b1d3SBarry SmithAlways run with `-ksp_converged_reason -ksp_monitor_true_residual` when trying to
1542*9b92b1d3SBarry Smithlearn why a method is not converging!
1543*9b92b1d3SBarry Smith:::
1544*9b92b1d3SBarry Smith
1545*9b92b1d3SBarry SmithCommon reasons for KSP not converging are:
1546*9b92b1d3SBarry Smith
1547*9b92b1d3SBarry Smith- A symmetric method is being used for a non-symmetric problem.
1548*9b92b1d3SBarry Smith
1549*9b92b1d3SBarry Smith- The equations are singular by accident (e.g. forgot to impose boundary
1550*9b92b1d3SBarry Smith  conditions). Check this for a small problem using `-pc_type svd -pc_svd_monitor`.
1551*9b92b1d3SBarry Smith
1552*9b92b1d3SBarry Smith- The equations are intentionally singular (e.g. constant null space), but the Krylov
1553*9b92b1d3SBarry Smith  method was not informed, see `MatSetNullSpace()`. Always inform your local Krylov
1554*9b92b1d3SBarry Smith  subspace solver of any change of singularity. Failure to do so will result in the
1555*9b92b1d3SBarry Smith  immediate revocation of your computing and keyboard operator licenses, as well as
1556*9b92b1d3SBarry Smith  a stern talking-to by the nearest Krylov Subspace Method representative.
1557*9b92b1d3SBarry Smith
1558*9b92b1d3SBarry Smith- The equations are intentionally singular and `MatSetNullSpace()` was used, but the
1559*9b92b1d3SBarry Smith  right-hand side is not consistent. You may have to call `MatNullSpaceRemove()` on the
1560*9b92b1d3SBarry Smith  right-hand side before calling `KSPSolve()`. See `MatSetTransposeNullSpace()`.
1561*9b92b1d3SBarry Smith
1562*9b92b1d3SBarry Smith- The equations are indefinite so that standard preconditioners don't work. Usually you
1563*9b92b1d3SBarry Smith  will know this from the physics, but you can check with
1564*9b92b1d3SBarry Smith
1565*9b92b1d3SBarry Smith  ```text
1566*9b92b1d3SBarry Smith  -ksp_compute_eigenvalues -ksp_gmres_restart 1000 -pc_type none
1567*9b92b1d3SBarry Smith  ```
1568*9b92b1d3SBarry Smith
1569*9b92b1d3SBarry Smith  For simple saddle point problems, try
1570*9b92b1d3SBarry Smith
1571*9b92b1d3SBarry Smith  ```text
1572*9b92b1d3SBarry Smith  -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_detect_saddle_point
1573*9b92b1d3SBarry Smith  ```
1574*9b92b1d3SBarry Smith
1575*9b92b1d3SBarry Smith  For more difficult problems, read the literature to find robust methods and ask
1576*9b92b1d3SBarry Smith  <mailto:petsc-users@mcs.anl.gov> or <mailto:petsc-maint@mcs.anl.gov> if you want advice about how to
1577*9b92b1d3SBarry Smith  implement them.
1578*9b92b1d3SBarry Smith
1579*9b92b1d3SBarry Smith- If the method converges in preconditioned residual, but not in true residual, the
1580*9b92b1d3SBarry Smith  preconditioner is likely singular or nearly so. This is common for saddle point problems
1581*9b92b1d3SBarry Smith  (e.g. incompressible flow) or strongly nonsymmetric operators (e.g. low-Mach hyperbolic
1582*9b92b1d3SBarry Smith  problems with large time steps).
1583*9b92b1d3SBarry Smith
1584*9b92b1d3SBarry Smith- The preconditioner is too weak or is unstable. See if `-pc_type asm -sub_pc_type lu`
1585*9b92b1d3SBarry Smith  improves the convergence rate. If GMRES is losing too much progress in the restart, see
1586*9b92b1d3SBarry Smith  if longer restarts help `-ksp_gmres_restart 300`. If a transpose is available, try
1587*9b92b1d3SBarry Smith  `-ksp_type bcgs` or other methods that do not require a restart.
1588*9b92b1d3SBarry Smith
1589*9b92b1d3SBarry Smith  :::{note}
1590*9b92b1d3SBarry Smith  Unfortunately convergence with these methods is frequently erratic.
1591*9b92b1d3SBarry Smith  :::
1592*9b92b1d3SBarry Smith
1593*9b92b1d3SBarry Smith- The preconditioner is nonlinear (e.g. a nested iterative solve), try `-ksp_type
1594*9b92b1d3SBarry Smith  fgmres` or `-ksp_type gcr`.
1595*9b92b1d3SBarry Smith
1596*9b92b1d3SBarry Smith- You are using geometric multigrid, but some equations (often boundary conditions) are
1597*9b92b1d3SBarry Smith  not scaled compatibly between levels. Try `-pc_mg_galerkin` both to algebraically
1598*9b92b1d3SBarry Smith  construct a correctly scaled coarse operator or make sure that all the equations are
1599*9b92b1d3SBarry Smith  scaled in the same way if you want to use rediscretized coarse levels.
1600*9b92b1d3SBarry Smith
1601*9b92b1d3SBarry Smith- The matrix is very ill-conditioned. Check the {ref}`condition number <doc_faq_usage_condnum>`.
1602*9b92b1d3SBarry Smith
1603*9b92b1d3SBarry Smith  - Try to improve it by choosing the relative scaling of components/boundary conditions.
1604*9b92b1d3SBarry Smith  - Try `-ksp_diagonal_scale -ksp_diagonal_scale_fix`.
1605*9b92b1d3SBarry Smith  - Perhaps change the formulation of the problem to produce more friendly algebraic
1606*9b92b1d3SBarry Smith    equations.
1607*9b92b1d3SBarry Smith
1608*9b92b1d3SBarry Smith- Change the units (nondimensionalization), boundary condition scaling, or formulation so
1609*9b92b1d3SBarry Smith  that the Jacobian is better conditioned. See [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and
1610*9b92b1d3SBarry Smith  Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127).
1611*9b92b1d3SBarry Smith
1612*9b92b1d3SBarry Smith- Classical Gram-Schmidt is becoming unstable, try `-ksp_gmres_modifiedgramschmidt` or
1613*9b92b1d3SBarry Smith  use a method that orthogonalizes differently, e.g. `-ksp_type gcr`.
1614*9b92b1d3SBarry Smith
1615*9b92b1d3SBarry Smith### I get the error message: Actual argument at (1) to assumed-type dummy is of derived type with type-bound or FINAL procedures
1616*9b92b1d3SBarry Smith
1617*9b92b1d3SBarry SmithUse the following code-snippet:
1618*9b92b1d3SBarry Smith
1619*9b92b1d3SBarry Smith```fortran
1620*9b92b1d3SBarry Smithmodule context_module
1621*9b92b1d3SBarry Smith#include petsc/finclude/petsc.h
1622*9b92b1d3SBarry Smithuse petsc
1623*9b92b1d3SBarry Smithimplicit none
1624*9b92b1d3SBarry Smithprivate
1625*9b92b1d3SBarry Smithtype, public ::  context_type
1626*9b92b1d3SBarry Smith  private
1627*9b92b1d3SBarry Smith  PetscInt :: foo
1628*9b92b1d3SBarry Smithcontains
1629*9b92b1d3SBarry Smith  procedure, public :: init => context_init
1630*9b92b1d3SBarry Smithend type context_type
1631*9b92b1d3SBarry Smithcontains
1632*9b92b1d3SBarry Smithsubroutine context_init(self, foo)
1633*9b92b1d3SBarry Smith  class(context_type), intent(in out) :: self
1634*9b92b1d3SBarry Smith  PetscInt, intent(in) :: foo
1635*9b92b1d3SBarry Smith  self%foo = foo
1636*9b92b1d3SBarry Smithend subroutine context_init
1637*9b92b1d3SBarry Smithend module context_module
1638*9b92b1d3SBarry Smith
1639*9b92b1d3SBarry Smith!------------------------------------------------------------------------
1640*9b92b1d3SBarry Smith
1641*9b92b1d3SBarry Smithprogram test_snes
1642*9b92b1d3SBarry Smithuse,intrinsic :: iso_c_binding
1643*9b92b1d3SBarry Smithuse petsc
1644*9b92b1d3SBarry Smithuse context_module
1645*9b92b1d3SBarry Smithimplicit none
1646*9b92b1d3SBarry Smith
1647*9b92b1d3SBarry SmithSNES :: snes
1648*9b92b1d3SBarry Smithtype(context_type),target :: context
1649*9b92b1d3SBarry Smithtype(c_ptr) :: contextOut
1650*9b92b1d3SBarry SmithPetscErrorCode :: ierr
1651*9b92b1d3SBarry Smith
1652*9b92b1d3SBarry Smithcall PetscInitialize(PETSC_NULL_CHARACTER, ierr)
1653*9b92b1d3SBarry Smithcall SNESCreate(PETSC_COMM_WORLD, snes, ierr)
1654*9b92b1d3SBarry Smithcall context%init(1)
1655*9b92b1d3SBarry Smith
1656*9b92b1d3SBarry SmithcontextOut = c_loc(context) ! contextOut is a C pointer on context
1657*9b92b1d3SBarry Smith
1658*9b92b1d3SBarry Smithcall SNESSetConvergenceTest(snes, convergence, contextOut, PETSC_NULL_FUNCTION, ierr)
1659*9b92b1d3SBarry Smith
1660*9b92b1d3SBarry Smithcall SNESDestroy(snes, ierr)
1661*9b92b1d3SBarry Smithcall PetscFinalize(ierr)
1662*9b92b1d3SBarry Smith
1663*9b92b1d3SBarry Smithcontains
1664*9b92b1d3SBarry Smith
1665*9b92b1d3SBarry Smithsubroutine convergence(snes, num_iterations, xnorm, pnorm,fnorm, reason, contextIn, ierr)
1666*9b92b1d3SBarry SmithSNES, intent(in) :: snes
1667*9b92b1d3SBarry Smith
1668*9b92b1d3SBarry SmithPetscInt, intent(in) :: num_iterations
1669*9b92b1d3SBarry SmithPetscReal, intent(in) :: xnorm, pnorm, fnorm
1670*9b92b1d3SBarry SmithSNESConvergedReason, intent(out) :: reason
1671*9b92b1d3SBarry Smithtype(c_ptr), intent(in out) :: contextIn
1672*9b92b1d3SBarry Smithtype(context_type), pointer :: context
1673*9b92b1d3SBarry SmithPetscErrorCode, intent(out) :: ierr
1674*9b92b1d3SBarry Smith
1675*9b92b1d3SBarry Smithcall c_f_pointer(contextIn,context)  ! convert the C pointer to a Fortran pointer to use context as in the main program
1676*9b92b1d3SBarry Smithreason = 0
1677*9b92b1d3SBarry Smithierr = 0
1678*9b92b1d3SBarry Smithend subroutine convergence
1679*9b92b1d3SBarry Smithend program test_snes
1680*9b92b1d3SBarry Smith```
1681*9b92b1d3SBarry Smith
1682*9b92b1d3SBarry Smith### In C++ I get a crash on `VecDestroy()` (or some other PETSc object) at the end of the program
1683*9b92b1d3SBarry Smith
1684*9b92b1d3SBarry SmithThis can happen when the destructor for a C++ class is automatically called at the end of
1685*9b92b1d3SBarry Smiththe program after `PetscFinalize()`. Use the following code-snippet:
1686*9b92b1d3SBarry Smith
1687*9b92b1d3SBarry Smith```
1688*9b92b1d3SBarry Smithmain()
1689*9b92b1d3SBarry Smith{
1690*9b92b1d3SBarry Smith  PetscCall(PetscInitialize());
1691*9b92b1d3SBarry Smith  {
1692*9b92b1d3SBarry Smith    your variables
1693*9b92b1d3SBarry Smith    your code
1694*9b92b1d3SBarry Smith
1695*9b92b1d3SBarry Smith    ...   /* all your destructors are called here automatically by C++ so they work correctly */
1696*9b92b1d3SBarry Smith  }
1697*9b92b1d3SBarry Smith  PetscCall(PetscFinalize());
1698*9b92b1d3SBarry Smith  return 0
1699*9b92b1d3SBarry Smith}
1700*9b92b1d3SBarry Smith```
1701*9b92b1d3SBarry Smith
1702*9b92b1d3SBarry Smith______________________________________________________________________
1703*9b92b1d3SBarry Smith
1704*9b92b1d3SBarry Smith## Debugging
1705*9b92b1d3SBarry Smith
1706*9b92b1d3SBarry Smith### What does the message hwloc/linux: Ignoring PCI device with non-16bit domain mean?
1707*9b92b1d3SBarry Smith
1708*9b92b1d3SBarry SmithThis is printed by the hwloc library that is used by some MPI implementations. It can be ignored.
1709*9b92b1d3SBarry SmithTo prevent the message from always being printed set the environmental variable `HWLOC_HIDE_ERRORS` to 2.
1710*9b92b1d3SBarry SmithFor example
1711*9b92b1d3SBarry Smith
1712*9b92b1d3SBarry Smith```
1713*9b92b1d3SBarry Smithexport HWLOC_HIDE_ERRORS=2
1714*9b92b1d3SBarry Smith```
1715*9b92b1d3SBarry Smith
1716*9b92b1d3SBarry Smithwhich can be added to your login profile file such as `~/.bashrc`.
1717*9b92b1d3SBarry Smith
1718*9b92b1d3SBarry Smith### How do I turn off PETSc signal handling so I can use the `-C` Option On `xlf`?
1719*9b92b1d3SBarry Smith
1720*9b92b1d3SBarry SmithImmediately after calling `PetscInitialize()` call `PetscPopSignalHandler()`.
1721*9b92b1d3SBarry Smith
1722*9b92b1d3SBarry SmithSome Fortran compilers including the IBM xlf, xlF etc compilers have a compile option
1723*9b92b1d3SBarry Smith(`-C` for IBM's) that causes all array access in Fortran to be checked that they are
1724*9b92b1d3SBarry Smithin-bounds. This is a great feature but does require that the array dimensions be set
1725*9b92b1d3SBarry Smithexplicitly, not with a \*.
1726*9b92b1d3SBarry Smith
1727*9b92b1d3SBarry Smith### How do I debug if `-start_in_debugger` does not work on my machine?
1728*9b92b1d3SBarry Smith
1729*9b92b1d3SBarry SmithThe script <https://github.com/Azrael3000/tmpi> can be used to run multiple MPI
1730*9b92b1d3SBarry Smithranks in the debugger using tmux.
1731*9b92b1d3SBarry Smith
1732*9b92b1d3SBarry SmithOn newer macOS machines - one has to be in admin group to be able to use debugger.
1733*9b92b1d3SBarry Smith
1734*9b92b1d3SBarry SmithOn newer Ubuntu linux machines - one has to disable `ptrace_scope` with
1735*9b92b1d3SBarry Smith
1736*9b92b1d3SBarry Smith```console
1737*9b92b1d3SBarry Smith$ sudo echo 0 > /proc/sys/kernel/yama/ptrace_scope
1738*9b92b1d3SBarry Smith```
1739*9b92b1d3SBarry Smith
1740*9b92b1d3SBarry Smithto get start in debugger working.
1741*9b92b1d3SBarry Smith
1742*9b92b1d3SBarry SmithIf `-start_in_debugger` does not work on your OS, for a uniprocessor job, just
1743*9b92b1d3SBarry Smithtry the debugger directly, for example: `gdb ex1`. You can also use [TotalView](https://totalview.io/products/totalview) which is a good graphical parallel debugger.
1744*9b92b1d3SBarry Smith
1745*9b92b1d3SBarry Smith### How do I see where my code is hanging?
1746*9b92b1d3SBarry Smith
1747*9b92b1d3SBarry SmithYou can use the `-start_in_debugger` option to start all processes in the debugger (each
1748*9b92b1d3SBarry Smithwill come up in its own xterm) or run in [TotalView](https://totalview.io/products/totalview). Then use `cont` (for continue) in each
1749*9b92b1d3SBarry Smithxterm. Once you are sure that the program is hanging, hit control-c in each xterm and then
1750*9b92b1d3SBarry Smithuse 'where' to print a stack trace for each process.
1751*9b92b1d3SBarry Smith
1752*9b92b1d3SBarry Smith### How can I inspect PETSc vector and matrix values when in the debugger?
1753*9b92b1d3SBarry Smith
1754*9b92b1d3SBarry SmithI will illustrate this with `gdb`, but it should be similar on other debuggers. You can
1755*9b92b1d3SBarry Smithlook at local `Vec` values directly by obtaining the array. For a `Vec` v, we can
1756*9b92b1d3SBarry Smithprint all local values using:
1757*9b92b1d3SBarry Smith
1758*9b92b1d3SBarry Smith```console
1759*9b92b1d3SBarry Smith(gdb) p ((Vec_Seq*) v->data)->array[0]@v->map.n
1760*9b92b1d3SBarry Smith```
1761*9b92b1d3SBarry Smith
1762*9b92b1d3SBarry SmithHowever, this becomes much more complicated for a matrix. Therefore, it is advisable to use the default viewer to look at the object. For a `Vec` v and a `Mat` m, this would be:
1763*9b92b1d3SBarry Smith
1764*9b92b1d3SBarry Smith```console
1765*9b92b1d3SBarry Smith(gdb) call VecView(v, 0)
1766*9b92b1d3SBarry Smith(gdb) call MatView(m, 0)
1767*9b92b1d3SBarry Smith```
1768*9b92b1d3SBarry Smith
1769*9b92b1d3SBarry Smithor with a communicator other than `MPI_COMM_WORLD`:
1770*9b92b1d3SBarry Smith
1771*9b92b1d3SBarry Smith```console
1772*9b92b1d3SBarry Smith(gdb) call MatView(m, PETSC_VIEWER_STDOUT_(m->comm))
1773*9b92b1d3SBarry Smith```
1774*9b92b1d3SBarry Smith
1775*9b92b1d3SBarry SmithTotalview 8.8.0+ has a new feature that allows libraries to provide their own code to
1776*9b92b1d3SBarry Smithdisplay objects in the debugger. Thus in theory each PETSc object, `Vec`, `Mat` etc
1777*9b92b1d3SBarry Smithcould have custom code to print values in the object. We have only done this for the most
1778*9b92b1d3SBarry Smithelementary display of `Vec` and `Mat`. See the routine `TV_display_type()` in
1779*9b92b1d3SBarry Smith`src/vec/vec/interface/vector.c` for an example of how these may be written. Contact us
1780*9b92b1d3SBarry Smithif you would like to add more.
1781*9b92b1d3SBarry Smith
1782*9b92b1d3SBarry Smith### How can I find the cause of floating point exceptions like not-a-number (NaN) or infinity?
1783*9b92b1d3SBarry Smith
1784*9b92b1d3SBarry SmithThe best way to locate floating point exceptions is to use a debugger. On supported
1785*9b92b1d3SBarry Smitharchitectures (including Linux and glibc-based systems), just run in a debugger and pass
1786*9b92b1d3SBarry Smith`-fp_trap` to the PETSc application. This will activate signaling exceptions and the
1787*9b92b1d3SBarry Smithdebugger will break on the line that first divides by zero or otherwise generates an
1788*9b92b1d3SBarry Smithexceptions.
1789*9b92b1d3SBarry Smith
1790*9b92b1d3SBarry SmithWithout a debugger, running with `-fp_trap` in debug mode will only identify the
1791*9b92b1d3SBarry Smithfunction in which the error occurred, but not the line or the type of exception. If
1792*9b92b1d3SBarry Smith`-fp_trap` is not supported on your architecture, consult the documentation for your
1793*9b92b1d3SBarry Smithdebugger since there is likely a way to have it catch exceptions.
1794*9b92b1d3SBarry Smith
1795*9b92b1d3SBarry Smith(error_libimf)=
1796*9b92b1d3SBarry Smith
1797*9b92b1d3SBarry Smith### Error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory
1798*9b92b1d3SBarry Smith
1799*9b92b1d3SBarry SmithThe Intel compilers use shared libraries (like libimf) that cannot be found, by default, at run
1800*9b92b1d3SBarry Smithtime. When using the Intel compilers (and running the resulting code) you must make sure
1801*9b92b1d3SBarry Smiththat the proper Intel initialization scripts are run. This is usually done by adding some
1802*9b92b1d3SBarry Smithcode into your `.cshrc`, `.bashrc`, `.profile` etc file. Sometimes on batch file
1803*9b92b1d3SBarry Smithsystems that do now access your initialization files (like .cshrc) you must include the
1804*9b92b1d3SBarry Smithinitialization calls in your batch file submission.
1805*9b92b1d3SBarry Smith
1806*9b92b1d3SBarry SmithFor example, on my Mac using `csh` I have the following in my `.cshrc` file:
1807*9b92b1d3SBarry Smith
1808*9b92b1d3SBarry Smith```csh
1809*9b92b1d3SBarry Smithsource /opt/intel/cc/10.1.012/bin/iccvars.csh
1810*9b92b1d3SBarry Smithsource /opt/intel/fc/10.1.012/bin/ifortvars.csh
1811*9b92b1d3SBarry Smithsource /opt/intel/idb/10.1.012/bin/idbvars.csh
1812*9b92b1d3SBarry Smith```
1813*9b92b1d3SBarry Smith
1814*9b92b1d3SBarry SmithAnd in my `.profile` I have
1815*9b92b1d3SBarry Smith
1816*9b92b1d3SBarry Smith```csh
1817*9b92b1d3SBarry Smithsource /opt/intel/cc/10.1.012/bin/iccvars.sh
1818*9b92b1d3SBarry Smithsource /opt/intel/fc/10.1.012/bin/ifortvars.sh
1819*9b92b1d3SBarry Smithsource /opt/intel/idb/10.1.012/bin/idbvars.sh
1820*9b92b1d3SBarry Smith```
1821*9b92b1d3SBarry Smith
1822*9b92b1d3SBarry Smith(object_type_not_set)=
1823*9b92b1d3SBarry Smith
1824*9b92b1d3SBarry Smith### What does "Object Type Not Set: Argument # N" Mean?
1825*9b92b1d3SBarry Smith
1826*9b92b1d3SBarry SmithMany operations on PETSc objects require that the specific type of the object be set before the operations is performed. You must call `XXXSetType()` or `XXXSetFromOptions()` before you make the offending call. For example
1827*9b92b1d3SBarry Smith
1828*9b92b1d3SBarry Smith```
1829*9b92b1d3SBarry SmithMat A;
1830*9b92b1d3SBarry Smith
1831*9b92b1d3SBarry SmithPetscCall(MatCreate(PETSC_COMM_WORLD, &A));
1832*9b92b1d3SBarry SmithPetscCall(MatSetValues(A,....));
1833*9b92b1d3SBarry Smith```
1834*9b92b1d3SBarry Smith
1835*9b92b1d3SBarry Smithwill not work. You must add `MatSetType()` or `MatSetFromOptions()` before the call to `MatSetValues()`. I.e.
1836*9b92b1d3SBarry Smith
1837*9b92b1d3SBarry Smith```
1838*9b92b1d3SBarry SmithMat A;
1839*9b92b1d3SBarry Smith
1840*9b92b1d3SBarry SmithPetscCall(MatCreate(PETSC_COMM_WORLD, &A));
1841*9b92b1d3SBarry Smith
1842*9b92b1d3SBarry SmithPetscCall(MatSetType(A, MATAIJ));
1843*9b92b1d3SBarry Smith/* Will override MatSetType() */
1844*9b92b1d3SBarry SmithPetscCall(MatSetFromOptions());
1845*9b92b1d3SBarry Smith
1846*9b92b1d3SBarry SmithPetscCall(MatSetValues(A,....));
1847*9b92b1d3SBarry Smith```
1848*9b92b1d3SBarry Smith
1849*9b92b1d3SBarry Smith(split_ownership)=
1850*9b92b1d3SBarry Smith
1851*9b92b1d3SBarry Smith### What does error detected in `PetscSplitOwnership()` about "sum of local lengths ...": mean?
1852*9b92b1d3SBarry Smith
1853*9b92b1d3SBarry SmithIn a previous call to `VecSetSizes()`, `MatSetSizes()`, `VecCreateXXX()` or
1854*9b92b1d3SBarry Smith`MatCreateXXX()` you passed in local and global sizes that do not make sense for the
1855*9b92b1d3SBarry Smithcorrect number of processors. For example if you pass in a local size of 2 and a global
1856*9b92b1d3SBarry Smithsize of 100 and run on two processors, this cannot work since the sum of the local sizes
1857*9b92b1d3SBarry Smithis 4, not 100.
1858*9b92b1d3SBarry Smith
1859*9b92b1d3SBarry Smith(valgrind)=
1860*9b92b1d3SBarry Smith
1861*9b92b1d3SBarry Smith### What does "corrupt argument" or "caught signal" Or "SEGV" Or "segmentation violation" Or "bus error" mean? Can I use Valgrind or CUDA-Memcheck to debug memory corruption issues?
1862*9b92b1d3SBarry Smith
1863*9b92b1d3SBarry SmithSometimes it can mean an argument to a function is invalid. In Fortran this may be caused
1864*9b92b1d3SBarry Smithby forgetting to list an argument in the call, especially the final `PetscErrorCode`.
1865*9b92b1d3SBarry Smith
1866*9b92b1d3SBarry SmithOtherwise it is usually caused by memory corruption; that is somewhere the code is writing
1867*9b92b1d3SBarry Smithout of array bounds. To track this down rerun the debug version of the code with the
1868*9b92b1d3SBarry Smithoption `-malloc_debug`. Occasionally the code may crash only with the optimized version,
1869*9b92b1d3SBarry Smithin that case run the optimized version with `-malloc_debug`. If you determine the
1870*9b92b1d3SBarry Smithproblem is from memory corruption you can put the macro CHKMEMQ in the code near the crash
1871*9b92b1d3SBarry Smithto determine exactly what line is causing the problem.
1872*9b92b1d3SBarry Smith
1873*9b92b1d3SBarry SmithIf `-malloc_debug` does not help: on NVIDIA CUDA systems you can use <https://docs.nvidia.com/cuda/cuda-memcheck/index.html>
1874*9b92b1d3SBarry Smith
1875*9b92b1d3SBarry SmithIf `-malloc_debug` does not help: on GNU/Linux (not macOS machines) - you can
1876*9b92b1d3SBarry Smithuse [valgrind](http://valgrind.org). Follow the below instructions:
1877*9b92b1d3SBarry Smith
1878*9b92b1d3SBarry Smith1. `configure` PETSc with `--download-mpich --with-debugging` (you can use other MPI implementations but most produce spurious Valgrind messages)
1879*9b92b1d3SBarry Smith
1880*9b92b1d3SBarry Smith2. Compile your application code with this build of PETSc.
1881*9b92b1d3SBarry Smith
1882*9b92b1d3SBarry Smith3. Run with Valgrind.
1883*9b92b1d3SBarry Smith
1884*9b92b1d3SBarry Smith   ```console
1885*9b92b1d3SBarry Smith   $ $PETSC_DIR/lib/petsc/bin/petscmpiexec -valgrind -n NPROC PETSCPROGRAMNAME PROGRAMOPTIONS
1886*9b92b1d3SBarry Smith   ```
1887*9b92b1d3SBarry Smith
1888*9b92b1d3SBarry Smith   or
1889*9b92b1d3SBarry Smith
1890*9b92b1d3SBarry Smith   ```console
1891*9b92b1d3SBarry Smith   $ mpiexec -n NPROC valgrind --tool=memcheck -q --num-callers=20 \
1892*9b92b1d3SBarry Smith   --suppressions=$PETSC_DIR/share/petsc/suppressions/valgrind \
1893*9b92b1d3SBarry Smith   --log-file=valgrind.log.%p PETSCPROGRAMNAME -malloc off PROGRAMOPTIONS
1894*9b92b1d3SBarry Smith   ```
1895*9b92b1d3SBarry Smith
1896*9b92b1d3SBarry Smith:::{note}
1897*9b92b1d3SBarry Smith- option `--with-debugging` enables valgrind to give stack trace with additional
1898*9b92b1d3SBarry Smith  source-file:line-number info.
1899*9b92b1d3SBarry Smith- option `--download-mpich` is valgrind clean, other MPI builds are not valgrind clean.
1900*9b92b1d3SBarry Smith- when `--download-mpich` is used - mpiexec will be in `$PETSC_ARCH/bin`
1901*9b92b1d3SBarry Smith- `--log-file=valgrind.log.%p` option tells valgrind to store the output from each
1902*9b92b1d3SBarry Smith  process in a different file \[as %p i.e PID, is different for each MPI process.
1903*9b92b1d3SBarry Smith- `memcheck` will not find certain array access that violate static array
1904*9b92b1d3SBarry Smith  declarations so if memcheck runs clean you can try the `--tool=exp-ptrcheck`
1905*9b92b1d3SBarry Smith  instead.
1906*9b92b1d3SBarry Smith:::
1907*9b92b1d3SBarry Smith
1908*9b92b1d3SBarry SmithYou might also consider using <http://drmemory.org> which has support for GNU/Linux, Apple
1909*9b92b1d3SBarry SmithMac OS and Microsoft Windows machines. (Note we haven't tried this ourselves).
1910*9b92b1d3SBarry Smith
1911*9b92b1d3SBarry Smith(zeropivot)=
1912*9b92b1d3SBarry Smith
1913*9b92b1d3SBarry Smith### What does "detected zero pivot in LU factorization" mean?
1914*9b92b1d3SBarry Smith
1915*9b92b1d3SBarry SmithA zero pivot in LU, ILU, Cholesky, or ICC sparse factorization does not always mean that
1916*9b92b1d3SBarry Smiththe matrix is singular. You can use
1917*9b92b1d3SBarry Smith
1918*9b92b1d3SBarry Smith```text
1919*9b92b1d3SBarry Smith-pc_factor_shift_type nonzero -pc_factor_shift_amount [amount]
1920*9b92b1d3SBarry Smith```
1921*9b92b1d3SBarry Smith
1922*9b92b1d3SBarry Smithor
1923*9b92b1d3SBarry Smith
1924*9b92b1d3SBarry Smith```text
1925*9b92b1d3SBarry Smith-pc_factor_shift_type positive_definite -[level]_pc_factor_shift_type nonzero
1926*9b92b1d3SBarry Smith -pc_factor_shift_amount [amount]
1927*9b92b1d3SBarry Smith```
1928*9b92b1d3SBarry Smith
1929*9b92b1d3SBarry Smithor
1930*9b92b1d3SBarry Smith
1931*9b92b1d3SBarry Smith```text
1932*9b92b1d3SBarry Smith-[level]_pc_factor_shift_type positive_definite
1933*9b92b1d3SBarry Smith```
1934*9b92b1d3SBarry Smith
1935*9b92b1d3SBarry Smithto prevent the zero pivot. [level] is "sub" when lu, ilu, cholesky, or icc are employed in
1936*9b92b1d3SBarry Smitheach individual block of the bjacobi or ASM preconditioner. [level] is "mg_levels" or
1937*9b92b1d3SBarry Smith"mg_coarse" when lu, ilu, cholesky, or icc are used inside multigrid smoothers or to the
1938*9b92b1d3SBarry Smithcoarse grid solver. See `PCFactorSetShiftType()`, `PCFactorSetShiftAmount()`.
1939*9b92b1d3SBarry Smith
1940*9b92b1d3SBarry SmithThis error can also happen if your matrix is singular, see `MatSetNullSpace()` for how
1941*9b92b1d3SBarry Smithto handle this. If this error occurs in the zeroth row of the matrix, it is likely you
1942*9b92b1d3SBarry Smithhave an error in the code that generates the matrix.
1943*9b92b1d3SBarry Smith
1944*9b92b1d3SBarry Smith### You create draw windows or `PETSCVIEWERDRAW` windows or use options `-ksp_monitor draw::draw_lg` or `-snes_monitor draw::draw_lg` and the program seems to run OK but windows never open
1945*9b92b1d3SBarry Smith
1946*9b92b1d3SBarry SmithThe libraries were compiled without support for X windows. Make sure that `configure`
1947*9b92b1d3SBarry Smithwas run with the option `--with-x`.
1948*9b92b1d3SBarry Smith
1949*9b92b1d3SBarry Smith### The program seems to use more and more memory as it runs, even though you don't think you are allocating more memory
1950*9b92b1d3SBarry Smith
1951*9b92b1d3SBarry SmithSome of the following may be occurring:
1952*9b92b1d3SBarry Smith
1953*9b92b1d3SBarry Smith- You are creating new PETSc objects but never freeing them.
1954*9b92b1d3SBarry Smith- There is a memory leak in PETSc or your code.
1955*9b92b1d3SBarry Smith- Something much more subtle: (if you are using Fortran). When you declare a large array
1956*9b92b1d3SBarry Smith  in Fortran, the operating system does not allocate all the memory pages for that array
1957*9b92b1d3SBarry Smith  until you start using the different locations in the array. Thus, in a code, if at each
1958*9b92b1d3SBarry Smith  step you start using later values in the array your virtual memory usage will "continue"
1959*9b92b1d3SBarry Smith  to increase as measured by `ps` or `top`.
1960*9b92b1d3SBarry Smith- You are running with the `-log`, `-log_mpe`, or `-log_all` option. With these
1961*9b92b1d3SBarry Smith  options, a great deal of logging information is stored in memory until the conclusion of
1962*9b92b1d3SBarry Smith  the run.
1963*9b92b1d3SBarry Smith- You are linking with the MPI profiling libraries; these cause logging of all MPI
1964*9b92b1d3SBarry Smith  activities. Another symptom is at the conclusion of the run it may print some message
1965*9b92b1d3SBarry Smith  about writing log files.
1966*9b92b1d3SBarry Smith
1967*9b92b1d3SBarry SmithThe following may help:
1968*9b92b1d3SBarry Smith
1969*9b92b1d3SBarry Smith- Run with the `-malloc_debug` option and `-malloc_view`. Or use `PetscMallocDump()`
1970*9b92b1d3SBarry Smith  and `PetscMallocView()` sprinkled about your code to track memory that is allocated
1971*9b92b1d3SBarry Smith  and not later freed. Use the commands `PetscMallocGetCurrentUsage()` and
1972*9b92b1d3SBarry Smith  `PetscMemoryGetCurrentUsage()` to monitor memory allocated and
1973*9b92b1d3SBarry Smith  `PetscMallocGetMaximumUsage()` and `PetscMemoryGetMaximumUsage()` for total memory
1974*9b92b1d3SBarry Smith  used as the code progresses.
1975*9b92b1d3SBarry Smith- This is just the way Unix works and is harmless.
1976*9b92b1d3SBarry Smith- Do not use the `-log`, `-log_mpe`, or `-log_all` option, or use
1977*9b92b1d3SBarry Smith  `PetscLogEventDeactivate()` or `PetscLogEventDeactivateClass()` to turn off logging of
1978*9b92b1d3SBarry Smith  specific events.
1979*9b92b1d3SBarry Smith- Make sure you do not link with the MPI profiling libraries.
1980*9b92b1d3SBarry Smith
1981*9b92b1d3SBarry Smith### When calling `MatPartitioningApply()` you get a message "Error! Key 16615 Not Found"
1982*9b92b1d3SBarry Smith
1983*9b92b1d3SBarry SmithThe graph of the matrix you are using is not symmetric. You must use symmetric matrices
1984*9b92b1d3SBarry Smithfor partitioning.
1985*9b92b1d3SBarry Smith
1986*9b92b1d3SBarry Smith### With GMRES at restart the second residual norm printed does not match the first
1987*9b92b1d3SBarry Smith
1988*9b92b1d3SBarry SmithI.e.
1989*9b92b1d3SBarry Smith
1990*9b92b1d3SBarry Smith```text
1991*9b92b1d3SBarry Smith26 KSP Residual norm 3.421544615851e-04
1992*9b92b1d3SBarry Smith27 KSP Residual norm 2.973675659493e-04
1993*9b92b1d3SBarry Smith28 KSP Residual norm 2.588642948270e-04
1994*9b92b1d3SBarry Smith29 KSP Residual norm 2.268190747349e-04
1995*9b92b1d3SBarry Smith30 KSP Residual norm 1.977245964368e-04
1996*9b92b1d3SBarry Smith30 KSP Residual norm 1.994426291979e-04 <----- At restart the residual norm is printed a second time
1997*9b92b1d3SBarry Smith```
1998*9b92b1d3SBarry Smith
1999*9b92b1d3SBarry SmithThis is actually not surprising! GMRES computes the norm of the residual at each iteration
2000*9b92b1d3SBarry Smithvia a recurrence relation between the norms of the residuals at the previous iterations
2001*9b92b1d3SBarry Smithand quantities computed at the current iteration. It does not compute it via directly
2002*9b92b1d3SBarry Smith$|| b - A x^{n} ||$.
2003*9b92b1d3SBarry Smith
2004*9b92b1d3SBarry SmithSometimes, especially with an ill-conditioned matrix, or computation of the matrix-vector
2005*9b92b1d3SBarry Smithproduct via differencing, the residual norms computed by GMRES start to "drift" from the
2006*9b92b1d3SBarry Smithcorrect values. At the restart, we compute the residual norm directly, hence the "strange
2007*9b92b1d3SBarry Smithstuff," the difference printed. The drifting, if it remains small, is harmless (doesn't
2008*9b92b1d3SBarry Smithaffect the accuracy of the solution that GMRES computes).
2009*9b92b1d3SBarry Smith
2010*9b92b1d3SBarry SmithIf you use a more powerful preconditioner the drift will often be smaller and less
2011*9b92b1d3SBarry Smithnoticeable. Of if you are running matrix-free you may need to tune the matrix-free
2012*9b92b1d3SBarry Smithparameters.
2013*9b92b1d3SBarry Smith
2014*9b92b1d3SBarry Smith### Why do some Krylov methods seem to print two residual norms per iteration?
2015*9b92b1d3SBarry Smith
2016*9b92b1d3SBarry SmithI.e.
2017*9b92b1d3SBarry Smith
2018*9b92b1d3SBarry Smith```text
2019*9b92b1d3SBarry Smith1198 KSP Residual norm 1.366052062216e-04
2020*9b92b1d3SBarry Smith1198 KSP Residual norm 1.931875025549e-04
2021*9b92b1d3SBarry Smith1199 KSP Residual norm 1.366026406067e-04
2022*9b92b1d3SBarry Smith1199 KSP Residual norm 1.931819426344e-04
2023*9b92b1d3SBarry Smith```
2024*9b92b1d3SBarry Smith
2025*9b92b1d3SBarry SmithSome Krylov methods, for example `KSPTFQMR`, actually have a "sub-iteration" of size 2
2026*9b92b1d3SBarry Smithinside the loop. Each of the two substeps has its own matrix vector product and
2027*9b92b1d3SBarry Smithapplication of the preconditioner and updates the residual approximations. This is why you
2028*9b92b1d3SBarry Smithget this "funny" output where it looks like there are two residual norms per
2029*9b92b1d3SBarry Smithiteration. You can also think of it as twice as many iterations.
2030*9b92b1d3SBarry Smith
2031*9b92b1d3SBarry Smith### Unable to locate PETSc dynamic library `libpetsc`
2032*9b92b1d3SBarry Smith
2033*9b92b1d3SBarry SmithWhen using DYNAMIC libraries - the libraries cannot be moved after they are
2034*9b92b1d3SBarry Smithinstalled. This could also happen on clusters - where the paths are different on the (run)
2035*9b92b1d3SBarry Smithnodes - than on the (compile) front-end. **Do not use dynamic libraries & shared
2036*9b92b1d3SBarry Smithlibraries**. Run `configure` with
2037*9b92b1d3SBarry Smith`--with-shared-libraries=0 --with-dynamic-loading=0`.
2038*9b92b1d3SBarry Smith
2039*9b92b1d3SBarry Smith:::{important}
2040*9b92b1d3SBarry SmithThis option has been removed in petsc v3.5
2041*9b92b1d3SBarry Smith:::
2042*9b92b1d3SBarry Smith
2043*9b92b1d3SBarry Smith### How do I determine what update to PETSc broke my code?
2044*9b92b1d3SBarry Smith
2045*9b92b1d3SBarry SmithIf at some point (in PETSc code history) you had a working code - but the latest PETSc
2046*9b92b1d3SBarry Smithcode broke it, its possible to determine the PETSc code change that might have caused this
2047*9b92b1d3SBarry Smithbehavior. This is achieved by:
2048*9b92b1d3SBarry Smith
2049*9b92b1d3SBarry Smith- Using Git to access PETSc sources
2050*9b92b1d3SBarry Smith- Knowing the Git commit for the known working version of PETSc
2051*9b92b1d3SBarry Smith- Knowing the Git commit for the known broken version of PETSc
2052*9b92b1d3SBarry Smith- Using the [bisect](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html)
2053*9b92b1d3SBarry Smith  functionality of Git
2054*9b92b1d3SBarry Smith
2055*9b92b1d3SBarry SmithGit bisect can be done as follows:
2056*9b92b1d3SBarry Smith
2057*9b92b1d3SBarry Smith1. Get PETSc development (main branch in git) sources
2058*9b92b1d3SBarry Smith
2059*9b92b1d3SBarry Smith   ```console
2060*9b92b1d3SBarry Smith   $ git clone https://gitlab.com/petsc/petsc.git
2061*9b92b1d3SBarry Smith   ```
2062*9b92b1d3SBarry Smith
2063*9b92b1d3SBarry Smith2. Find the good and bad markers to start the bisection process. This can be done either
2064*9b92b1d3SBarry Smith   by checking `git log` or `gitk` or <https://gitlab.com/petsc/petsc> or the web
2065*9b92b1d3SBarry Smith   history of petsc-release clones. Lets say the known bad commit is 21af4baa815c and
2066*9b92b1d3SBarry Smith   known good commit is 5ae5ab319844.
2067*9b92b1d3SBarry Smith
2068*9b92b1d3SBarry Smith3. Start the bisection process with these known revisions. Build PETSc, and test your code
2069*9b92b1d3SBarry Smith   to confirm known good/bad behavior:
2070*9b92b1d3SBarry Smith
2071*9b92b1d3SBarry Smith   ```console
2072*9b92b1d3SBarry Smith   $ git bisect start 21af4baa815c 5ae5ab319844
2073*9b92b1d3SBarry Smith   ```
2074*9b92b1d3SBarry Smith
2075*9b92b1d3SBarry Smith   build/test, perhaps discover that this new state is bad
2076*9b92b1d3SBarry Smith
2077*9b92b1d3SBarry Smith   ```console
2078*9b92b1d3SBarry Smith   $ git bisect bad
2079*9b92b1d3SBarry Smith   ```
2080*9b92b1d3SBarry Smith
2081*9b92b1d3SBarry Smith   build/test, perhaps discover that this state is good
2082*9b92b1d3SBarry Smith
2083*9b92b1d3SBarry Smith   ```console
2084*9b92b1d3SBarry Smith   $ git bisect good
2085*9b92b1d3SBarry Smith   ```
2086*9b92b1d3SBarry Smith
2087*9b92b1d3SBarry Smith   Now until done - keep bisecting, building PETSc, and testing your code with it and
2088*9b92b1d3SBarry Smith   determine if the code is working or not. After something like 5-15 iterations, `git
2089*9b92b1d3SBarry Smith   bisect` will pin-point the exact code change that resulted in the difference in
2090*9b92b1d3SBarry Smith   application behavior.
2091*9b92b1d3SBarry Smith
2092*9b92b1d3SBarry Smith:::{tip}
2093*9b92b1d3SBarry SmithSee [git-bisect(1)](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html) and the
2094*9b92b1d3SBarry Smith[debugging section of the Git Book](https://git-scm.com/book/en/Git-Tools-Debugging-with-Git) for more debugging tips.
2095*9b92b1d3SBarry Smith:::
2096*9b92b1d3SBarry Smith
2097*9b92b1d3SBarry Smith### How to fix the error "PMIX Error: error in file gds_ds12_lock_pthread.c"?
2098*9b92b1d3SBarry Smith
2099*9b92b1d3SBarry SmithThis seems to be an error when using Open MPI and OpenBLAS with threads (or perhaps other
2100*9b92b1d3SBarry Smithpackages that use threads).
2101*9b92b1d3SBarry Smith
2102*9b92b1d3SBarry Smith```console
2103*9b92b1d3SBarry Smith$ export PMIX_MCA_gds=hash
2104*9b92b1d3SBarry Smith```
2105*9b92b1d3SBarry Smith
2106*9b92b1d3SBarry SmithShould resolve the problem.
2107*9b92b1d3SBarry Smith
2108*9b92b1d3SBarry Smith______________________________________________________________________
2109*9b92b1d3SBarry Smith
2110*9b92b1d3SBarry Smith(doc_faq_sharedlibs)=
2111*9b92b1d3SBarry Smith
2112*9b92b1d3SBarry Smith## Shared Libraries
2113*9b92b1d3SBarry Smith
2114*9b92b1d3SBarry Smith### Can I install PETSc libraries as shared libraries?
2115*9b92b1d3SBarry Smith
2116*9b92b1d3SBarry SmithYes. Use
2117*9b92b1d3SBarry Smith
2118*9b92b1d3SBarry Smith```console
2119*9b92b1d3SBarry Smith$ ./configure --with-shared-libraries
2120*9b92b1d3SBarry Smith```
2121*9b92b1d3SBarry Smith
2122*9b92b1d3SBarry Smith### Why should I use shared libraries?
2123*9b92b1d3SBarry Smith
2124*9b92b1d3SBarry SmithWhen you link to shared libraries, the function symbols from the shared libraries are not
2125*9b92b1d3SBarry Smithcopied in the executable. This way the size of the executable is considerably smaller than
2126*9b92b1d3SBarry Smithwhen using regular libraries. This helps in a couple of ways:
2127*9b92b1d3SBarry Smith
2128*9b92b1d3SBarry Smith- Saves disk space when more than one executable is created
2129*9b92b1d3SBarry Smith- Improves the compile time immensely, because the compiler has to write a much smaller
2130*9b92b1d3SBarry Smith  file (executable) to the disk.
2131*9b92b1d3SBarry Smith
2132*9b92b1d3SBarry Smith### How do I link to the PETSc shared libraries?
2133*9b92b1d3SBarry Smith
2134*9b92b1d3SBarry SmithBy default, the compiler should pick up the shared libraries instead of the regular
2135*9b92b1d3SBarry Smithones. Nothing special should be done for this.
2136*9b92b1d3SBarry Smith
2137*9b92b1d3SBarry Smith### What if I want to link to the regular `.a` library files?
2138*9b92b1d3SBarry Smith
2139*9b92b1d3SBarry SmithYou must run `configure` with the option `--with-shared-libraries=0` (you can use a
2140*9b92b1d3SBarry Smithdifferent `$PETSC_ARCH` for this build so you can easily switch between the two).
2141*9b92b1d3SBarry Smith
2142*9b92b1d3SBarry Smith### What do I do if I want to move my executable to a different machine?
2143*9b92b1d3SBarry Smith
2144*9b92b1d3SBarry SmithYou would also need to have access to the shared libraries on this new machine. The other
2145*9b92b1d3SBarry Smithalternative is to build the executable without shared libraries by first deleting the
2146*9b92b1d3SBarry Smithshared libraries, and then creating the executable.
2147*9b92b1d3SBarry Smith
2148*9b92b1d3SBarry Smith```{bibliography} /petsc.bib
2149*9b92b1d3SBarry Smith:filter: docname in docnames
2150*9b92b1d3SBarry Smith```
2151