xref: /petsc/doc/faq/index.md (revision 7f296bb328fcd4c99f2da7bfe8ba7ed8a4ebceee)
19b92b1d3SBarry Smith(doc_faq)=
29b92b1d3SBarry Smith
39b92b1d3SBarry Smith# FAQ
49b92b1d3SBarry Smith
59b92b1d3SBarry Smith```{contents} Table Of Contents
69b92b1d3SBarry Smith:backlinks: top
79b92b1d3SBarry Smith:local: true
89b92b1d3SBarry Smith```
99b92b1d3SBarry Smith
109b92b1d3SBarry Smith______________________________________________________________________
119b92b1d3SBarry Smith
129b92b1d3SBarry Smith## General
139b92b1d3SBarry Smith
149b92b1d3SBarry Smith### How can I subscribe to the PETSc mailing lists?
159b92b1d3SBarry Smith
169b92b1d3SBarry SmithSee mailing list {ref}`documentation <doc_mail>`
179b92b1d3SBarry Smith
189b92b1d3SBarry Smith### Any useful books on numerical computing?
199b92b1d3SBarry Smith
209b92b1d3SBarry Smith[Bueler, PETSc for Partial Differential Equations: Numerical Solutions in C and Python](https://my.siam.org/Store/Product/viewproduct/?ProductId=32850137)
219b92b1d3SBarry Smith
229b92b1d3SBarry Smith[Oliveira and Stewart, Writing Scientific Software: A Guide to Good Style](https://www.cambridge.org/core/books/writing-scientific-software/23206704175AF868E43FE3FB399C2F53)
239b92b1d3SBarry Smith
249b92b1d3SBarry Smith(doc_faq_general_parallel)=
259b92b1d3SBarry Smith
269b92b1d3SBarry Smith### What kind of parallel computers or clusters are needed to use PETSc? Or why do I get little speedup?
279b92b1d3SBarry Smith
289b92b1d3SBarry Smith:::{important}
299b92b1d3SBarry SmithPETSc can be used with any kind of parallel system that supports MPI BUT for any decent
309b92b1d3SBarry Smithperformance one needs:
319b92b1d3SBarry Smith
329b92b1d3SBarry Smith- Fast, **low-latency** interconnect; any ethernet (even 10 GigE) simply cannot provide
339b92b1d3SBarry Smith  the needed performance.
349b92b1d3SBarry Smith- High per-core **memory** performance. Each core needs to
359b92b1d3SBarry Smith  have its **own** memory bandwidth of at least 2 or more gigabytes/second. Most modern
369b92b1d3SBarry Smith  computers are not bottlenecked by how fast they can perform
379b92b1d3SBarry Smith  calculations; rather, they are usually restricted by how quickly they can get their
389b92b1d3SBarry Smith  data.
399b92b1d3SBarry Smith:::
409b92b1d3SBarry Smith
419b92b1d3SBarry SmithTo obtain good performance it is important that you know your machine! I.e. how many
429b92b1d3SBarry Smithcompute nodes (generally, how many motherboards), how many memory sockets per node and how
439b92b1d3SBarry Smithmany cores per memory socket and how much memory bandwidth for each.
449b92b1d3SBarry Smith
459b92b1d3SBarry SmithIf you do not know this and can run MPI programs with mpiexec (that is, you don't have
469b92b1d3SBarry Smithbatch system), run the following from `$PETSC_DIR`:
479b92b1d3SBarry Smith
489b92b1d3SBarry Smith```console
499b92b1d3SBarry Smith$ make streams [NPMAX=maximum_number_of_mpi_processes_you_plan_to_use]
509b92b1d3SBarry Smith```
519b92b1d3SBarry Smith
529b92b1d3SBarry SmithThis will provide a summary of the bandwidth with different number of MPI
539b92b1d3SBarry Smithprocesses and potential speedups. See {any}`ch_streams` for a detailed discussion.
549b92b1d3SBarry Smith
559b92b1d3SBarry SmithIf you have a batch system:
569b92b1d3SBarry Smith
579b92b1d3SBarry Smith```console
589b92b1d3SBarry Smith$ cd $PETSC_DIR/src/benchmarks/streams
599b92b1d3SBarry Smith$ make MPIVersion
609b92b1d3SBarry Smithsubmit MPIVersion to the batch system a number of times with 1, 2, 3, etc. MPI processes
619b92b1d3SBarry Smithcollecting all of the output from the runs into the single file scaling.log. Copy
629b92b1d3SBarry Smithscaling.log into the src/benchmarks/streams directory.
639b92b1d3SBarry Smith$ ./process.py createfile ; process.py
649b92b1d3SBarry Smith```
659b92b1d3SBarry Smith
669b92b1d3SBarry SmithEven if you have enough memory bandwidth if the OS switches processes between cores
679b92b1d3SBarry Smithperformance can degrade. Smart process to core/socket binding (this just means locking a
689b92b1d3SBarry Smithprocess to a particular core or memory socket) may help you. For example, consider using
699b92b1d3SBarry Smithfewer processes than cores and binding processes to separate sockets so that each process
709b92b1d3SBarry Smithuses a different memory bus:
719b92b1d3SBarry Smith
729b92b1d3SBarry Smith- [MPICH2 binding with the Hydra process manager](https://github.com/pmodels/mpich/blob/main/doc/wiki/how_to/Using_the_Hydra_Process_Manager.md#process-core-binding]
739b92b1d3SBarry Smith
749b92b1d3SBarry Smith  ```console
759b92b1d3SBarry Smith  $ mpiexec.hydra -n 4 --binding cpu:sockets
769b92b1d3SBarry Smith  ```
779b92b1d3SBarry Smith
789b92b1d3SBarry Smith- [Open MPI binding](https://www.open-mpi.org/faq/?category=tuning#using-paffinity)
799b92b1d3SBarry Smith
809b92b1d3SBarry Smith  ```console
819b92b1d3SBarry Smith  $ mpiexec -n 4 --map-by socket --bind-to socket --report-bindings
829b92b1d3SBarry Smith  ```
839b92b1d3SBarry Smith
849b92b1d3SBarry Smith- `taskset`, part of the [util-linux](https://github.com/karelzak/util-linux) package
859b92b1d3SBarry Smith
869b92b1d3SBarry Smith  Check `man taskset` for details. Make sure to set affinity for **your** program,
879b92b1d3SBarry Smith  **not** for the `mpiexec` program.
889b92b1d3SBarry Smith
899b92b1d3SBarry Smith- `numactl`
909b92b1d3SBarry Smith
919b92b1d3SBarry Smith  In addition to task affinity, this tool also allows changing the default memory affinity
929b92b1d3SBarry Smith  policy. On Linux, the default policy is to attempt to find memory on the same memory bus
939b92b1d3SBarry Smith  that serves the core that a thread is running on when the memory is faulted
949b92b1d3SBarry Smith  (not when `malloc()` is called). If local memory is not available, it is found
959b92b1d3SBarry Smith  elsewhere, possibly leading to serious memory imbalances.
969b92b1d3SBarry Smith
979b92b1d3SBarry Smith  The option `--localalloc` allocates memory on the local NUMA node, similar to the
989b92b1d3SBarry Smith  `numa_alloc_local()` function in the `libnuma` library. The option
999b92b1d3SBarry Smith  `--cpunodebind=nodes` binds the process to a given NUMA node (note that this can be
1009b92b1d3SBarry Smith  larger or smaller than a CPU (socket); a NUMA node usually has multiple cores).
1019b92b1d3SBarry Smith
1029b92b1d3SBarry Smith  The option `--physcpubind=cpus` binds the process to a given processor core (numbered
1039b92b1d3SBarry Smith  according to `/proc/cpuinfo`, therefore including logical cores if Hyper-threading is
1049b92b1d3SBarry Smith  enabled).
1059b92b1d3SBarry Smith
1069b92b1d3SBarry Smith  With Open MPI, you can use knowledge of the NUMA hierarchy and core numbering on your
1079b92b1d3SBarry Smith  machine to calculate the correct NUMA node or processor number given the environment
1089b92b1d3SBarry Smith  variable `OMPI_COMM_WORLD_LOCAL_RANK`. In most cases, it is easier to make mpiexec or
1099b92b1d3SBarry Smith  a resource manager set affinities.
1109b92b1d3SBarry Smith
1119b92b1d3SBarry SmithThe software [Open-MX](http://open-mx.gforge.inria.fr) provides faster speed for
1129b92b1d3SBarry Smithethernet systems, we have not tried it but it claims it can dramatically reduce latency
1139b92b1d3SBarry Smithand increase bandwidth on Linux system. You must first install this software and then
1149b92b1d3SBarry Smithinstall MPICH or Open MPI to use it.
1159b92b1d3SBarry Smith
1169b92b1d3SBarry Smith### What kind of license is PETSc released under?
1179b92b1d3SBarry Smith
1189b92b1d3SBarry SmithSee licensing {ref}`documentation <doc_license>`
1199b92b1d3SBarry Smith
1209b92b1d3SBarry Smith### Why is PETSc written in C, instead of Fortran or C++?
1219b92b1d3SBarry Smith
1229b92b1d3SBarry SmithWhen this decision was made, in the early 1990s, C enabled us to build data structures
1239b92b1d3SBarry Smithfor storing sparse matrices, solver information,
1249b92b1d3SBarry Smithetc. in ways that Fortran simply did not allow. ANSI C was a complete standard that all
1259b92b1d3SBarry Smithmodern C compilers supported. The language was identical on all machines. C++ was still
1269b92b1d3SBarry Smithevolving and compilers on different machines were not identical. Using C function pointers
1279b92b1d3SBarry Smithto provide data encapsulation and polymorphism allowed us to get many of the advantages of
1289b92b1d3SBarry SmithC++ without using such a large and more complicated language. It would have been natural and
1299b92b1d3SBarry Smithreasonable to have coded PETSc in C++; we opted to use C instead.
1309b92b1d3SBarry Smith
1319b92b1d3SBarry Smith### Does all the PETSc error checking and logging reduce PETSc's efficiency?
1329b92b1d3SBarry Smith
1339b92b1d3SBarry SmithNo
1349b92b1d3SBarry Smith
1359b92b1d3SBarry Smith(doc_faq_maintenance_strats)=
1369b92b1d3SBarry Smith
1379b92b1d3SBarry Smith### How do such a small group of people manage to write and maintain such a large and marvelous package as PETSc?
1389b92b1d3SBarry Smith
1399b92b1d3SBarry Smith1. **We work very efficiently**.
1409b92b1d3SBarry Smith
1419b92b1d3SBarry Smith   - We use powerful editors and programming environments.
1429b92b1d3SBarry Smith   - Our manual pages are generated automatically from formatted comments in the code,
1439b92b1d3SBarry Smith     thus alleviating the need for creating and maintaining manual pages.
1449b92b1d3SBarry Smith   - We employ continuous integration testing of the entire PETSc library on many different
1459b92b1d3SBarry Smith     machine architectures. This process **significantly** protects (no bug-catching
1469b92b1d3SBarry Smith     process is perfect) against inadvertently introducing bugs with new additions. Every
1479b92b1d3SBarry Smith     new feature **must** pass our suite of thousands of tests as well as formal code
1489b92b1d3SBarry Smith     review before it may be included.
1499b92b1d3SBarry Smith
1509b92b1d3SBarry Smith2. **We are very careful in our design (and are constantly revising our design)**
1519b92b1d3SBarry Smith
1529b92b1d3SBarry Smith   - PETSc as a package should be easy to use, write, and maintain. Our mantra is to write
1539b92b1d3SBarry Smith     code like everyone is using it.
1549b92b1d3SBarry Smith
1559b92b1d3SBarry Smith3. **We are willing to do the grunt work**
1569b92b1d3SBarry Smith
1579b92b1d3SBarry Smith   - PETSc is regularly checked to make sure that all code conforms to our interface
1589b92b1d3SBarry Smith     design. We will never keep in a bad design decision simply because changing it will
1599b92b1d3SBarry Smith     require a lot of editing; we do a lot of editing.
1609b92b1d3SBarry Smith
1619b92b1d3SBarry Smith4. **We constantly seek out and experiment with new design ideas**
1629b92b1d3SBarry Smith
1639b92b1d3SBarry Smith   - We retain the useful ones and discard the rest. All of these decisions are based not
1649b92b1d3SBarry Smith     just on performance, but also on **practicality**.
1659b92b1d3SBarry Smith
1669b92b1d3SBarry Smith5. **Function and variable names must adhere to strict guidelines**
1679b92b1d3SBarry Smith
1689b92b1d3SBarry Smith   - Even the rules about capitalization are designed to make it easy to figure out the
1699b92b1d3SBarry Smith     name of a particular object or routine. Our memories are terrible, so careful
1709b92b1d3SBarry Smith     consistent naming puts less stress on our limited human RAM.
1719b92b1d3SBarry Smith
1729b92b1d3SBarry Smith6. **The PETSc directory tree is designed to make it easy to move throughout the
1739b92b1d3SBarry Smith   entire package**
1749b92b1d3SBarry Smith
1759b92b1d3SBarry Smith7. **We have a rich, robust, and fast bug reporting system**
1769b92b1d3SBarry Smith
1779b92b1d3SBarry Smith   - <mailto:petsc-maint@mcs.anl.gov> is always checked, and we pride ourselves on responding
1789b92b1d3SBarry Smith     quickly and accurately. Email is very lightweight, and so bug reports system retains
1799b92b1d3SBarry Smith     an archive of all reported problems and fixes, so it is easy to re-find fixes to
1809b92b1d3SBarry Smith     previously discovered problems.
1819b92b1d3SBarry Smith
1829b92b1d3SBarry Smith8. **We contain the complexity of PETSc by using powerful object-oriented programming
1839b92b1d3SBarry Smith   techniques**
1849b92b1d3SBarry Smith
1859b92b1d3SBarry Smith   - Data encapsulation serves to abstract complex data formats or movement to
1869b92b1d3SBarry Smith     human-readable format. This is why your program cannot, for example, look directly
1879b92b1d3SBarry Smith     at what is inside the object `Mat`.
1889b92b1d3SBarry Smith   - Polymorphism makes changing program behavior as easy as possible, and further
1899b92b1d3SBarry Smith     abstracts the *intent* of your program from what is *written* in code. You call
1909b92b1d3SBarry Smith     `MatMult()` regardless of whether your matrix is dense, sparse, parallel or
1919b92b1d3SBarry Smith     sequential; you don't call a different routine for each format.
1929b92b1d3SBarry Smith
1939b92b1d3SBarry Smith9. **We try to provide the functionality requested by our users**
1949b92b1d3SBarry Smith
1959b92b1d3SBarry Smith### For complex numbers will I get better performance with C++?
1969b92b1d3SBarry Smith
1979b92b1d3SBarry SmithTo use PETSc with complex numbers you may use the following `configure` options;
1989b92b1d3SBarry Smith`--with-scalar-type=complex` and either `--with-clanguage=c++` or (the default)
1999b92b1d3SBarry Smith`--with-clanguage=c`. In our experience they will deliver very similar performance
2009b92b1d3SBarry Smith(speed), but if one is concerned they should just try both and see if one is faster.
2019b92b1d3SBarry Smith
2029b92b1d3SBarry Smith### How come when I run the same program on the same number of processes I get a "different" answer?
2039b92b1d3SBarry Smith
2049b92b1d3SBarry SmithInner products and norms in PETSc are computed using the `MPI_Allreduce()` command. For
2059b92b1d3SBarry Smithdifferent runs the order at which values arrive at a given process (via MPI) can be in a
2069b92b1d3SBarry Smithdifferent order, thus the order in which some floating point arithmetic operations are
2079b92b1d3SBarry Smithperformed will be different. Since floating point arithmetic is not
2089b92b1d3SBarry Smithassociative, the computed quantity may be slightly different.
2099b92b1d3SBarry Smith
2109b92b1d3SBarry SmithOver a run the many slight differences in the inner products and norms will effect all the
2119b92b1d3SBarry Smithcomputed results. It is important to realize that none of the computed answers are any
2129b92b1d3SBarry Smithless right or wrong (in fact the sequential computation is no more right then the parallel
2139b92b1d3SBarry Smithones). All answers are equal, but some are more equal than others.
2149b92b1d3SBarry Smith
2159b92b1d3SBarry SmithThe discussion above assumes that the exact same algorithm is being used on the different
2169b92b1d3SBarry Smithnumber of processes. When the algorithm is different for the different number of processes
2179b92b1d3SBarry Smith(almost all preconditioner algorithms except Jacobi are different for different number of
2189b92b1d3SBarry Smithprocesses) then one expects to see (and does) a greater difference in results for
2199b92b1d3SBarry Smithdifferent numbers of processes. In some cases (for example block Jacobi preconditioner) it
2209b92b1d3SBarry Smithmay be that the algorithm works for some number of processes and does not work for others.
2219b92b1d3SBarry Smith
2229b92b1d3SBarry Smith### How come when I run the same linear solver on a different number of processes it takes a different number of iterations?
2239b92b1d3SBarry Smith
2249b92b1d3SBarry SmithThe convergence of many of the preconditioners in PETSc including the default parallel
2259b92b1d3SBarry Smithpreconditioner block Jacobi depends on the number of processes. The more processes the
2269b92b1d3SBarry Smith(slightly) slower convergence it has. This is the nature of iterative solvers, the more
2279b92b1d3SBarry Smithparallelism means the more "older" information is used in the solution process hence
2289b92b1d3SBarry Smithslower convergence.
2299b92b1d3SBarry Smith
2309b92b1d3SBarry Smith(doc_faq_gpuhowto)=
2319b92b1d3SBarry Smith
2329b92b1d3SBarry Smith### Can PETSc use GPUs to speed up computations?
2339b92b1d3SBarry Smith
2349b92b1d3SBarry Smith:::{seealso}
2359b92b1d3SBarry SmithSee GPU development {ref}`roadmap <doc_gpu_roadmap>` for the latest information
2369b92b1d3SBarry Smithregarding the state of PETSc GPU integration.
2379b92b1d3SBarry Smith
2389b92b1d3SBarry SmithSee GPU install {ref}`documentation <doc_config_accel>` for up-to-date information on
2399b92b1d3SBarry Smithinstalling PETSc to use GPU's.
2409b92b1d3SBarry Smith:::
2419b92b1d3SBarry Smith
2429b92b1d3SBarry SmithQuick summary of usage with CUDA:
2439b92b1d3SBarry Smith
2449b92b1d3SBarry Smith- The `VecType` `VECSEQCUDA`, `VECMPICUDA`, or `VECCUDA` may be used with
2459b92b1d3SBarry Smith  `VecSetType()` or `-vec_type seqcuda`, `mpicuda`, or `cuda` when
2469b92b1d3SBarry Smith  `VecSetFromOptions()` is used.
2479b92b1d3SBarry Smith- The `MatType` `MATSEQAIJCUSPARSE`, `MATMPIAIJCUSPARSE`, or `MATAIJCUSPARSE`
2489b92b1d3SBarry Smith  may be used with `MatSetType()` or `-mat_type seqaijcusparse`, `mpiaijcusparse`, or
2499b92b1d3SBarry Smith  `aijcusparse` when `MatSetFromOptions()` is used.
2509b92b1d3SBarry Smith- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type
2519b92b1d3SBarry Smith  cuda` and `-dm_mat_type aijcusparse`.
2529b92b1d3SBarry Smith
2539b92b1d3SBarry SmithQuick summary of usage with OpenCL (provided by the ViennaCL library):
2549b92b1d3SBarry Smith
2559b92b1d3SBarry Smith- The `VecType` `VECSEQVIENNACL`, `VECMPIVIENNACL`, or `VECVIENNACL` may be used
2569b92b1d3SBarry Smith  with `VecSetType()` or `-vec_type seqviennacl`, `mpiviennacl`, or `viennacl`
2579b92b1d3SBarry Smith  when `VecSetFromOptions()` is used.
2589b92b1d3SBarry Smith- The `MatType` `MATSEQAIJVIENNACL`, `MATMPIAIJVIENNACL`, or `MATAIJVIENNACL`
2599b92b1d3SBarry Smith  may be used with `MatSetType()` or `-mat_type seqaijviennacl`, `mpiaijviennacl`, or
2609b92b1d3SBarry Smith  `aijviennacl` when `MatSetFromOptions()` is used.
2619b92b1d3SBarry Smith- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type
2629b92b1d3SBarry Smith  viennacl` and `-dm_mat_type aijviennacl`.
2639b92b1d3SBarry Smith
2649b92b1d3SBarry SmithGeneral hints:
2659b92b1d3SBarry Smith
2669b92b1d3SBarry Smith- It is useful to develop your code with the default vectors and then run production runs
2679b92b1d3SBarry Smith  with the command line options to use the GPU since debugging on GPUs is difficult.
2689b92b1d3SBarry Smith- All of the Krylov methods except `KSPIBCGS` run on the GPU.
2699b92b1d3SBarry Smith- Parts of most preconditioners run directly on the GPU. After setup, `PCGAMG` runs
2709b92b1d3SBarry Smith  fully on GPUs, without any memory copies between the CPU and GPU.
2719b92b1d3SBarry Smith
2729b92b1d3SBarry SmithSome GPU systems (for example many laptops) only run with single precision; thus, PETSc
2739b92b1d3SBarry Smithmust be built with the `configure` option `--with-precision=single`.
2749b92b1d3SBarry Smith
2759b92b1d3SBarry Smith(doc_faq_extendedprecision)=
2769b92b1d3SBarry Smith
2779b92b1d3SBarry Smith### Can I run PETSc with extended precision?
2789b92b1d3SBarry Smith
2799b92b1d3SBarry SmithYes, with gcc and gfortran. `configure` PETSc using the
2809b92b1d3SBarry Smithoptions `--with-precision=__float128` and `--download-f2cblaslapack`.
2819b92b1d3SBarry Smith
2829b92b1d3SBarry Smith:::{admonition} Warning
2839b92b1d3SBarry Smith:class: yellow
2849b92b1d3SBarry Smith
2859b92b1d3SBarry SmithExternal packages are not guaranteed to work in this mode!
2869b92b1d3SBarry Smith:::
2879b92b1d3SBarry Smith
2889b92b1d3SBarry Smith### Why doesn't PETSc use Qd to implement support for extended precision?
2899b92b1d3SBarry Smith
2909b92b1d3SBarry SmithWe tried really hard but could not. The problem is that the QD c++ classes, though they
2919b92b1d3SBarry Smithtry to, implement the built-in data types of `double` are not native types and cannot
2929b92b1d3SBarry Smith"just be used" in a general piece of numerical source code. Ratherm the code has to
2939b92b1d3SBarry Smithrewritten to live within the limitations of QD classes. However PETSc can be built to use
2949b92b1d3SBarry Smithquad precision, as detailed {ref}`here <doc_faq_extendedprecision>`.
2959b92b1d3SBarry Smith
2969b92b1d3SBarry Smith### How do I cite PETSc?
2979b92b1d3SBarry Smith
2989b92b1d3SBarry SmithUse {any}`these citations <doc_index_citing_petsc>`.
2999b92b1d3SBarry Smith
3009b92b1d3SBarry Smith______________________________________________________________________
3019b92b1d3SBarry Smith
3029b92b1d3SBarry Smith## Installation
3039b92b1d3SBarry Smith
3049b92b1d3SBarry Smith### How do I begin using PETSc if the software has already been completely built and installed by someone else?
3059b92b1d3SBarry Smith
3069b92b1d3SBarry SmithAssuming that the PETSc libraries have been successfully built for a particular
3079b92b1d3SBarry Smitharchitecture and level of optimization, a new user must merely:
3089b92b1d3SBarry Smith
3099b92b1d3SBarry Smith1. Set `PETSC_DIR` to the full path of the PETSc home
3109b92b1d3SBarry Smith   directory. This will be the location of the `configure` script, and usually called
3119b92b1d3SBarry Smith   "petsc" or some variation of that (for example, /home/username/petsc).
3129b92b1d3SBarry Smith2. Set `PETSC_ARCH`, which indicates the configuration on which PETSc will be
3139b92b1d3SBarry Smith   used. Note that `$PETSC_ARCH` is simply a name the installer used when installing
3149b92b1d3SBarry Smith   the libraries. There will exist a directory within `$PETSC_DIR` that is named after
3159b92b1d3SBarry Smith   its corresponding `$PETSC_ARCH`. There many be several on a single system, for
3169b92b1d3SBarry Smith   example "linux-c-debug" for the debug versions compiled by a C compiler or
3179b92b1d3SBarry Smith   "linux-c-opt" for the optimized version.
3189b92b1d3SBarry Smith
3199b92b1d3SBarry Smith:::{admonition} Still Stuck?
3209b92b1d3SBarry SmithSee the {ref}`quick-start tutorial <tut_install>` for a step-by-step guide on
3219b92b1d3SBarry Smithinstalling PETSc, in case you have missed a step.
3229b92b1d3SBarry Smith
323*7f296bb3SBarry SmithSee the users manual section on {ref}`getting started <sec_getting_started>`.
3249b92b1d3SBarry Smith:::
3259b92b1d3SBarry Smith
3269b92b1d3SBarry Smith### The PETSc distribution is SO Large. How can I reduce my disk space usage?
3279b92b1d3SBarry Smith
3289b92b1d3SBarry Smith1. The PETSc users manual is provided in PDF format at `$PETSC_DIR/manual.pdf`. You
3299b92b1d3SBarry Smith   can delete this.
3309b92b1d3SBarry Smith2. The PETSc test suite contains sample output for many of the examples. These are
3319b92b1d3SBarry Smith   contained in the PETSc directories `$PETSC_DIR/src/*/tutorials/output` and
3329b92b1d3SBarry Smith   `$PETSC_DIR/src/*/tests/output`. Once you have run the test examples, you may remove
3339b92b1d3SBarry Smith   all of these directories to save some disk space. You can locate the largest with
3349b92b1d3SBarry Smith   e.g. `find . -name output -type d | xargs du -sh | sort -hr` on a Unix-based system.
3359b92b1d3SBarry Smith3. The debugging versions of the libraries are larger than the optimized versions. In a
3369b92b1d3SBarry Smith   pinch you can work with the optimized version, although we bid you good luck in
3379b92b1d3SBarry Smith   finding bugs as it is much easier with the debug version.
3389b92b1d3SBarry Smith
3399b92b1d3SBarry Smith### I want to use PETSc only for uniprocessor programs. Must I still install and use a version of MPI?
3409b92b1d3SBarry Smith
3419b92b1d3SBarry SmithNo, run `configure` with the option `--with-mpi=0`
3429b92b1d3SBarry Smith
3439b92b1d3SBarry Smith### Can I install PETSc to not use X windows (either under Unix or Microsoft Windows with GCC)?
3449b92b1d3SBarry Smith
3459b92b1d3SBarry SmithYes. Run `configure` with the additional flag `--with-x=0`
3469b92b1d3SBarry Smith
3479b92b1d3SBarry Smith### Why do you use MPI?
3489b92b1d3SBarry Smith
3499b92b1d3SBarry SmithMPI is the message-passing standard. Because it is a standard, it will not frequently change over
3509b92b1d3SBarry Smithtime; thus, we do not have to change PETSc every time the provider of the message-passing
3519b92b1d3SBarry Smithsystem decides to make an interface change. MPI was carefully designed by experts from
3529b92b1d3SBarry Smithindustry, academia, and government labs to provide the highest quality performance and
3539b92b1d3SBarry Smithcapability.
3549b92b1d3SBarry Smith
3559b92b1d3SBarry SmithFor example, the careful design of communicators in MPI allows the easy nesting of
3569b92b1d3SBarry Smithdifferent libraries; no other message-passing system provides this support. All of the
3579b92b1d3SBarry Smithmajor parallel computer vendors were involved in the design of MPI and have committed to
3589b92b1d3SBarry Smithproviding quality implementations.
3599b92b1d3SBarry Smith
3609b92b1d3SBarry SmithIn addition, since MPI is a standard, several different groups have already provided
3619b92b1d3SBarry Smithcomplete free implementations. Thus, one does not have to rely on the technical skills of
3629b92b1d3SBarry Smithone particular group to provide the message-passing libraries. Today, MPI is the only
3639b92b1d3SBarry Smithpractical, portable approach to writing efficient parallel numerical software.
3649b92b1d3SBarry Smith
3659b92b1d3SBarry Smith(invalid_mpi_compilers)=
3669b92b1d3SBarry Smith
3679b92b1d3SBarry Smith### What do I do if my MPI compiler wrappers are invalid?
3689b92b1d3SBarry Smith
3699b92b1d3SBarry SmithMost MPI implementations provide compiler wrappers (such as `mpicc`) which give the
3709b92b1d3SBarry Smithinclude and link options necessary to use that version of MPI to the underlying compilers
3719b92b1d3SBarry Smith. Configuration will fail if these wrappers are either absent or broken in the MPI pointed to by
3729b92b1d3SBarry Smith`--with-mpi-dir`. You can rerun the configure with the additional option
3739b92b1d3SBarry Smith`--with-mpi-compilers=0`, which will try to auto-detect working compilers; however,
3749b92b1d3SBarry Smiththese compilers may be incompatible with the particular MPI build. If this fix does not
3759b92b1d3SBarry Smithwork, run with `--with-cc=[your_c_compiler]` where you know `your_c_compiler` works
3769b92b1d3SBarry Smithwith this particular MPI, and likewise for C++ (`--with-cxx=[your_cxx_compiler]`) and Fortran
3779b92b1d3SBarry Smith(`--with-fc=[your_fortran_compiler]`).
3789b92b1d3SBarry Smith
3799b92b1d3SBarry Smith(bit_indices)=
3809b92b1d3SBarry Smith
3819b92b1d3SBarry Smith### When should/can I use the `configure` option `--with-64-bit-indices`?
3829b92b1d3SBarry Smith
3839b92b1d3SBarry SmithBy default the type that PETSc uses to index into arrays and keep sizes of arrays is a
3849b92b1d3SBarry Smith`PetscInt` defined to be a 32-bit `int`. If your problem:
3859b92b1d3SBarry Smith
3869b92b1d3SBarry Smith- Involves more than 2^31 - 1 unknowns (around 2 billion).
3879b92b1d3SBarry Smith- Your matrix might contain more than 2^31 - 1 nonzeros on a single process.
3889b92b1d3SBarry Smith
3899b92b1d3SBarry SmithThen you need to use this option. Otherwise you will get strange crashes.
3909b92b1d3SBarry Smith
3919b92b1d3SBarry SmithThis option can be used when you are using either 32 or 64-bit pointers. You do not
3929b92b1d3SBarry Smithneed to use this option if you are using 64-bit pointers unless the two conditions above
3939b92b1d3SBarry Smithhold.
3949b92b1d3SBarry Smith
3959b92b1d3SBarry Smith### What if I get an internal compiler error?
3969b92b1d3SBarry Smith
3979b92b1d3SBarry SmithYou can rebuild the offending file individually with a lower optimization level. **Then
3989b92b1d3SBarry Smithmake sure to complain to the compiler vendor and file a bug report**. For example, if the
3999b92b1d3SBarry Smithcompiler chokes on `src/mat/impls/baij/seq/baijsolvtrannat.c` you can run the following
4009b92b1d3SBarry Smithfrom `$PETSC_DIR`:
4019b92b1d3SBarry Smith
4029b92b1d3SBarry Smith```console
4039b92b1d3SBarry Smith$ make -f gmakefile PCC_FLAGS="-O1" $PETSC_ARCH/obj/src/mat/impls/baij/seq/baijsolvtrannat.o
4049b92b1d3SBarry Smith$ make all
4059b92b1d3SBarry Smith```
4069b92b1d3SBarry Smith
4079b92b1d3SBarry Smith### How do I enable Python bindings (petsc4py) with PETSc?
4089b92b1d3SBarry Smith
4099b92b1d3SBarry Smith1. Install [Cython](https://cython.org/).
4109b92b1d3SBarry Smith2. `configure` PETSc with the `--with-petsc4py=1` option.
4119b92b1d3SBarry Smith3. set `PYTHONPATH=$PETSC_DIR/$PETSC_ARCH/lib`
4129b92b1d3SBarry Smith
4139b92b1d3SBarry Smith(macos_gfortran)=
4149b92b1d3SBarry Smith
4159b92b1d3SBarry Smith### What Fortran compiler do you recommend on macOS?
4169b92b1d3SBarry Smith
4179b92b1d3SBarry SmithWe recommend using [homebrew](https://brew.sh/) to install [gfortran](https://gcc.gnu.org/wiki/GFortran), see {any}`doc_macos_install`
4189b92b1d3SBarry Smith
4199b92b1d3SBarry Smith### How can I find the URL locations of the packages you install using `--download-PACKAGE`?
4209b92b1d3SBarry Smith
4219b92b1d3SBarry Smith```console
4229b92b1d3SBarry Smith$ grep "self.download " $PETSC_DIR/config/BuildSystem/config/packages/*.py
4239b92b1d3SBarry Smith```
4249b92b1d3SBarry Smith
4259b92b1d3SBarry Smith### How to fix the problem: PETSc was configured with one MPICH (or Open MPI) `mpi.h` version but now appears to be compiling using a different MPICH (or Open MPI) `mpi.h` version
4269b92b1d3SBarry Smith
4279b92b1d3SBarry SmithThis happens for generally one of two reasons:
4289b92b1d3SBarry Smith
4299b92b1d3SBarry Smith- You previously ran `configure` with the option `--download-mpich` (or `--download-openmpi`)
4309b92b1d3SBarry Smith  but later ran `configure` to use a version of MPI already installed on the
4319b92b1d3SBarry Smith  machine. Solution:
4329b92b1d3SBarry Smith
4339b92b1d3SBarry Smith  ```console
4349b92b1d3SBarry Smith  $ rm -rf $PETSC_DIR/$PETSC_ARCH
4359b92b1d3SBarry Smith  $ ./configure --your-args
4369b92b1d3SBarry Smith  ```
4379b92b1d3SBarry Smith
4389b92b1d3SBarry Smith(mpi_network_misconfigure)=
4399b92b1d3SBarry Smith
4409b92b1d3SBarry Smith### What does it mean when `make check` hangs or errors on `PetscOptionsInsertFile()`?
4419b92b1d3SBarry Smith
4429b92b1d3SBarry SmithFor example:
4439b92b1d3SBarry Smith
4449b92b1d3SBarry Smith```none
4459b92b1d3SBarry SmithPossible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes
4469b92b1d3SBarry SmithSee https://petsc.org/release/faq/
4479b92b1d3SBarry Smith[0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
4489b92b1d3SBarry Smith[0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
4499b92b1d3SBarry Smith[0]PETSC ERROR: #3 PetscInitialize() line 828 in /Users/barrysmith/Src/PETSc/src/sys/objects/pinit.c
4509b92b1d3SBarry Smith```
4519b92b1d3SBarry Smith
4529b92b1d3SBarry Smithor
4539b92b1d3SBarry Smith
4549b92b1d3SBarry Smith```none
4559b92b1d3SBarry Smith$ make check
4569b92b1d3SBarry SmithRunning check examples to verify correct installation
4579b92b1d3SBarry SmithUsing PETSC_DIR=/Users/barrysmith/Src/petsc and PETSC_ARCH=arch-fix-mpiexec-hang-2-ranks
4589b92b1d3SBarry SmithC/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
4599b92b1d3SBarry SmithPROGRAM SEEMS TO BE HANGING HERE
4609b92b1d3SBarry Smith```
4619b92b1d3SBarry Smith
4629b92b1d3SBarry SmithThis usually occurs when network settings are misconfigured (perhaps due to VPN) resulting in a failure or hang in system call `gethostbyname()`.
4639b92b1d3SBarry Smith
4649b92b1d3SBarry Smith- Verify you are using the correct `mpiexec` for the MPI you have linked PETSc with.
4659b92b1d3SBarry Smith
4669b92b1d3SBarry Smith- If you have a VPN enabled on your machine, try turning it off and then running `make check` to
4679b92b1d3SBarry Smith  verify that it is not the VPN playing poorly with MPI.
4689b92b1d3SBarry Smith
469*7f296bb3SBarry Smith- If  ``ping `hostname` `` (`/sbin/ping` on macOS) fails or hangs do:
4709b92b1d3SBarry Smith
4719b92b1d3SBarry Smith  ```none
4729b92b1d3SBarry Smith  echo 127.0.0.1 `hostname` | sudo tee -a /etc/hosts
4739b92b1d3SBarry Smith  ```
4749b92b1d3SBarry Smith
4759b92b1d3SBarry Smith  and try `make check` again.
4769b92b1d3SBarry Smith
4779b92b1d3SBarry Smith- Try completely disconnecting your machine from the network and see if `make check` then works
4789b92b1d3SBarry Smith
4799b92b1d3SBarry Smith- Try the PETSc `configure` option `--download-mpich-device=ch3:nemesis` with `--download-mpich`.
4809b92b1d3SBarry Smith
4819b92b1d3SBarry Smith______________________________________________________________________
4829b92b1d3SBarry Smith
4839b92b1d3SBarry Smith## Usage
4849b92b1d3SBarry Smith
4859b92b1d3SBarry Smith### How can I redirect PETSc's `stdout` and `stderr` when programming with a GUI interface in Windows Developer Studio or to C++ streams?
4869b92b1d3SBarry Smith
4879b92b1d3SBarry SmithTo overload just the error messages write your own `MyPrintError()` function that does
4889b92b1d3SBarry Smithwhatever you want (including pop up windows etc) and use it like below.
4899b92b1d3SBarry Smith
4909b92b1d3SBarry Smith```c
4919b92b1d3SBarry Smithextern "C" {
4929b92b1d3SBarry Smith  int PASCAL WinMain(HINSTANCE,HINSTANCE,LPSTR,int);
4939b92b1d3SBarry Smith};
4949b92b1d3SBarry Smith
4959b92b1d3SBarry Smith#include <petscsys.h>
4969b92b1d3SBarry Smith#include <mpi.h>
4979b92b1d3SBarry Smith
4989b92b1d3SBarry Smithconst char help[] = "Set up from main";
4999b92b1d3SBarry Smith
5009b92b1d3SBarry Smithint MyPrintError(const char error[], ...)
5019b92b1d3SBarry Smith{
5029b92b1d3SBarry Smith  printf("%s", error);
5039b92b1d3SBarry Smith  return 0;
5049b92b1d3SBarry Smith}
5059b92b1d3SBarry Smith
5069b92b1d3SBarry Smithint main(int ac, char *av[])
5079b92b1d3SBarry Smith{
5089b92b1d3SBarry Smith  char           buf[256];
5099b92b1d3SBarry Smith  HINSTANCE      inst;
5109b92b1d3SBarry Smith  PetscErrorCode ierr;
5119b92b1d3SBarry Smith
5129b92b1d3SBarry Smith  inst = (HINSTANCE)GetModuleHandle(NULL);
5139b92b1d3SBarry Smith  PetscErrorPrintf = MyPrintError;
5149b92b1d3SBarry Smith
5159b92b1d3SBarry Smith  buf[0] = 0;
5169b92b1d3SBarry Smith  for (int i = 1; i < ac; ++i) {
5179b92b1d3SBarry Smith    strcat(buf, av[i]);
5189b92b1d3SBarry Smith    strcat(buf, " ");
5199b92b1d3SBarry Smith  }
5209b92b1d3SBarry Smith
5219b92b1d3SBarry Smith  PetscCall(PetscInitialize(&ac, &av, NULL, help));
5229b92b1d3SBarry Smith
5239b92b1d3SBarry Smith  return WinMain(inst, NULL, buf, SW_SHOWNORMAL);
5249b92b1d3SBarry Smith}
5259b92b1d3SBarry Smith```
5269b92b1d3SBarry Smith
5279b92b1d3SBarry SmithPlace this file in the project and compile with this preprocessor definitions:
5289b92b1d3SBarry Smith
5299b92b1d3SBarry Smith```
5309b92b1d3SBarry SmithWIN32
5319b92b1d3SBarry Smith_DEBUG
5329b92b1d3SBarry Smith_CONSOLE
5339b92b1d3SBarry Smith_MBCS
5349b92b1d3SBarry SmithUSE_PETSC_LOG
5359b92b1d3SBarry SmithUSE_PETSC_BOPT_g
5369b92b1d3SBarry SmithUSE_PETSC_STACK
5379b92b1d3SBarry Smith_AFXDLL
5389b92b1d3SBarry Smith```
5399b92b1d3SBarry Smith
5409b92b1d3SBarry SmithAnd these link options:
5419b92b1d3SBarry Smith
5429b92b1d3SBarry Smith```
5439b92b1d3SBarry Smith/nologo
5449b92b1d3SBarry Smith/subsystem:console
5459b92b1d3SBarry Smith/incremental:yes
5469b92b1d3SBarry Smith/debug
5479b92b1d3SBarry Smith/machine:I386
5489b92b1d3SBarry Smith/nodefaultlib:"libcmtd.lib"
5499b92b1d3SBarry Smith/nodefaultlib:"libcd.lib"
5509b92b1d3SBarry Smith/nodefaultlib:"mvcrt.lib"
5519b92b1d3SBarry Smith/pdbtype:sept
5529b92b1d3SBarry Smith```
5539b92b1d3SBarry Smith
5549b92b1d3SBarry Smith:::{note}
5559b92b1d3SBarry SmithThe above is compiled and linked as if it was a console program. The linker will search
5569b92b1d3SBarry Smithfor a main, and then from it the `WinMain` will start. This works with MFC templates and
5579b92b1d3SBarry Smithderived classes too.
5589b92b1d3SBarry Smith
5599b92b1d3SBarry SmithWhen writing a Window's console application you do not need to do anything, the `stdout`
5609b92b1d3SBarry Smithand `stderr` is automatically output to the console window.
5619b92b1d3SBarry Smith:::
5629b92b1d3SBarry Smith
5639b92b1d3SBarry SmithTo change where all PETSc `stdout` and `stderr` go, (you can also reassign
5649b92b1d3SBarry Smith`PetscVFPrintf()` to handle `stdout` and `stderr` any way you like) write the
5659b92b1d3SBarry Smithfollowing function:
5669b92b1d3SBarry Smith
5679b92b1d3SBarry Smith```
5689b92b1d3SBarry SmithPetscErrorCode mypetscvfprintf(FILE *fd, const char format[], va_list Argp)
5699b92b1d3SBarry Smith{
5709b92b1d3SBarry Smith  PetscFunctionBegin;
5719b92b1d3SBarry Smith  if (fd != stdout && fd != stderr) { /* handle regular files */
5729b92b1d3SBarry Smith    PetscCall(PetscVFPrintfDefault(fd, format, Argp));
5739b92b1d3SBarry Smith  } else {
5749b92b1d3SBarry Smith    char buff[1024]; /* Make sure to assign a large enough buffer */
5759b92b1d3SBarry Smith    int  length;
5769b92b1d3SBarry Smith
5779b92b1d3SBarry Smith    PetscCall(PetscVSNPrintf(buff, 1024, format, &length, Argp));
5789b92b1d3SBarry Smith
5799b92b1d3SBarry Smith    /* now send buff to whatever stream or whatever you want */
5809b92b1d3SBarry Smith  }
5819b92b1d3SBarry Smith  PetscFunctionReturn(PETSC_SUCCESS);
5829b92b1d3SBarry Smith}
5839b92b1d3SBarry Smith```
5849b92b1d3SBarry Smith
5859b92b1d3SBarry SmithThen assign `PetscVFPrintf = mypetscprintf` before `PetscInitialize()` in your main
5869b92b1d3SBarry Smithprogram.
5879b92b1d3SBarry Smith
5889b92b1d3SBarry Smith### I want to use Hypre boomerAMG without GMRES but when I run `-pc_type hypre -pc_hypre_type boomeramg -ksp_type preonly` I don't get a very accurate answer!
5899b92b1d3SBarry Smith
5909b92b1d3SBarry SmithYou should run with `-ksp_type richardson` to have PETSc run several V or W
5919b92b1d3SBarry Smithcycles. `-ksp_type preonly` causes boomerAMG to use only one V/W cycle. You can control
5929b92b1d3SBarry Smithhow many cycles are used in a single application of the boomerAMG preconditioner with
5939b92b1d3SBarry Smith`-pc_hypre_boomeramg_max_iter <it>` (the default is 1). You can also control the
5949b92b1d3SBarry Smithtolerance boomerAMG uses to decide if to stop before `max_iter` with
5959b92b1d3SBarry Smith`-pc_hypre_boomeramg_tol <tol>` (the default is 1.e-7). Run with `-ksp_view` to see
5969b92b1d3SBarry Smithall the hypre options used and `-help | grep boomeramg` to see all the command line
5979b92b1d3SBarry Smithoptions.
5989b92b1d3SBarry Smith
5999b92b1d3SBarry Smith### How do I use PETSc for Domain Decomposition?
6009b92b1d3SBarry Smith
6019b92b1d3SBarry SmithPETSc includes Additive Schwarz methods in the suite of preconditioners under the umbrella
6029b92b1d3SBarry Smithof `PCASM`. These may be activated with the runtime option `-pc_type asm`. Various
6039b92b1d3SBarry Smithother options may be set, including the degree of overlap `-pc_asm_overlap <number>` the
6049b92b1d3SBarry Smithtype of restriction/extension `-pc_asm_type [basic,restrict,interpolate,none]` sets ASM
6059b92b1d3SBarry Smithtype and several others. You may see the available ASM options by using `-pc_type asm
6069b92b1d3SBarry Smith-help`. See the procedural interfaces in the manual pages, for example `PCASMType()`
6079b92b1d3SBarry Smithand check the index of the users manual for `PCASMCreateSubdomains()`.
6089b92b1d3SBarry Smith
6099b92b1d3SBarry SmithPETSc also contains a domain decomposition inspired wirebasket or face based two level
6109b92b1d3SBarry Smithmethod where the coarse mesh to fine mesh interpolation is defined by solving specific
6119b92b1d3SBarry Smithlocal subdomain problems. It currently only works for 3D scalar problems on structured
6129b92b1d3SBarry Smithgrids created with PETSc `DMDA`. See the manual page for `PCEXOTIC` and
6139b92b1d3SBarry Smith`src/ksp/ksp/tutorials/ex45.c` for an example.
6149b92b1d3SBarry Smith
6159b92b1d3SBarry SmithPETSc also contains a balancing Neumann-Neumann type preconditioner, see the manual page
6169b92b1d3SBarry Smithfor `PCBDDC`. This requires matrices be constructed with `MatCreateIS()` via the finite
6179b92b1d3SBarry Smithelement method. See `src/ksp/ksp/tests/ex59.c` for an example.
6189b92b1d3SBarry Smith
6199b92b1d3SBarry Smith### You have `MATAIJ` and `MATBAIJ` matrix formats, and `MATSBAIJ` for symmetric storage, how come no `MATSAIJ`?
6209b92b1d3SBarry Smith
6219b92b1d3SBarry SmithJust for historical reasons; the `MATSBAIJ` format with blocksize one is just as efficient as
6229b92b1d3SBarry Smitha `MATSAIJ` would be.
6239b92b1d3SBarry Smith
6249b92b1d3SBarry Smith### Can I Create BAIJ matrices with different size blocks for different block rows?
6259b92b1d3SBarry Smith
6269b92b1d3SBarry SmithNo. The `MATBAIJ` format only supports a single fixed block size on the entire
6279b92b1d3SBarry Smithmatrix. But the `MATAIJ` format automatically searches for matching rows and thus still
6289b92b1d3SBarry Smithtakes advantage of the natural blocks in your matrix to obtain good performance.
6299b92b1d3SBarry Smith
6309b92b1d3SBarry Smith:::{note}
6319b92b1d3SBarry SmithIf you use `MATAIJ` you cannot use the `MatSetValuesBlocked()`.
6329b92b1d3SBarry Smith:::
6339b92b1d3SBarry Smith
6349b92b1d3SBarry Smith### How do I access the values of a remote parallel PETSc Vec?
6359b92b1d3SBarry Smith
6369b92b1d3SBarry Smith1. On each process create a local `Vec` large enough to hold all the values it wishes to
6379b92b1d3SBarry Smith   access.
6389b92b1d3SBarry Smith2. Create a `VecScatter` that scatters from the parallel `Vec` into the local `Vec`.
6399b92b1d3SBarry Smith3. Use `VecGetArray()` to access the values in the local `Vec`.
6409b92b1d3SBarry Smith
6419b92b1d3SBarry SmithFor example, assuming we have distributed a vector `vecGlobal` of size $N$ to
6429b92b1d3SBarry Smith$R$ ranks and each remote rank holds $N/R = m$ values (similarly assume that
6439b92b1d3SBarry Smith$N$ is cleanly divisible by $R$). We want each rank $r$ to gather the
6449b92b1d3SBarry Smithfirst $n$ (also assume $n \leq m$) values from its immediately superior neighbor
6459b92b1d3SBarry Smith$r+1$ (final rank will retrieve from rank 0).
6469b92b1d3SBarry Smith
6479b92b1d3SBarry Smith```
6489b92b1d3SBarry SmithVec            vecLocal;
6499b92b1d3SBarry SmithIS             isLocal, isGlobal;
6509b92b1d3SBarry SmithVecScatter     ctx;
6519b92b1d3SBarry SmithPetscScalar    *arr;
6529b92b1d3SBarry SmithPetscInt       N, firstGlobalIndex;
6539b92b1d3SBarry SmithMPI_Comm       comm;
6549b92b1d3SBarry SmithPetscMPIInt    r, R;
6559b92b1d3SBarry Smith
6569b92b1d3SBarry Smith/* Create sequential local vector, big enough to hold local portion */
6579b92b1d3SBarry SmithPetscCall(VecCreateSeq(PETSC_COMM_SELF, n, &vecLocal));
6589b92b1d3SBarry Smith
6599b92b1d3SBarry Smith/* Create IS to describe where we want to scatter to */
6609b92b1d3SBarry SmithPetscCall(ISCreateStride(PETSC_COMM_SELF, n, 0, 1, &isLocal));
6619b92b1d3SBarry Smith
6629b92b1d3SBarry Smith/* Compute the global indices */
6639b92b1d3SBarry SmithPetscCall(VecGetSize(vecGlobal, &N));
6649b92b1d3SBarry SmithPetscCall(PetscObjectGetComm((PetscObject) vecGlobal, &comm));
6659b92b1d3SBarry SmithPetscCallMPI(MPI_Comm_rank(comm, &r));
6669b92b1d3SBarry SmithPetscCallMPI(MPI_Comm_size(comm, &R));
6679b92b1d3SBarry SmithfirstGlobalIndex = r == R-1 ? 0 : (N/R)*(r+1);
6689b92b1d3SBarry Smith
6699b92b1d3SBarry Smith/* Create IS that describes where we want to scatter from */
6709b92b1d3SBarry SmithPetscCall(ISCreateStride(comm, n, firstGlobalIndex, 1, &isGlobal));
6719b92b1d3SBarry Smith
6729b92b1d3SBarry Smith/* Create the VecScatter context */
6739b92b1d3SBarry SmithPetscCall(VecScatterCreate(vecGlobal, isGlobal, vecLocal, isLocal, &ctx));
6749b92b1d3SBarry Smith
6759b92b1d3SBarry Smith/* Gather the values */
6769b92b1d3SBarry SmithPetscCall(VecScatterBegin(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD));
6779b92b1d3SBarry SmithPetscCall(VecScatterEnd(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD));
6789b92b1d3SBarry Smith
6799b92b1d3SBarry Smith/* Retrieve and do work */
6809b92b1d3SBarry SmithPetscCall(VecGetArray(vecLocal, &arr));
6819b92b1d3SBarry Smith/* Work */
6829b92b1d3SBarry SmithPetscCall(VecRestoreArray(vecLocal, &arr));
6839b92b1d3SBarry Smith
6849b92b1d3SBarry Smith/* Don't forget to clean up */
6859b92b1d3SBarry SmithPetscCall(ISDestroy(&isLocal));
6869b92b1d3SBarry SmithPetscCall(ISDestroy(&isGlobal));
6879b92b1d3SBarry SmithPetscCall(VecScatterDestroy(&ctx));
6889b92b1d3SBarry SmithPetscCall(VecDestroy(&vecLocal));
6899b92b1d3SBarry Smith```
6909b92b1d3SBarry Smith
6919b92b1d3SBarry Smith(doc_faq_usage_alltoone)=
6929b92b1d3SBarry Smith
6939b92b1d3SBarry Smith### How do I collect to a single processor all the values from a parallel PETSc Vec?
6949b92b1d3SBarry Smith
6959b92b1d3SBarry Smith1. Create the `VecScatter` context that will do the communication:
6969b92b1d3SBarry Smith
6979b92b1d3SBarry Smith   ```
6989b92b1d3SBarry Smith   Vec        in_par,out_seq;
6999b92b1d3SBarry Smith   VecScatter ctx;
7009b92b1d3SBarry Smith
7019b92b1d3SBarry Smith   PetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq));
7029b92b1d3SBarry Smith   ```
7039b92b1d3SBarry Smith
7049b92b1d3SBarry Smith2. Initiate the communication (this may be repeated if you wish):
7059b92b1d3SBarry Smith
7069b92b1d3SBarry Smith   ```
7079b92b1d3SBarry Smith   PetscCall(VecScatterBegin(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD));
7089b92b1d3SBarry Smith   PetscCall(VecScatterEnd(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD));
7099b92b1d3SBarry Smith   /* May destroy context now if no additional scatters are needed, otherwise reuse it */
7109b92b1d3SBarry Smith   PetscCall(VecScatterDestroy(&ctx));
7119b92b1d3SBarry Smith   ```
7129b92b1d3SBarry Smith
7139b92b1d3SBarry SmithNote that this simply concatenates in the parallel ordering of the vector (computed by the
7149b92b1d3SBarry Smith`MPI_Comm_rank` of the owning process). If you are using a `Vec` from
7159b92b1d3SBarry Smith`DMCreateGlobalVector()` you likely want to first call `DMDAGlobalToNaturalBegin()`
7169b92b1d3SBarry Smithfollowed by `DMDAGlobalToNaturalEnd()` to scatter the original `Vec` into the natural
7179b92b1d3SBarry Smithordering in a new global `Vec` before calling `VecScatterBegin()`/`VecScatterEnd()`
7189b92b1d3SBarry Smithto scatter the natural `Vec` onto all processes.
7199b92b1d3SBarry Smith
7209b92b1d3SBarry Smith### How do I collect all the values from a parallel PETSc Vec on the 0th rank?
7219b92b1d3SBarry Smith
7229b92b1d3SBarry SmithSee FAQ entry on collecting to {ref}`an arbitrary processor <doc_faq_usage_alltoone>`, but
7239b92b1d3SBarry Smithreplace
7249b92b1d3SBarry Smith
7259b92b1d3SBarry Smith```
7269b92b1d3SBarry SmithPetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq));
7279b92b1d3SBarry Smith```
7289b92b1d3SBarry Smith
7299b92b1d3SBarry Smithwith
7309b92b1d3SBarry Smith
7319b92b1d3SBarry Smith```
7329b92b1d3SBarry SmithPetscCall(VecScatterCreateToZero(in_par, &ctx, &out_seq));
7339b92b1d3SBarry Smith```
7349b92b1d3SBarry Smith
7359b92b1d3SBarry Smith:::{note}
7369b92b1d3SBarry SmithThe same ordering considerations as discussed in the aforementioned entry also apply
7379b92b1d3SBarry Smithhere.
7389b92b1d3SBarry Smith:::
7399b92b1d3SBarry Smith
7409b92b1d3SBarry Smith### How can I read in or write out a sparse matrix in Matrix Market, Harwell-Boeing, Slapc or other ASCII format?
7419b92b1d3SBarry Smith
7429b92b1d3SBarry SmithIf you can read or write your matrix using Python or MATLAB/Octave, `PetscBinaryIO`
7439b92b1d3SBarry Smithmodules are provided at `$PETSC_DIR/lib/petsc/bin` for each language that can assist
7449b92b1d3SBarry Smithwith reading and writing. If you just want to convert `MatrixMarket`, you can use:
7459b92b1d3SBarry Smith
7469b92b1d3SBarry Smith```console
7479b92b1d3SBarry Smith$ python -m $PETSC_DIR/lib/petsc/bin/PetscBinaryIO convert matrix.mtx
7489b92b1d3SBarry Smith```
7499b92b1d3SBarry Smith
7509b92b1d3SBarry SmithTo produce `matrix.petsc`.
7519b92b1d3SBarry Smith
7529b92b1d3SBarry SmithYou can also call the script directly or import it from your Python code. There is also a
7539b92b1d3SBarry Smith`PETScBinaryIO.jl` Julia package.
7549b92b1d3SBarry Smith
7559b92b1d3SBarry SmithFor other formats, either adapt one of the above libraries or see the examples in
7569b92b1d3SBarry Smith`$PETSC_DIR/src/mat/tests`, specifically `ex72.c` or `ex78.c`. You will likely need
7579b92b1d3SBarry Smithto modify the code slightly to match your required ASCII format.
7589b92b1d3SBarry Smith
7599b92b1d3SBarry Smith:::{note}
7609b92b1d3SBarry SmithNever read or write in parallel an ASCII matrix file.
7619b92b1d3SBarry Smith
7629b92b1d3SBarry SmithInstead read in sequentially with a standalone code based on `ex72.c` or `ex78.c`
7639b92b1d3SBarry Smiththen save the matrix with the binary viewer `PetscViewerBinaryOpen()` and load the
7649b92b1d3SBarry Smithmatrix in parallel in your "real" PETSc program with `MatLoad()`.
7659b92b1d3SBarry Smith
7669b92b1d3SBarry SmithFor writing save with the binary viewer and then load with the sequential code to store
7679b92b1d3SBarry Smithit as ASCII.
7689b92b1d3SBarry Smith:::
7699b92b1d3SBarry Smith
7709b92b1d3SBarry Smith### Does TSSetFromOptions(), SNESSetFromOptions(), or KSPSetFromOptions() reset all the parameters I previously set or how come do they not seem to work?
7719b92b1d3SBarry Smith
7729b92b1d3SBarry SmithIf `XXSetFromOptions()` is used (with `-xxx_type aaaa`) to change the type of the
7739b92b1d3SBarry Smithobject then all parameters associated with the previous type are removed. Otherwise it
7749b92b1d3SBarry Smithdoes not reset parameters.
7759b92b1d3SBarry Smith
7769b92b1d3SBarry Smith`TS/SNES/KSPSetXXX()` commands that set properties for a particular type of object (such
7779b92b1d3SBarry Smithas `KSPGMRESSetRestart()`) ONLY work if the object is ALREADY of that type. For example,
7789b92b1d3SBarry Smithwith
7799b92b1d3SBarry Smith
7809b92b1d3SBarry Smith```
7819b92b1d3SBarry SmithKSP ksp;
7829b92b1d3SBarry Smith
7839b92b1d3SBarry SmithPetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
7849b92b1d3SBarry SmithPetscCall(KSPGMRESSetRestart(ksp, 10));
7859b92b1d3SBarry Smith```
7869b92b1d3SBarry Smith
7879b92b1d3SBarry Smiththe restart will be ignored since the type has not yet been set to `KSPGMRES`. To have
7889b92b1d3SBarry Smiththose values take effect you should do one of the following:
7899b92b1d3SBarry Smith
7909b92b1d3SBarry Smith- Allow setting the type from the command line, if it is not on the command line then the
7919b92b1d3SBarry Smith  default type is automatically set.
7929b92b1d3SBarry Smith
7939b92b1d3SBarry Smith```
7949b92b1d3SBarry Smith/* Create generic object */
7959b92b1d3SBarry SmithXXXCreate(..,&obj);
7969b92b1d3SBarry Smith/* Must set all settings here, or default */
7979b92b1d3SBarry SmithXXXSetFromOptions(obj);
7989b92b1d3SBarry Smith```
7999b92b1d3SBarry Smith
8009b92b1d3SBarry Smith- Hardwire the type in the code, but allow the user to override it via a subsequent
8019b92b1d3SBarry Smith  `XXXSetFromOptions()` call. This essentially allows the user to customize what the
8029b92b1d3SBarry Smith  "default" type to of the object.
8039b92b1d3SBarry Smith
8049b92b1d3SBarry Smith```
8059b92b1d3SBarry Smith/* Create generic object */
8069b92b1d3SBarry SmithXXXCreate(..,&obj);
8079b92b1d3SBarry Smith/* Set type directly */
8089b92b1d3SBarry SmithXXXSetYYYYY(obj,...);
8099b92b1d3SBarry Smith/* Can always change to different type */
8109b92b1d3SBarry SmithXXXSetFromOptions(obj);
8119b92b1d3SBarry Smith```
8129b92b1d3SBarry Smith
8139b92b1d3SBarry Smith### How do I compile and link my own PETSc application codes and can I use my own `makefile` or rules for compiling code, rather than PETSc's?
8149b92b1d3SBarry Smith
8159b92b1d3SBarry SmithSee the {ref}`section <sec_writing_application_codes>` of the users manual on writing
8169b92b1d3SBarry Smithapplication codes with PETSc.
8179b92b1d3SBarry Smith
8189b92b1d3SBarry Smith### Can I use CMake to build my own project that depends on PETSc?
8199b92b1d3SBarry Smith
8209b92b1d3SBarry SmithSee the {ref}`section <sec_writing_application_codes>` of the users manual on writing
8219b92b1d3SBarry Smithapplication codes with PETSc.
8229b92b1d3SBarry Smith
8239b92b1d3SBarry Smith### How can I put carriage returns in `PetscPrintf()` statements from Fortran?
8249b92b1d3SBarry Smith
8259b92b1d3SBarry SmithYou can use the same notation as in C, just put a `\n` in the string. Note that no other C
8269b92b1d3SBarry Smithformat instruction is supported.
8279b92b1d3SBarry Smith
8289b92b1d3SBarry SmithOr you can use the Fortran concatenation `//` and `char(10)`; for example `'some
8299b92b1d3SBarry Smithstring'//char(10)//'another string` on the next line.
8309b92b1d3SBarry Smith
8319b92b1d3SBarry Smith### How can I implement callbacks using C++ class methods?
8329b92b1d3SBarry Smith
8339b92b1d3SBarry SmithDeclare the class method static. Static methods do not have a `this` pointer, but the
8349b92b1d3SBarry Smith`void*` context parameter will usually be cast to a pointer to the class where it can
8359b92b1d3SBarry Smithserve the same function.
8369b92b1d3SBarry Smith
8379b92b1d3SBarry Smith:::{admonition} Remember
8389b92b1d3SBarry SmithAll PETSc callbacks return `PetscErrorCode`.
8399b92b1d3SBarry Smith:::
8409b92b1d3SBarry Smith
8419b92b1d3SBarry Smith### Everyone knows that when you code Newton's Method you should compute the function and its Jacobian at the same time. How can one do this in PETSc?
8429b92b1d3SBarry Smith
8439b92b1d3SBarry SmithThe update in Newton's method is computed as
8449b92b1d3SBarry Smith
8459b92b1d3SBarry Smith$$
8469b92b1d3SBarry Smithu^{n+1} = u^n - \lambda * \left[J(u^n)] * F(u^n) \right]^{\dagger}
8479b92b1d3SBarry Smith$$
8489b92b1d3SBarry Smith
8499b92b1d3SBarry SmithThe reason PETSc doesn't default to computing both the function and Jacobian at the same
8509b92b1d3SBarry Smithtime is:
8519b92b1d3SBarry Smith
8529b92b1d3SBarry Smith- In order to do the line search $F \left(u^n - \lambda * \text{step} \right)$ may
8539b92b1d3SBarry Smith  need to be computed for several $\lambda$. The Jacobian is not needed for each of
8549b92b1d3SBarry Smith  those and one does not know in advance which will be the final $\lambda$ until
8559b92b1d3SBarry Smith  after the function value is computed, so many extra Jacobians may be computed.
8569b92b1d3SBarry Smith- In the final step if $|| F(u^p)||$ satisfies the convergence criteria then a
8579b92b1d3SBarry Smith  Jacobian need not be computed.
8589b92b1d3SBarry Smith
8599b92b1d3SBarry SmithYou are free to have your `FormFunction()` compute as much of the Jacobian at that point
8609b92b1d3SBarry Smithas you like, keep the information in the user context (the final argument to
8619b92b1d3SBarry Smith`FormFunction()` and `FormJacobian()`) and then retrieve the information in your
8629b92b1d3SBarry Smith`FormJacobian()` function.
8639b92b1d3SBarry Smith
8649b92b1d3SBarry Smith### Computing the Jacobian or preconditioner is time consuming. Is there any way to compute it less often?
8659b92b1d3SBarry Smith
8669b92b1d3SBarry SmithPETSc has a variety of ways of lagging the computation of the Jacobian or the
8679b92b1d3SBarry Smithpreconditioner. They are documented in the manual page for `SNESComputeJacobian()`
8689b92b1d3SBarry Smithand in the {ref}`users manual <ch_snes>`:
8699b92b1d3SBarry Smith
8709b92b1d3SBarry Smith-s
8719b92b1d3SBarry Smith
8729b92b1d3SBarry Smithnes_lag_jacobian
8739b92b1d3SBarry Smith
8749b92b1d3SBarry Smith(`SNESSetLagJacobian()`) How often Jacobian is rebuilt (use -1 to
8759b92b1d3SBarry Smithnever rebuild, use -2 to rebuild the next time requested and then
8769b92b1d3SBarry Smithnever again).
8779b92b1d3SBarry Smith
8789b92b1d3SBarry Smith-s
8799b92b1d3SBarry Smith
8809b92b1d3SBarry Smithnes_lag_jacobian_persists
8819b92b1d3SBarry Smith
8829b92b1d3SBarry Smith(`SNESSetLagJacobianPersists()`) Forces lagging of Jacobian
8839b92b1d3SBarry Smiththrough multiple `SNES` solves , same as passing -2 to
8849b92b1d3SBarry Smith`-snes_lag_jacobian`. By default, each new `SNES` solve
8859b92b1d3SBarry Smithnormally triggers a recomputation of the Jacobian.
8869b92b1d3SBarry Smith
8879b92b1d3SBarry Smith-s
8889b92b1d3SBarry Smith
8899b92b1d3SBarry Smithnes_lag_preconditioner
8909b92b1d3SBarry Smith
8919b92b1d3SBarry Smith(`SNESSetLagPreconditioner()`) how often the preconditioner is
8929b92b1d3SBarry Smithrebuilt. Note: if you are lagging the Jacobian the system will
8939b92b1d3SBarry Smithknow that the matrix has not changed and will not recompute the
8949b92b1d3SBarry Smith(same) preconditioner.
8959b92b1d3SBarry Smith
8969b92b1d3SBarry Smith-s
8979b92b1d3SBarry Smith
8989b92b1d3SBarry Smithnes_lag_preconditioner_persists
8999b92b1d3SBarry Smith
9009b92b1d3SBarry Smith(`SNESSetLagPreconditionerPersists()`) Preconditioner
9019b92b1d3SBarry Smithlags through multiple `SNES` solves
9029b92b1d3SBarry Smith
9039b92b1d3SBarry Smith:::{note}
9049b92b1d3SBarry SmithThese are often (but does not need to be) used in combination with
9059b92b1d3SBarry Smith`-snes_mf_operator` which applies the fresh Jacobian matrix-free for every
9069b92b1d3SBarry Smithmatrix-vector product. Otherwise the out-of-date matrix vector product, computed with
9079b92b1d3SBarry Smiththe lagged Jacobian will be used.
9089b92b1d3SBarry Smith:::
9099b92b1d3SBarry Smith
9109b92b1d3SBarry SmithBy using `KSPMonitorSet()` and/or `SNESMonitorSet()` one can provide code that monitors the
9119b92b1d3SBarry Smithconvergence rate and automatically triggers an update of the Jacobian or preconditioner
9129b92b1d3SBarry Smithbased on decreasing convergence of the iterative method. For example if the number of `SNES`
9139b92b1d3SBarry Smithiterations doubles one might trigger a new computation of the Jacobian. Experimentation is
9149b92b1d3SBarry Smiththe only general purpose way to determine which approach is best for your problem.
9159b92b1d3SBarry Smith
9169b92b1d3SBarry Smith:::{important}
9179b92b1d3SBarry SmithIt is also vital to experiment on your true problem at the scale you will be solving
9189b92b1d3SBarry Smiththe problem since the performance benefits depend on the exact problem and the problem
9199b92b1d3SBarry Smithsize!
9209b92b1d3SBarry Smith:::
9219b92b1d3SBarry Smith
9229b92b1d3SBarry Smith### How can I use Newton's Method Jacobian free? Can I difference a different function than provided with `SNESSetFunction()`?
9239b92b1d3SBarry Smith
9249b92b1d3SBarry SmithThe simplest way is with the option `-snes_mf`, this will use finite differencing of the
9259b92b1d3SBarry Smithfunction provided to `SNESComputeFunction()` to approximate the action of Jacobian.
9269b92b1d3SBarry Smith
9279b92b1d3SBarry Smith:::{important}
9289b92b1d3SBarry SmithSince no matrix-representation of the Jacobian is provided the `-pc_type` used with
9299b92b1d3SBarry Smiththis option must be `-pc_type none`. You may provide a custom preconditioner with
9309b92b1d3SBarry Smith`SNESGetKSP()`, `KSPGetPC()`, and `PCSetType()` and use `PCSHELL`.
9319b92b1d3SBarry Smith:::
9329b92b1d3SBarry Smith
9339b92b1d3SBarry SmithThe option `-snes_mf_operator` will use Jacobian free to apply the Jacobian (in the
9349b92b1d3SBarry SmithKrylov solvers) but will use whatever matrix you provided with `SNESSetJacobian()`
9359b92b1d3SBarry Smith(assuming you set one) to compute the preconditioner.
9369b92b1d3SBarry Smith
9379b92b1d3SBarry SmithTo write the code (rather than use the options above) use `MatCreateSNESMF()` and pass
9389b92b1d3SBarry Smiththe resulting matrix object to `SNESSetJacobian()`.
9399b92b1d3SBarry Smith
9409b92b1d3SBarry SmithFor purely matrix-free (like `-snes_mf`) pass the matrix object for both matrix
9419b92b1d3SBarry Smitharguments and pass the function `MatMFFDComputeJacobian()`.
9429b92b1d3SBarry Smith
9439b92b1d3SBarry SmithTo provide your own approximate Jacobian matrix to compute the preconditioner (like
9449b92b1d3SBarry Smith`-snes_mf_operator`), pass this other matrix as the second matrix argument to
9459b92b1d3SBarry Smith`SNESSetJacobian()`. Make sure your provided `computejacobian()` function calls
9469b92b1d3SBarry Smith`MatAssemblyBegin()` and `MatAssemblyEnd()` separately on **BOTH** matrix arguments
9479b92b1d3SBarry Smithto this function. See `src/snes/tests/ex7.c`.
9489b92b1d3SBarry Smith
9499b92b1d3SBarry SmithTo difference a different function than that passed to `SNESSetJacobian()` to compute the
9509b92b1d3SBarry Smithmatrix-free Jacobian multiply call `MatMFFDSetFunction()` to set that other function. See
9519b92b1d3SBarry Smith`src/snes/tests/ex7.c.h`.
9529b92b1d3SBarry Smith
9539b92b1d3SBarry Smith(doc_faq_usage_condnum)=
9549b92b1d3SBarry Smith
9559b92b1d3SBarry Smith### How can I determine the condition number of a matrix?
9569b92b1d3SBarry Smith
9579b92b1d3SBarry SmithFor small matrices, the condition number can be reliably computed using
9589b92b1d3SBarry Smith
9599b92b1d3SBarry Smith```text
9609b92b1d3SBarry Smith-pc_type svd -pc_svd_monitor
9619b92b1d3SBarry Smith```
9629b92b1d3SBarry Smith
9639b92b1d3SBarry SmithFor larger matrices, you can run with
9649b92b1d3SBarry Smith
9659b92b1d3SBarry Smith```text
9669b92b1d3SBarry Smith-pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000
9679b92b1d3SBarry Smith```
9689b92b1d3SBarry Smith
9699b92b1d3SBarry Smithto get approximations to the condition number of the operator. This will generally be
9709b92b1d3SBarry Smithaccurate for the largest singular values, but may overestimate the smallest singular value
9719b92b1d3SBarry Smithunless the method has converged. Make sure to avoid restarts. To estimate the condition
9729b92b1d3SBarry Smithnumber of the preconditioned operator, use `-pc_type somepc` in the last command.
9739b92b1d3SBarry Smith
9749b92b1d3SBarry SmithYou can use [SLEPc](https://slepc.upv.es) for highly scalable, efficient, and quality eigenvalue computations.
9759b92b1d3SBarry Smith
9769b92b1d3SBarry Smith### How can I compute the inverse of a matrix in PETSc?
9779b92b1d3SBarry Smith
9789b92b1d3SBarry Smith:::{admonition} Are you sure?
9799b92b1d3SBarry Smith:class: yellow
9809b92b1d3SBarry Smith
9819b92b1d3SBarry SmithIt is very expensive to compute the inverse of a matrix and very rarely needed in
9829b92b1d3SBarry Smithpractice. We highly recommend avoiding algorithms that need it.
9839b92b1d3SBarry Smith:::
9849b92b1d3SBarry Smith
9859b92b1d3SBarry SmithThe inverse of a matrix (dense or sparse) is essentially always dense, so begin by
9869b92b1d3SBarry Smithcreating a dense matrix B and fill it with the identity matrix (ones along the diagonal),
9879b92b1d3SBarry Smithalso create a dense matrix X of the same size that will hold the solution. Then factor the
9889b92b1d3SBarry Smithmatrix you wish to invert with `MatLUFactor()` or `MatCholeskyFactor()`, call the
9899b92b1d3SBarry Smithresult A. Then call `MatMatSolve(A,B,X)` to compute the inverse into X. See also section
9909b92b1d3SBarry Smithon {any}`Schur's complement <how_can_i_compute_the_schur_complement>`.
9919b92b1d3SBarry Smith
9929b92b1d3SBarry Smith(how_can_i_compute_the_schur_complement)=
9939b92b1d3SBarry Smith
9949b92b1d3SBarry Smith### How can I compute the Schur complement in PETSc?
9959b92b1d3SBarry Smith
9969b92b1d3SBarry Smith:::{admonition} Are you sure?
9979b92b1d3SBarry Smith:class: yellow
9989b92b1d3SBarry Smith
9999b92b1d3SBarry SmithIt is very expensive to compute the Schur complement of a matrix and very rarely needed
10009b92b1d3SBarry Smithin practice. We highly recommend avoiding algorithms that need it.
10019b92b1d3SBarry Smith:::
10029b92b1d3SBarry Smith
10039b92b1d3SBarry SmithThe Schur complement of the matrix $M \in \mathbb{R}^{\left(p+q \right) \times
10049b92b1d3SBarry Smith\left(p + q \right)}$
10059b92b1d3SBarry Smith
10069b92b1d3SBarry Smith$$
10079b92b1d3SBarry SmithM = \begin{bmatrix}
10089b92b1d3SBarry SmithA & B \\
10099b92b1d3SBarry SmithC & D
10109b92b1d3SBarry Smith\end{bmatrix}
10119b92b1d3SBarry Smith$$
10129b92b1d3SBarry Smith
10139b92b1d3SBarry Smithwhere
10149b92b1d3SBarry Smith
10159b92b1d3SBarry Smith$$
10169b92b1d3SBarry SmithA \in \mathbb{R}^{p \times p}, \quad B \in \mathbb{R}^{p \times q}, \quad C \in \mathbb{R}^{q \times p}, \quad D \in \mathbb{R}^{q \times q}
10179b92b1d3SBarry Smith$$
10189b92b1d3SBarry Smith
10199b92b1d3SBarry Smithis given by
10209b92b1d3SBarry Smith
10219b92b1d3SBarry Smith$$
10229b92b1d3SBarry SmithS_D := A - BD^{-1}C \\
10239b92b1d3SBarry SmithS_A := D - CA^{-1}B
10249b92b1d3SBarry Smith$$
10259b92b1d3SBarry Smith
10269b92b1d3SBarry SmithLike the inverse, the Schur complement of a matrix (dense or sparse) is essentially always
10279b92b1d3SBarry Smithdense, so assuming you wish to calculate $S_A = D - C \underbrace{
10289b92b1d3SBarry Smith\overbrace{(A^{-1})}^{U} B}_{V}$ begin by:
10299b92b1d3SBarry Smith
10309b92b1d3SBarry Smith1. Forming a dense matrix $B$
10319b92b1d3SBarry Smith2. Also create another dense matrix $V$ of the same size.
10329b92b1d3SBarry Smith3. Then either factor the matrix $A$ directly with `MatLUFactor()` or
10339b92b1d3SBarry Smith   `MatCholeskyFactor()`, or use `MatGetFactor()` followed by
10349b92b1d3SBarry Smith   `MatLUFactorSymbolic()` followed by `MatLUFactorNumeric()` if you wish to use and
10359b92b1d3SBarry Smith   external solver package like SuperLU_Dist. Call the result $U$.
10369b92b1d3SBarry Smith4. Then call `MatMatSolve(U,B,V)`.
10379b92b1d3SBarry Smith5. Then call `MatMatMult(C,V,MAT_INITIAL_MATRIX,1.0,&S)`.
10389b92b1d3SBarry Smith6. Now call `MatAXPY(S,-1.0,D,MAT_SUBSET_NONZERO)`.
10399b92b1d3SBarry Smith7. Followed by `MatScale(S,-1.0)`.
10409b92b1d3SBarry Smith
10419b92b1d3SBarry SmithFor computing Schur complements like this it does not make sense to use the `KSP`
10429b92b1d3SBarry Smithiterative solvers since for solving many moderate size problems using a direct
10439b92b1d3SBarry Smithfactorization is much faster than iterative solvers. As you can see, this requires a great
10449b92b1d3SBarry Smithdeal of work space and computation so is best avoided.
10459b92b1d3SBarry Smith
10469b92b1d3SBarry SmithHowever, it is not necessary to assemble the Schur complement $S$ in order to solve
10479b92b1d3SBarry Smithsystems with it. Use `MatCreateSchurComplement(A,A_pre,B,C,D,&S)` to create a
10489b92b1d3SBarry Smithmatrix that applies the action of $S$ (using `A_pre` to solve with `A`), but
10499b92b1d3SBarry Smithdoes not assemble.
10509b92b1d3SBarry Smith
10519b92b1d3SBarry SmithAlternatively, if you already have a block matrix `M = [A, B; C, D]` (in some
10529b92b1d3SBarry Smithordering), then you can create index sets (`IS`) `isa` and `isb` to address each
10539b92b1d3SBarry Smithblock, then use `MatGetSchurComplement()` to create the Schur complement and/or an
10549b92b1d3SBarry Smithapproximation suitable for preconditioning.
10559b92b1d3SBarry Smith
10569b92b1d3SBarry SmithSince $S$ is generally dense, standard preconditioning methods cannot typically be
10579b92b1d3SBarry Smithapplied directly to Schur complements. There are many approaches to preconditioning Schur
10589b92b1d3SBarry Smithcomplements including using the `SIMPLE` approximation
10599b92b1d3SBarry Smith
10609b92b1d3SBarry Smith$$
10619b92b1d3SBarry SmithD - C \text{diag}(A)^{-1} B
10629b92b1d3SBarry Smith$$
10639b92b1d3SBarry Smith
10649b92b1d3SBarry Smithto create a sparse matrix that approximates the Schur complement (this is returned by
10659b92b1d3SBarry Smithdefault for the optional "preconditioning" matrix in `MatGetSchurComplement()`).
10669b92b1d3SBarry Smith
10679b92b1d3SBarry SmithAn alternative is to interpret the matrices as differential operators and apply
10689b92b1d3SBarry Smithapproximate commutator arguments to find a spectrally equivalent operation that can be
10699b92b1d3SBarry Smithapplied efficiently (see the "PCD" preconditioners {cite}`elman_silvester_wathen_2014`). A
10709b92b1d3SBarry Smithvariant of this is the least squares commutator, which is closely related to the
10719b92b1d3SBarry SmithMoore-Penrose pseudoinverse, and is available in `PCLSC` which operates on matrices of
10729b92b1d3SBarry Smithtype `MATSCHURCOMPLEMENT`.
10739b92b1d3SBarry Smith
10749b92b1d3SBarry Smith### Do you have examples of doing unstructured grid Finite Element Method (FEM) with PETSc?
10759b92b1d3SBarry Smith
10769b92b1d3SBarry SmithThere are at least three ways to write finite element codes using PETSc:
10779b92b1d3SBarry Smith
10789b92b1d3SBarry Smith1. Use `DMPLEX`, which is a high level approach to manage your mesh and
10799b92b1d3SBarry Smith   discretization. See the {ref}`tutorials sections <tut_stokes>` for further information,
10809b92b1d3SBarry Smith   or see `src/snes/tutorial/ex62.c`.
10819b92b1d3SBarry Smith2. Use packages such as [deal.ii](https://www.dealii.org), [libMesh](https://libmesh.github.io), or
10829b92b1d3SBarry Smith   [Firedrake](https://www.firedrakeproject.org), which use PETSc for the solvers.
10839b92b1d3SBarry Smith3. Manage the grid data structure yourself and use PETSc `PetscSF`, `IS`, and `VecScatter` to
10849b92b1d3SBarry Smith   communicate the required ghost point communication. See
10859b92b1d3SBarry Smith   `src/snes/tutorials/ex10d/ex10.c`.
10869b92b1d3SBarry Smith
10879b92b1d3SBarry Smith### DMDA decomposes the domain differently than the MPI_Cart_create() command. How can one use them together?
10889b92b1d3SBarry Smith
10899b92b1d3SBarry SmithThe `MPI_Cart_create()` first divides the mesh along the z direction, then the y, then
10909b92b1d3SBarry Smiththe x. `DMDA` divides along the x, then y, then z. Thus, for example, rank 1 of the
10919b92b1d3SBarry Smithprocesses will be in a different part of the mesh for the two schemes. To resolve this you
10929b92b1d3SBarry Smithcan create a new MPI communicator that you pass to `DMDACreate()` that renumbers the
10939b92b1d3SBarry Smithprocess ranks so that each physical process shares the same part of the mesh with both the
10949b92b1d3SBarry Smith`DMDA` and the `MPI_Cart_create()`. The code to determine the new numbering was
10959b92b1d3SBarry Smithprovided by Rolf Kuiper:
10969b92b1d3SBarry Smith
10979b92b1d3SBarry Smith```
10989b92b1d3SBarry Smith// the numbers of processors per direction are (int) x_procs, y_procs, z_procs respectively
10999b92b1d3SBarry Smith// (no parallelization in direction 'dir' means dir_procs = 1)
11009b92b1d3SBarry Smith
11019b92b1d3SBarry SmithMPI_Comm    NewComm;
11029b92b1d3SBarry Smithint         x, y, z;
11039b92b1d3SBarry SmithPetscMPIInt MPI_Rank, NewRank;
11049b92b1d3SBarry Smith
11059b92b1d3SBarry Smith// get rank from MPI ordering:
11069b92b1d3SBarry SmithPetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &MPI_Rank));
11079b92b1d3SBarry Smith
11089b92b1d3SBarry Smith// calculate coordinates of cpus in MPI ordering:
11099b92b1d3SBarry Smithx = MPI_rank / (z_procs*y_procs);
11109b92b1d3SBarry Smithy = (MPI_rank % (z_procs*y_procs)) / z_procs;
11119b92b1d3SBarry Smithz = (MPI_rank % (z_procs*y_procs)) % z_procs;
11129b92b1d3SBarry Smith
11139b92b1d3SBarry Smith// set new rank according to PETSc ordering:
11149b92b1d3SBarry SmithNewRank = z*y_procs*x_procs + y*x_procs + x;
11159b92b1d3SBarry Smith
11169b92b1d3SBarry Smith// create communicator with new ranks according to PETSc ordering
11179b92b1d3SBarry SmithPetscCallMPI(MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &NewComm));
11189b92b1d3SBarry Smith
11199b92b1d3SBarry Smith// override the default communicator (was MPI_COMM_WORLD as default)
11209b92b1d3SBarry SmithPETSC_COMM_WORLD = NewComm;
11219b92b1d3SBarry Smith```
11229b92b1d3SBarry Smith
11239b92b1d3SBarry Smith### When solving a system with Dirichlet boundary conditions I can use MatZeroRows() to eliminate the Dirichlet rows but this results in a non-Symmetric system. How can I apply Dirichlet boundary conditions but keep the matrix symmetric?
11249b92b1d3SBarry Smith
11259b92b1d3SBarry Smith- For nonsymmetric systems put the appropriate boundary solutions in the x vector and use
11269b92b1d3SBarry Smith  `MatZeroRows()` followed by `KSPSetOperators()`.
11279b92b1d3SBarry Smith- For symmetric problems use `MatZeroRowsColumns()`.
11289b92b1d3SBarry Smith- If you have many Dirichlet locations you can use `MatZeroRows()` (**not**
11299b92b1d3SBarry Smith  `MatZeroRowsColumns()`) and `-ksp_type preonly -pc_type redistribute` (see
11309b92b1d3SBarry Smith  `PCREDISTRIBUTE`) and PETSc will repartition the parallel matrix for load
11319b92b1d3SBarry Smith  balancing. In this case the new matrix solved remains symmetric even though
11329b92b1d3SBarry Smith  `MatZeroRows()` is used.
11339b92b1d3SBarry Smith
11349b92b1d3SBarry SmithAn alternative approach is, when assembling the matrix (generating values and passing
11359b92b1d3SBarry Smiththem to the matrix), never to include locations for the Dirichlet grid points in the
11369b92b1d3SBarry Smithvector and matrix, instead taking them into account as you put the other values into the
11379b92b1d3SBarry Smithload.
11389b92b1d3SBarry Smith
11399b92b1d3SBarry Smith### How can I get PETSc vectors and matrices to MATLAB or vice versa?
11409b92b1d3SBarry Smith
11419b92b1d3SBarry SmithThere are numerous ways to work with PETSc and MATLAB. All but the first approach
11429b92b1d3SBarry Smithrequire PETSc to be configured with --with-matlab.
11439b92b1d3SBarry Smith
11449b92b1d3SBarry Smith1. To save PETSc `Mat` and `Vec` to files that can be read from MATLAB use
11459b92b1d3SBarry Smith   `PetscViewerBinaryOpen()` viewer and `VecView()` or `MatView()` to save objects
11469b92b1d3SBarry Smith   for MATLAB and `VecLoad()` and `MatLoad()` to get the objects that MATLAB has
11479b92b1d3SBarry Smith   saved. See `share/petsc/matlab/PetscBinaryRead.m` and
11489b92b1d3SBarry Smith   `share/petsc/matlab/PetscBinaryWrite.m` for loading and saving the objects in MATLAB.
11499b92b1d3SBarry Smith2. Using the [MATLAB Engine](https://www.mathworks.com/help/matlab/calling-matlab-engine-from-c-programs-1.html),
11509b92b1d3SBarry Smith   allows PETSc to automatically call MATLAB to perform some specific computations. This
11519b92b1d3SBarry Smith   does not allow MATLAB to be used interactively by the user. See the
11529b92b1d3SBarry Smith   `PetscMatlabEngine`.
11539b92b1d3SBarry Smith3. You can open a socket connection between MATLAB and PETSc to allow sending objects back
11549b92b1d3SBarry Smith   and forth between an interactive MATLAB session and a running PETSc program. See
11559b92b1d3SBarry Smith   `PetscViewerSocketOpen()` for access from the PETSc side and
11569b92b1d3SBarry Smith   `share/petsc/matlab/PetscReadBinary.m` for access from the MATLAB side.
11579b92b1d3SBarry Smith4. You can save PETSc `Vec` (**not** `Mat`) with the `PetscViewerMatlabOpen()`
11589b92b1d3SBarry Smith   viewer that saves `.mat` files can then be loaded into MATLAB using the `load()` command
11599b92b1d3SBarry Smith
11609b92b1d3SBarry Smith### How do I get started with Cython so that I can extend petsc4py?
11619b92b1d3SBarry Smith
11629b92b1d3SBarry Smith1. Learn how to [build a Cython module](http://docs.cython.org/src/quickstart/build.html).
11639b92b1d3SBarry Smith2. Go through the simple [example](https://stackoverflow.com/questions/3046305/simple-wrapping-of-c-code-with-cython). Note
11649b92b1d3SBarry Smith   also the next comment that shows how to create numpy arrays in the Cython and pass them
11659b92b1d3SBarry Smith   back.
11669b92b1d3SBarry Smith3. Check out [this page](http://docs.cython.org/src/tutorial/numpy.html) which tells
11679b92b1d3SBarry Smith   you how to get fast indexing.
11689b92b1d3SBarry Smith4. Have a look at the petsc4py [array source](http://code.google.com/p/petsc4py/source/browse/src/PETSc/arraynpy.pxi).
11699b92b1d3SBarry Smith
11709b92b1d3SBarry Smith### How do I compute a custom norm for KSP to use as a convergence test or for monitoring?
11719b92b1d3SBarry Smith
11729b92b1d3SBarry SmithYou need to call `KSPBuildResidual()` on your `KSP` object and then compute the
11739b92b1d3SBarry Smithappropriate norm on the resulting residual. Note that depending on the
11749b92b1d3SBarry Smith`KSPSetNormType()` of the method you may not return the same norm as provided by the
11759b92b1d3SBarry Smithmethod. See also `KSPSetPCSide()`.
11769b92b1d3SBarry Smith
11779b92b1d3SBarry Smith### If I have a sequential program can I use a PETSc parallel solver?
11789b92b1d3SBarry Smith
11799b92b1d3SBarry Smith:::{important}
11809b92b1d3SBarry SmithDo not expect to get great speedups! Much of the speedup gained by using parallel
11819b92b1d3SBarry Smithsolvers comes from building the underlying matrices and vectors in parallel to begin
11829b92b1d3SBarry Smithwith. You should see some reduction in the time for the linear solvers.
11839b92b1d3SBarry Smith:::
11849b92b1d3SBarry Smith
11859b92b1d3SBarry SmithYes, you must set up PETSc with MPI (even though you will not use MPI) with at least the
11869b92b1d3SBarry Smithfollowing options:
11879b92b1d3SBarry Smith
11889b92b1d3SBarry Smith```console
11899b92b1d3SBarry Smith$ ./configure --download-superlu_dist --download-parmetis --download-metis --with-openmp
11909b92b1d3SBarry Smith```
11919b92b1d3SBarry Smith
11929b92b1d3SBarry SmithYour compiler must support OpenMP. To have the linear solver run in parallel run your
11939b92b1d3SBarry Smithprogram with
11949b92b1d3SBarry Smith
11959b92b1d3SBarry Smith```console
11969b92b1d3SBarry Smith$ OMP_NUM_THREADS=n ./myprog -pc_type lu -pc_factor_mat_solver superlu_dist
11979b92b1d3SBarry Smith```
11989b92b1d3SBarry Smith
11999b92b1d3SBarry Smithwhere `n` is the number of threads and should be less than or equal to the number of cores
12009b92b1d3SBarry Smithavailable.
12019b92b1d3SBarry Smith
12029b92b1d3SBarry Smith:::{note}
12039b92b1d3SBarry SmithIf your code is MPI parallel you can also use these same options to have SuperLU_dist
12049b92b1d3SBarry Smithutilize multiple threads per MPI process for the direct solver. Make sure that the
12059b92b1d3SBarry Smith`$OMP_NUM_THREADS` you use per MPI process is less than or equal to the number of
12069b92b1d3SBarry Smithcores available for each MPI process. For example if your compute nodes have 6 cores
12079b92b1d3SBarry Smithand you use 2 MPI processes per node then set `$OMP_NUM_THREADS` to 2 or 3.
12089b92b1d3SBarry Smith:::
12099b92b1d3SBarry Smith
12109b92b1d3SBarry SmithAnother approach that allows using a PETSc parallel solver is to use `PCMPI`.
12119b92b1d3SBarry Smith
12129b92b1d3SBarry Smith### TS or SNES produces infeasible (out of domain) solutions or states. How can I prevent this?
12139b92b1d3SBarry Smith
12149b92b1d3SBarry SmithFor `TS` call `TSSetFunctionDomainError()`. For both `TS` and `SNES` call
12159b92b1d3SBarry Smith`SNESSetFunctionDomainError()` when the solver passes an infeasible (out of domain)
12169b92b1d3SBarry Smithsolution or state to your routines.
12179b92b1d3SBarry Smith
12189b92b1d3SBarry SmithIf it occurs for DAEs, it is important to insure the algebraic constraints are well
12199b92b1d3SBarry Smithsatisfied, which can prevent "breakdown" later. Thus, one can try using a tight tolerance
12209b92b1d3SBarry Smithfor `SNES`, using a direct linear solver (`PCType` of `PCLU`) when possible, and reducing the timestep (or
12219b92b1d3SBarry Smithtightening `TS` tolerances for adaptive time stepping).
12229b92b1d3SBarry Smith
12239b92b1d3SBarry Smith### Can PETSc work with Hermitian matrices?
12249b92b1d3SBarry Smith
12259b92b1d3SBarry SmithPETSc's support of Hermitian matrices is limited. Many operations and solvers work
12269b92b1d3SBarry Smithfor symmetric (`MATSBAIJ`) matrices and operations on transpose matrices but there is
12279b92b1d3SBarry Smithlittle direct support for Hermitian matrices and Hermitian transpose (complex conjugate
12289b92b1d3SBarry Smithtranspose) operations. There is `KSPSolveTranspose()` for solving the transpose of a
12299b92b1d3SBarry Smithlinear system but no `KSPSolveHermitian()`.
12309b92b1d3SBarry Smith
12319b92b1d3SBarry SmithFor creating known Hermitian matrices:
12329b92b1d3SBarry Smith
12339b92b1d3SBarry Smith- `MatCreateNormalHermitian()`
12349b92b1d3SBarry Smith- `MatCreateHermitianTranspose()`
12359b92b1d3SBarry Smith
12369b92b1d3SBarry SmithFor determining or setting Hermitian status on existing matrices:
12379b92b1d3SBarry Smith
12389b92b1d3SBarry Smith- `MatIsHermitian()`
12399b92b1d3SBarry Smith- `MatIsHermitianKnown()`
12409b92b1d3SBarry Smith- `MatIsStructurallySymmetric()`
12419b92b1d3SBarry Smith- `MatIsSymmetricKnown()`
12429b92b1d3SBarry Smith- `MatIsSymmetric()`
12439b92b1d3SBarry Smith- `MatSetOption()` (use with `MAT_SYMMETRIC` or `MAT_HERMITIAN` to assert to PETSc
12449b92b1d3SBarry Smith  that either is the case).
12459b92b1d3SBarry Smith
12469b92b1d3SBarry SmithFor performing matrix operations on known Hermitian matrices (note that regular `Mat`
12479b92b1d3SBarry Smithfunctions such as `MatMult()` will of course also work on Hermitian matrices):
12489b92b1d3SBarry Smith
12499b92b1d3SBarry Smith- `MatMultHermitianTranspose()`
12509b92b1d3SBarry Smith- `MatMultHermitianTransposeAdd()` (very limited support)
12519b92b1d3SBarry Smith
12529b92b1d3SBarry Smith### How can I assemble a bunch of similar matrices?
12539b92b1d3SBarry Smith
12549b92b1d3SBarry SmithYou can first add the values common to all the matrices, then use `MatStoreValues()` to
12559b92b1d3SBarry Smithstash the common values. Each iteration you call `MatRetrieveValues()`, then set the
12569b92b1d3SBarry Smithunique values in a matrix and assemble.
12579b92b1d3SBarry Smith
12589b92b1d3SBarry Smith### Can one resize or change the size of PETSc matrices or vectors?
12599b92b1d3SBarry Smith
12609b92b1d3SBarry SmithNo, once the vector or matrices sizes have been set and the matrices or vectors are fully
12619b92b1d3SBarry Smithusable one cannot change the size of the matrices or vectors or number of processors they
12629b92b1d3SBarry Smithlive on. One may create new vectors and copy, for example using `VecScatterCreate()`,
12639b92b1d3SBarry Smiththe old values from the previous vector.
12649b92b1d3SBarry Smith
12659b92b1d3SBarry Smith### How can one compute the nullspace of a sparse matrix with MUMPS?
12669b92b1d3SBarry Smith
12679b92b1d3SBarry SmithAssuming you have an existing matrix $A$ whose nullspace $V$ you want to find:
12689b92b1d3SBarry Smith
12699b92b1d3SBarry Smith```
12709b92b1d3SBarry SmithMat      F, work, V;
12719b92b1d3SBarry SmithPetscInt N, rows;
12729b92b1d3SBarry Smith
12739b92b1d3SBarry Smith/* Determine factorability */
12749b92b1d3SBarry SmithPetscCall(MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F));
12759b92b1d3SBarry SmithPetscCall(MatGetLocalSize(A, &rows, NULL));
12769b92b1d3SBarry Smith
12779b92b1d3SBarry Smith/* Set MUMPS options, see MUMPS documentation for more information */
12789b92b1d3SBarry SmithPetscCall(MatMumpsSetIcntl(F, 24, 1));
12799b92b1d3SBarry SmithPetscCall(MatMumpsSetIcntl(F, 25, 1));
12809b92b1d3SBarry Smith
12819b92b1d3SBarry Smith/* Perform factorization */
12829b92b1d3SBarry SmithPetscCall(MatLUFactorSymbolic(F, A, NULL, NULL, NULL));
12839b92b1d3SBarry SmithPetscCall(MatLUFactorNumeric(F, A, NULL));
12849b92b1d3SBarry Smith
12859b92b1d3SBarry Smith/* This is the dimension of the null space */
12869b92b1d3SBarry SmithPetscCall(MatMumpsGetInfog(F, 28, &N));
12879b92b1d3SBarry Smith
12889b92b1d3SBarry Smith/* This will contain the null space in the columns */
12899b92b1d3SBarry SmithPetscCall(MatCreateDense(comm, rows, N, PETSC_DETERMINE, PETSC_DETERMINE, NULL, &V));
12909b92b1d3SBarry Smith
12919b92b1d3SBarry SmithPetscCall(MatDuplicate(V, MAT_DO_NOT_COPY_VALUES, &work));
12929b92b1d3SBarry SmithPetscCall(MatMatSolve(F, work, V));
12939b92b1d3SBarry Smith```
12949b92b1d3SBarry Smith
12959b92b1d3SBarry Smith______________________________________________________________________
12969b92b1d3SBarry Smith
12979b92b1d3SBarry Smith## Execution
12989b92b1d3SBarry Smith
12999b92b1d3SBarry Smith### PETSc executables are SO big and take SO long to link
13009b92b1d3SBarry Smith
13019b92b1d3SBarry Smith:::{note}
13029b92b1d3SBarry SmithSee {ref}`shared libraries section <doc_faq_sharedlibs>` for more information.
13039b92b1d3SBarry Smith:::
13049b92b1d3SBarry Smith
13059b92b1d3SBarry SmithWe find this annoying as well. On most machines PETSc can use shared libraries, so
13069b92b1d3SBarry Smithexecutables should be much smaller, run `configure` with the additional option
13079b92b1d3SBarry Smith`--with-shared-libraries` (this is the default). Also, if you have room, compiling and
13089b92b1d3SBarry Smithlinking PETSc on your machine's `/tmp` disk or similar local disk, rather than over the
13099b92b1d3SBarry Smithnetwork will be much faster.
13109b92b1d3SBarry Smith
13119b92b1d3SBarry Smith### How does PETSc's `-help` option work? Why is it different for different programs?
13129b92b1d3SBarry Smith
13139b92b1d3SBarry SmithThere are 2 ways in which one interacts with the options database:
13149b92b1d3SBarry Smith
13159b92b1d3SBarry Smith- `PetscOptionsGetXXX()` where `XXX` is some type or data structure (for example
13169b92b1d3SBarry Smith  `PetscOptionsGetBool()` or `PetscOptionsGetScalarArray()`). This is a classic
13179b92b1d3SBarry Smith  "getter" function, which queries the command line options for a matching option name,
13189b92b1d3SBarry Smith  and returns the specified value.
13199b92b1d3SBarry Smith- `PetscOptionsXXX()` where `XXX` is some type or data structure (for example
13209b92b1d3SBarry Smith  `PetscOptionsBool()` or `PetscOptionsScalarArray()`). This is a so-called "provider"
13219b92b1d3SBarry Smith  function. It first records the option name in an internal list of previously encountered
13229b92b1d3SBarry Smith  options before calling `PetscOptionsGetXXX()` to query the status of said option.
13239b92b1d3SBarry Smith
13249b92b1d3SBarry SmithWhile users generally use the first option, developers will *always* use the second
13259b92b1d3SBarry Smith(provider) variant of functions. Thus, as the program runs, it will build up a list of
13269b92b1d3SBarry Smithencountered option names which are then printed **in the order of their appearance on the
13279b92b1d3SBarry Smithroot rank**. Different programs may take different paths through PETSc source code, so
13289b92b1d3SBarry Smiththey will encounter different providers, and therefore have different `-help` output.
13299b92b1d3SBarry Smith
13309b92b1d3SBarry Smith### PETSc has so many options for my program that it is hard to keep them straight
13319b92b1d3SBarry Smith
13329b92b1d3SBarry SmithRunning the PETSc program with the option `-help` will print out many of the options. To
13339b92b1d3SBarry Smithprint the options that have been specified within a program, employ `-options_left` to
13349b92b1d3SBarry Smithprint any options that the user specified but were not actually used by the program and
13359b92b1d3SBarry Smithall options used; this is helpful for detecting typo errors. The PETSc website has a search option,
13369b92b1d3SBarry Smithin the upper right hand corner, that quickly finds answers to most PETSc questions.
13379b92b1d3SBarry Smith
13389b92b1d3SBarry Smith### PETSc automatically handles many of the details in parallel PDE solvers. How can I understand what is really happening within my program?
13399b92b1d3SBarry Smith
13409b92b1d3SBarry SmithYou can use the option `-info` to get more details about the solution process. The
13419b92b1d3SBarry Smithoption `-log_view` provides details about the distribution of time spent in the various
13429b92b1d3SBarry Smithphases of the solution process. You can run with `-ts_view` or `-snes_view` or
13439b92b1d3SBarry Smith`-ksp_view` to see what solver options are being used. Run with `-ts_monitor`,
13449b92b1d3SBarry Smith`-snes_monitor`, or `-ksp_monitor` to watch convergence of the
13459b92b1d3SBarry Smithmethods. `-snes_converged_reason` and `-ksp_converged_reason` will indicate why and if
13469b92b1d3SBarry Smiththe solvers have converged.
13479b92b1d3SBarry Smith
13489b92b1d3SBarry Smith### Assembling large sparse matrices takes a long time. What can I do to make this process faster? Or MatSetValues() is so slow; what can I do to speed it up?
13499b92b1d3SBarry Smith
13509b92b1d3SBarry SmithYou probably need to do preallocation, as explained in {any}`sec_matsparse`.
13519b92b1d3SBarry SmithSee also the {ref}`performance chapter <ch_performance>` of the users manual.
13529b92b1d3SBarry Smith
13539b92b1d3SBarry SmithFor GPUs (and even CPUs) you can use `MatSetPreallocationCOO()` and `MatSetValuesCOO()` for more rapid assembly.
13549b92b1d3SBarry Smith
13559b92b1d3SBarry Smith### How can I generate performance summaries with PETSc?
13569b92b1d3SBarry Smith
13579b92b1d3SBarry SmithUse this option at runtime:
13589b92b1d3SBarry Smith
13599b92b1d3SBarry Smith-l
13609b92b1d3SBarry Smith
13619b92b1d3SBarry Smithog_view
13629b92b1d3SBarry Smith
13639b92b1d3SBarry SmithOutputs a comprehensive timing, memory consumption, and communications digest
13649b92b1d3SBarry Smithfor your program. See the {ref}`profiling chapter <ch_profiling>` of the users
13659b92b1d3SBarry Smithmanual for information on interpreting the summary data.
13669b92b1d3SBarry Smith
13679b92b1d3SBarry Smith### How do I know the amount of time spent on each level of the multigrid solver/preconditioner?
13689b92b1d3SBarry Smith
13699b92b1d3SBarry SmithRun with `-log_view` and `-pc_mg_log`
13709b92b1d3SBarry Smith
13719b92b1d3SBarry Smith### Where do I get the input matrices for the examples?
13729b92b1d3SBarry Smith
13739b92b1d3SBarry SmithSome examples use `$DATAFILESPATH/matrices/medium` and other files. These test matrices
13749b92b1d3SBarry Smithin PETSc binary format can be found in the [datafiles repository](https://gitlab.com/petsc/datafiles).
13759b92b1d3SBarry Smith
13769b92b1d3SBarry Smith### When I dump some matrices and vectors to binary, I seem to be generating some empty files with `.info` extensions. What's the deal with these?
13779b92b1d3SBarry Smith
13789b92b1d3SBarry SmithPETSc binary viewers put some additional information into `.info` files like matrix
13799b92b1d3SBarry Smithblock size. It is harmless but if you *really* don't like it you can use
13809b92b1d3SBarry Smith`-viewer_binary_skip_info` or `PetscViewerBinarySkipInfo()`.
13819b92b1d3SBarry Smith
13829b92b1d3SBarry Smith:::{note}
13839b92b1d3SBarry SmithYou need to call `PetscViewerBinarySkipInfo()` before
13849b92b1d3SBarry Smith`PetscViewerFileSetName()`. In other words you **cannot** use
13859b92b1d3SBarry Smith`PetscViewerBinaryOpen()` directly.
13869b92b1d3SBarry Smith:::
13879b92b1d3SBarry Smith
13889b92b1d3SBarry Smith### Why is my parallel solver slower than my sequential solver, or I have poor speed-up?
13899b92b1d3SBarry Smith
13909b92b1d3SBarry SmithThis can happen for many reasons:
13919b92b1d3SBarry Smith
13929b92b1d3SBarry Smith1. Make sure it is truly the time in `KSPSolve()` that is slower (by running the code
13939b92b1d3SBarry Smith   with `-log_view`). Often the slower time is in generating the matrix or some other
13949b92b1d3SBarry Smith   operation.
13959b92b1d3SBarry Smith2. There must be enough work for each process to outweigh the communication time. We
13969b92b1d3SBarry Smith   recommend an absolute minimum of about 10,000 unknowns per process, better is 20,000 or
13979b92b1d3SBarry Smith   more. This is even more true when using multiple GPUs, where you need to have millions
13989b92b1d3SBarry Smith   of unknowns per GPU.
13999b92b1d3SBarry Smith3. Make sure the {ref}`communication speed of the parallel computer
14009b92b1d3SBarry Smith   <doc_faq_general_parallel>` is good enough for parallel solvers.
14019b92b1d3SBarry Smith4. Check the number of solver iterates with the parallel solver against the sequential
14029b92b1d3SBarry Smith   solver. Most preconditioners require more iterations when used on more processes, this
14039b92b1d3SBarry Smith   is particularly true for block Jacobi (the default parallel preconditioner). You can
14049b92b1d3SBarry Smith   try `-pc_type asm` (`PCASM`) its iterations scale a bit better for more
14059b92b1d3SBarry Smith   processes. You may also consider multigrid preconditioners like `PCMG` or BoomerAMG
14069b92b1d3SBarry Smith   in `PCHYPRE`.
14079b92b1d3SBarry Smith
14089b92b1d3SBarry Smith(doc_faq_pipelined)=
14099b92b1d3SBarry Smith
14109b92b1d3SBarry Smith### What steps are necessary to make the pipelined solvers execute efficiently?
14119b92b1d3SBarry Smith
14129b92b1d3SBarry SmithPipelined solvers like `KSPPGMRES`, `KSPPIPECG`, `KSPPIPECR`, and `KSPGROPPCG` may
14139b92b1d3SBarry Smithrequire special MPI configuration to effectively overlap reductions with computation. In
14149b92b1d3SBarry Smithgeneral, this requires an MPI-3 implementation, an implementation that supports multiple
14159b92b1d3SBarry Smiththreads, and use of a "progress thread". Consult the documentation from your vendor or
14169b92b1d3SBarry Smithcomputing facility for more details.
14179b92b1d3SBarry Smith
14189b92b1d3SBarry Smith- Cray MPI MPT-5.6 MPI-3, by setting `$MPICH_MAX_THREAD_SAFETY` to "multiple"
14199b92b1d3SBarry Smith  for threads, plus either `$MPICH_ASYNC_PROGRESS` or
14209b92b1d3SBarry Smith  `$MPICH_NEMESIS_ASYNC_PROGRESS`. E.g.
14219b92b1d3SBarry Smith  ```console
14229b92b1d3SBarry Smith  $ export MPICH_MAX_THREAD_SAFETY=multiple
14239b92b1d3SBarry Smith  $ export MPICH_ASYNC_PROGRESS=1
14249b92b1d3SBarry Smith  $ export MPICH_NEMESIS_ASYNC_PROGRESS=1
14259b92b1d3SBarry Smith  ```
14269b92b1d3SBarry Smith
14279b92b1d3SBarry Smith- MPICH version 3.0 and later implements the MPI-3 standard and the default
14289b92b1d3SBarry Smith  configuration supports use of threads. Use of a progress thread is configured by
14299b92b1d3SBarry Smith  setting the environment variable `$MPICH_ASYNC_PROGRESS`. E.g.
14309b92b1d3SBarry Smith  ```console
14319b92b1d3SBarry Smith  $ export MPICH_ASYNC_PROGRESS=1
14329b92b1d3SBarry Smith  ```
14339b92b1d3SBarry Smith
14349b92b1d3SBarry Smith### When using PETSc in single precision mode (`--with-precision=single` when running `configure`) are the operations done in single or double precision?
14359b92b1d3SBarry Smith
14369b92b1d3SBarry SmithPETSc does **NOT** do any explicit conversion of single precision to double before
14379b92b1d3SBarry Smithperforming computations; it depends on the hardware and compiler for what happens. For
14389b92b1d3SBarry Smithexample, the compiler could choose to put the single precision numbers into the usual
14399b92b1d3SBarry Smithdouble precision registers and then use the usual double precision floating point unit. Or
14409b92b1d3SBarry Smithit could use SSE2 instructions that work directly on the single precision numbers. It is a
14419b92b1d3SBarry Smithbit of a mystery what decisions get made sometimes. There may be compiler flags in some
14429b92b1d3SBarry Smithcircumstances that can affect this.
14439b92b1d3SBarry Smith
14449b92b1d3SBarry Smith### Why is Newton's method (SNES) not converging, or converges slowly?
14459b92b1d3SBarry Smith
14469b92b1d3SBarry SmithNewton's method may not converge for many reasons, here are some of the most common:
14479b92b1d3SBarry Smith
14489b92b1d3SBarry Smith- The Jacobian is wrong (or correct in sequential but not in parallel).
14499b92b1d3SBarry Smith- The linear system is {ref}`not solved <doc_faq_execution_kspconv>` or is not solved
14509b92b1d3SBarry Smith  accurately enough.
14519b92b1d3SBarry Smith- The Jacobian system has a singularity that the linear solver is not handling.
14529b92b1d3SBarry Smith- There is a bug in the function evaluation routine.
14539b92b1d3SBarry Smith- The function is not continuous or does not have continuous first derivatives (e.g. phase
14549b92b1d3SBarry Smith  change or TVD limiters).
14559b92b1d3SBarry Smith- The equations may not have a solution (e.g. limit cycle instead of a steady state) or
14569b92b1d3SBarry Smith  there may be a "hill" between the initial guess and the steady state (e.g. reactants
14579b92b1d3SBarry Smith  must ignite and burn before reaching a steady state, but the steady-state residual will
14589b92b1d3SBarry Smith  be larger during combustion).
14599b92b1d3SBarry Smith
14609b92b1d3SBarry SmithHere are some of the ways to help debug lack of convergence of Newton:
14619b92b1d3SBarry Smith
14629b92b1d3SBarry Smith- Run on one processor to see if the problem is only in parallel.
14639b92b1d3SBarry Smith
14649b92b1d3SBarry Smith- Run with `-info` to get more detailed information on the solution process.
14659b92b1d3SBarry Smith
14669b92b1d3SBarry Smith- Run with the options
14679b92b1d3SBarry Smith
14689b92b1d3SBarry Smith  ```text
14699b92b1d3SBarry Smith  -snes_monitor -ksp_monitor_true_residual -snes_converged_reason -ksp_converged_reason
14709b92b1d3SBarry Smith  ```
14719b92b1d3SBarry Smith
14729b92b1d3SBarry Smith  - If the linear solve does not converge, check if the Jacobian is correct, then see
14739b92b1d3SBarry Smith    {ref}`this question <doc_faq_execution_kspconv>`.
14749b92b1d3SBarry Smith  - If the preconditioned residual converges, but the true residual does not, the
14759b92b1d3SBarry Smith    preconditioner may be singular.
14769b92b1d3SBarry Smith  - If the linear solve converges well, but the line search fails, the Jacobian may be
14779b92b1d3SBarry Smith    incorrect.
14789b92b1d3SBarry Smith
14799b92b1d3SBarry Smith- Run with `-pc_type lu` or `-pc_type svd` to see if the problem is a poor linear
14809b92b1d3SBarry Smith  solver.
14819b92b1d3SBarry Smith
14829b92b1d3SBarry Smith- Run with `-mat_view` or `-mat_view draw` to see if the Jacobian looks reasonable.
14839b92b1d3SBarry Smith
14849b92b1d3SBarry Smith- Run with `-snes_test_jacobian -snes_test_jacobian_view` to see if the Jacobian you are
14859b92b1d3SBarry Smith  using is wrong. Compare the output when you add `-mat_fd_type ds` to see if the result
14869b92b1d3SBarry Smith  is sensitive to the choice of differencing parameter.
14879b92b1d3SBarry Smith
14889b92b1d3SBarry Smith- Run with `-snes_mf_operator -pc_type lu` to see if the Jacobian you are using is
14899b92b1d3SBarry Smith  wrong. If the problem is too large for a direct solve, try
14909b92b1d3SBarry Smith
14919b92b1d3SBarry Smith  ```text
14929b92b1d3SBarry Smith  -snes_mf_operator -pc_type ksp -ksp_ksp_rtol 1e-12.
14939b92b1d3SBarry Smith  ```
14949b92b1d3SBarry Smith
14959b92b1d3SBarry Smith  Compare the output when you add `-mat_mffd_type ds` to see if the result is sensitive
14969b92b1d3SBarry Smith  to choice of differencing parameter.
14979b92b1d3SBarry Smith
14989b92b1d3SBarry Smith- Run with `-snes_linesearch_monitor` to see if the line search is failing (this is
14999b92b1d3SBarry Smith  usually a sign of a bad Jacobian). Use `-info` in PETSc 3.1 and older versions,
15009b92b1d3SBarry Smith  `-snes_ls_monitor` in PETSc 3.2 and `-snes_linesearch_monitor` in PETSc 3.3 and
15019b92b1d3SBarry Smith  later.
15029b92b1d3SBarry Smith
15039b92b1d3SBarry SmithHere are some ways to help the Newton process if everything above checks out:
15049b92b1d3SBarry Smith
15059b92b1d3SBarry Smith- Run with grid sequencing (`-snes_grid_sequence` if working with a `DM` is all you
15069b92b1d3SBarry Smith  need) to generate better initial guess on your finer mesh.
15079b92b1d3SBarry Smith
15089b92b1d3SBarry Smith- Run with quad precision, i.e.
15099b92b1d3SBarry Smith
15109b92b1d3SBarry Smith  ```console
15119b92b1d3SBarry Smith  $ ./configure --with-precision=__float128 --download-f2cblaslapack
15129b92b1d3SBarry Smith  ```
15139b92b1d3SBarry Smith
15149b92b1d3SBarry Smith  :::{note}
15159b92b1d3SBarry Smith  quad precision requires PETSc 3.2 and later and recent versions of the GNU compilers.
15169b92b1d3SBarry Smith  :::
15179b92b1d3SBarry Smith
15189b92b1d3SBarry Smith- Change the units (nondimensionalization), boundary condition scaling, or formulation so
15199b92b1d3SBarry Smith  that the Jacobian is better conditioned. See [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and
15209b92b1d3SBarry Smith  Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127).
15219b92b1d3SBarry Smith
15229b92b1d3SBarry Smith- Mollify features in the function that do not have continuous first derivatives (often
15239b92b1d3SBarry Smith  occurs when there are "if" statements in the residual evaluation, e.g. phase change or
15249b92b1d3SBarry Smith  TVD limiters). Use a variational inequality solver (`SNESVINEWTONRSLS`) if the
15259b92b1d3SBarry Smith  discontinuities are of fundamental importance.
15269b92b1d3SBarry Smith
15279b92b1d3SBarry Smith- Try a trust region method (`-ts_type tr`, may have to adjust parameters).
15289b92b1d3SBarry Smith
15299b92b1d3SBarry Smith- Run with some continuation parameter from a point where you know the solution, see
15309b92b1d3SBarry Smith  `TSPSEUDO` for steady-states.
15319b92b1d3SBarry Smith
15329b92b1d3SBarry Smith- There are homotopy solver packages like PHCpack that can get you all possible solutions
15339b92b1d3SBarry Smith  (and tell you that it has found them all) but those are not scalable and cannot solve
15349b92b1d3SBarry Smith  anything but small problems.
15359b92b1d3SBarry Smith
15369b92b1d3SBarry Smith(doc_faq_execution_kspconv)=
15379b92b1d3SBarry Smith
15389b92b1d3SBarry Smith### Why is the linear solver (KSP) not converging, or converges slowly?
15399b92b1d3SBarry Smith
15409b92b1d3SBarry Smith:::{tip}
15419b92b1d3SBarry SmithAlways run with `-ksp_converged_reason -ksp_monitor_true_residual` when trying to
15429b92b1d3SBarry Smithlearn why a method is not converging!
15439b92b1d3SBarry Smith:::
15449b92b1d3SBarry Smith
15459b92b1d3SBarry SmithCommon reasons for KSP not converging are:
15469b92b1d3SBarry Smith
15479b92b1d3SBarry Smith- A symmetric method is being used for a non-symmetric problem.
15489b92b1d3SBarry Smith
15499b92b1d3SBarry Smith- The equations are singular by accident (e.g. forgot to impose boundary
15509b92b1d3SBarry Smith  conditions). Check this for a small problem using `-pc_type svd -pc_svd_monitor`.
15519b92b1d3SBarry Smith
15529b92b1d3SBarry Smith- The equations are intentionally singular (e.g. constant null space), but the Krylov
15539b92b1d3SBarry Smith  method was not informed, see `MatSetNullSpace()`. Always inform your local Krylov
15549b92b1d3SBarry Smith  subspace solver of any change of singularity. Failure to do so will result in the
15559b92b1d3SBarry Smith  immediate revocation of your computing and keyboard operator licenses, as well as
15569b92b1d3SBarry Smith  a stern talking-to by the nearest Krylov Subspace Method representative.
15579b92b1d3SBarry Smith
15589b92b1d3SBarry Smith- The equations are intentionally singular and `MatSetNullSpace()` was used, but the
15599b92b1d3SBarry Smith  right-hand side is not consistent. You may have to call `MatNullSpaceRemove()` on the
15609b92b1d3SBarry Smith  right-hand side before calling `KSPSolve()`. See `MatSetTransposeNullSpace()`.
15619b92b1d3SBarry Smith
15629b92b1d3SBarry Smith- The equations are indefinite so that standard preconditioners don't work. Usually you
15639b92b1d3SBarry Smith  will know this from the physics, but you can check with
15649b92b1d3SBarry Smith
15659b92b1d3SBarry Smith  ```text
15669b92b1d3SBarry Smith  -ksp_compute_eigenvalues -ksp_gmres_restart 1000 -pc_type none
15679b92b1d3SBarry Smith  ```
15689b92b1d3SBarry Smith
15699b92b1d3SBarry Smith  For simple saddle point problems, try
15709b92b1d3SBarry Smith
15719b92b1d3SBarry Smith  ```text
15729b92b1d3SBarry Smith  -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_detect_saddle_point
15739b92b1d3SBarry Smith  ```
15749b92b1d3SBarry Smith
15759b92b1d3SBarry Smith  For more difficult problems, read the literature to find robust methods and ask
15769b92b1d3SBarry Smith  <mailto:petsc-users@mcs.anl.gov> or <mailto:petsc-maint@mcs.anl.gov> if you want advice about how to
15779b92b1d3SBarry Smith  implement them.
15789b92b1d3SBarry Smith
15799b92b1d3SBarry Smith- If the method converges in preconditioned residual, but not in true residual, the
15809b92b1d3SBarry Smith  preconditioner is likely singular or nearly so. This is common for saddle point problems
15819b92b1d3SBarry Smith  (e.g. incompressible flow) or strongly nonsymmetric operators (e.g. low-Mach hyperbolic
15829b92b1d3SBarry Smith  problems with large time steps).
15839b92b1d3SBarry Smith
15849b92b1d3SBarry Smith- The preconditioner is too weak or is unstable. See if `-pc_type asm -sub_pc_type lu`
15859b92b1d3SBarry Smith  improves the convergence rate. If GMRES is losing too much progress in the restart, see
15869b92b1d3SBarry Smith  if longer restarts help `-ksp_gmres_restart 300`. If a transpose is available, try
15879b92b1d3SBarry Smith  `-ksp_type bcgs` or other methods that do not require a restart.
15889b92b1d3SBarry Smith
15899b92b1d3SBarry Smith  :::{note}
15909b92b1d3SBarry Smith  Unfortunately convergence with these methods is frequently erratic.
15919b92b1d3SBarry Smith  :::
15929b92b1d3SBarry Smith
15939b92b1d3SBarry Smith- The preconditioner is nonlinear (e.g. a nested iterative solve), try `-ksp_type
15949b92b1d3SBarry Smith  fgmres` or `-ksp_type gcr`.
15959b92b1d3SBarry Smith
15969b92b1d3SBarry Smith- You are using geometric multigrid, but some equations (often boundary conditions) are
15979b92b1d3SBarry Smith  not scaled compatibly between levels. Try `-pc_mg_galerkin` both to algebraically
15989b92b1d3SBarry Smith  construct a correctly scaled coarse operator or make sure that all the equations are
15999b92b1d3SBarry Smith  scaled in the same way if you want to use rediscretized coarse levels.
16009b92b1d3SBarry Smith
16019b92b1d3SBarry Smith- The matrix is very ill-conditioned. Check the {ref}`condition number <doc_faq_usage_condnum>`.
16029b92b1d3SBarry Smith
16039b92b1d3SBarry Smith  - Try to improve it by choosing the relative scaling of components/boundary conditions.
16049b92b1d3SBarry Smith  - Try `-ksp_diagonal_scale -ksp_diagonal_scale_fix`.
16059b92b1d3SBarry Smith  - Perhaps change the formulation of the problem to produce more friendly algebraic
16069b92b1d3SBarry Smith    equations.
16079b92b1d3SBarry Smith
16089b92b1d3SBarry Smith- Change the units (nondimensionalization), boundary condition scaling, or formulation so
16099b92b1d3SBarry Smith  that the Jacobian is better conditioned. See [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and
16109b92b1d3SBarry Smith  Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127).
16119b92b1d3SBarry Smith
16129b92b1d3SBarry Smith- Classical Gram-Schmidt is becoming unstable, try `-ksp_gmres_modifiedgramschmidt` or
16139b92b1d3SBarry Smith  use a method that orthogonalizes differently, e.g. `-ksp_type gcr`.
16149b92b1d3SBarry Smith
16159b92b1d3SBarry Smith### I get the error message: Actual argument at (1) to assumed-type dummy is of derived type with type-bound or FINAL procedures
16169b92b1d3SBarry Smith
16179b92b1d3SBarry SmithUse the following code-snippet:
16189b92b1d3SBarry Smith
16199b92b1d3SBarry Smith```fortran
16209b92b1d3SBarry Smithmodule context_module
16219b92b1d3SBarry Smith#include petsc/finclude/petsc.h
16229b92b1d3SBarry Smithuse petsc
16239b92b1d3SBarry Smithimplicit none
16249b92b1d3SBarry Smithprivate
16259b92b1d3SBarry Smithtype, public ::  context_type
16269b92b1d3SBarry Smith  private
16279b92b1d3SBarry Smith  PetscInt :: foo
16289b92b1d3SBarry Smithcontains
16299b92b1d3SBarry Smith  procedure, public :: init => context_init
16309b92b1d3SBarry Smithend type context_type
16319b92b1d3SBarry Smithcontains
16329b92b1d3SBarry Smithsubroutine context_init(self, foo)
16339b92b1d3SBarry Smith  class(context_type), intent(in out) :: self
16349b92b1d3SBarry Smith  PetscInt, intent(in) :: foo
16359b92b1d3SBarry Smith  self%foo = foo
16369b92b1d3SBarry Smithend subroutine context_init
16379b92b1d3SBarry Smithend module context_module
16389b92b1d3SBarry Smith
16399b92b1d3SBarry Smith!------------------------------------------------------------------------
16409b92b1d3SBarry Smith
16419b92b1d3SBarry Smithprogram test_snes
16429b92b1d3SBarry Smithuse,intrinsic :: iso_c_binding
16439b92b1d3SBarry Smithuse petsc
16449b92b1d3SBarry Smithuse context_module
16459b92b1d3SBarry Smithimplicit none
16469b92b1d3SBarry Smith
16479b92b1d3SBarry SmithSNES :: snes
16489b92b1d3SBarry Smithtype(context_type),target :: context
16499b92b1d3SBarry Smithtype(c_ptr) :: contextOut
16509b92b1d3SBarry SmithPetscErrorCode :: ierr
16519b92b1d3SBarry Smith
16529b92b1d3SBarry Smithcall PetscInitialize(PETSC_NULL_CHARACTER, ierr)
16539b92b1d3SBarry Smithcall SNESCreate(PETSC_COMM_WORLD, snes, ierr)
16549b92b1d3SBarry Smithcall context%init(1)
16559b92b1d3SBarry Smith
16569b92b1d3SBarry SmithcontextOut = c_loc(context) ! contextOut is a C pointer on context
16579b92b1d3SBarry Smith
16589b92b1d3SBarry Smithcall SNESSetConvergenceTest(snes, convergence, contextOut, PETSC_NULL_FUNCTION, ierr)
16599b92b1d3SBarry Smith
16609b92b1d3SBarry Smithcall SNESDestroy(snes, ierr)
16619b92b1d3SBarry Smithcall PetscFinalize(ierr)
16629b92b1d3SBarry Smith
16639b92b1d3SBarry Smithcontains
16649b92b1d3SBarry Smith
16659b92b1d3SBarry Smithsubroutine convergence(snes, num_iterations, xnorm, pnorm,fnorm, reason, contextIn, ierr)
16669b92b1d3SBarry SmithSNES, intent(in) :: snes
16679b92b1d3SBarry Smith
16689b92b1d3SBarry SmithPetscInt, intent(in) :: num_iterations
16699b92b1d3SBarry SmithPetscReal, intent(in) :: xnorm, pnorm, fnorm
16709b92b1d3SBarry SmithSNESConvergedReason, intent(out) :: reason
16719b92b1d3SBarry Smithtype(c_ptr), intent(in out) :: contextIn
16729b92b1d3SBarry Smithtype(context_type), pointer :: context
16739b92b1d3SBarry SmithPetscErrorCode, intent(out) :: ierr
16749b92b1d3SBarry Smith
16759b92b1d3SBarry Smithcall c_f_pointer(contextIn,context)  ! convert the C pointer to a Fortran pointer to use context as in the main program
16769b92b1d3SBarry Smithreason = 0
16779b92b1d3SBarry Smithierr = 0
16789b92b1d3SBarry Smithend subroutine convergence
16799b92b1d3SBarry Smithend program test_snes
16809b92b1d3SBarry Smith```
16819b92b1d3SBarry Smith
16829b92b1d3SBarry Smith### In C++ I get a crash on `VecDestroy()` (or some other PETSc object) at the end of the program
16839b92b1d3SBarry Smith
16849b92b1d3SBarry SmithThis can happen when the destructor for a C++ class is automatically called at the end of
16859b92b1d3SBarry Smiththe program after `PetscFinalize()`. Use the following code-snippet:
16869b92b1d3SBarry Smith
16879b92b1d3SBarry Smith```
16889b92b1d3SBarry Smithmain()
16899b92b1d3SBarry Smith{
16909b92b1d3SBarry Smith  PetscCall(PetscInitialize());
16919b92b1d3SBarry Smith  {
16929b92b1d3SBarry Smith    your variables
16939b92b1d3SBarry Smith    your code
16949b92b1d3SBarry Smith
16959b92b1d3SBarry Smith    ...   /* all your destructors are called here automatically by C++ so they work correctly */
16969b92b1d3SBarry Smith  }
16979b92b1d3SBarry Smith  PetscCall(PetscFinalize());
16989b92b1d3SBarry Smith  return 0
16999b92b1d3SBarry Smith}
17009b92b1d3SBarry Smith```
17019b92b1d3SBarry Smith
17029b92b1d3SBarry Smith______________________________________________________________________
17039b92b1d3SBarry Smith
17049b92b1d3SBarry Smith## Debugging
17059b92b1d3SBarry Smith
17069b92b1d3SBarry Smith### What does the message hwloc/linux: Ignoring PCI device with non-16bit domain mean?
17079b92b1d3SBarry Smith
17089b92b1d3SBarry SmithThis is printed by the hwloc library that is used by some MPI implementations. It can be ignored.
17099b92b1d3SBarry SmithTo prevent the message from always being printed set the environmental variable `HWLOC_HIDE_ERRORS` to 2.
17109b92b1d3SBarry SmithFor example
17119b92b1d3SBarry Smith
17129b92b1d3SBarry Smith```
17139b92b1d3SBarry Smithexport HWLOC_HIDE_ERRORS=2
17149b92b1d3SBarry Smith```
17159b92b1d3SBarry Smith
17169b92b1d3SBarry Smithwhich can be added to your login profile file such as `~/.bashrc`.
17179b92b1d3SBarry Smith
17189b92b1d3SBarry Smith### How do I turn off PETSc signal handling so I can use the `-C` Option On `xlf`?
17199b92b1d3SBarry Smith
17209b92b1d3SBarry SmithImmediately after calling `PetscInitialize()` call `PetscPopSignalHandler()`.
17219b92b1d3SBarry Smith
17229b92b1d3SBarry SmithSome Fortran compilers including the IBM xlf, xlF etc compilers have a compile option
17239b92b1d3SBarry Smith(`-C` for IBM's) that causes all array access in Fortran to be checked that they are
17249b92b1d3SBarry Smithin-bounds. This is a great feature but does require that the array dimensions be set
17259b92b1d3SBarry Smithexplicitly, not with a \*.
17269b92b1d3SBarry Smith
17279b92b1d3SBarry Smith### How do I debug if `-start_in_debugger` does not work on my machine?
17289b92b1d3SBarry Smith
17299b92b1d3SBarry SmithThe script <https://github.com/Azrael3000/tmpi> can be used to run multiple MPI
17309b92b1d3SBarry Smithranks in the debugger using tmux.
17319b92b1d3SBarry Smith
17329b92b1d3SBarry SmithOn newer macOS machines - one has to be in admin group to be able to use debugger.
17339b92b1d3SBarry Smith
17349b92b1d3SBarry SmithOn newer Ubuntu linux machines - one has to disable `ptrace_scope` with
17359b92b1d3SBarry Smith
17369b92b1d3SBarry Smith```console
17379b92b1d3SBarry Smith$ sudo echo 0 > /proc/sys/kernel/yama/ptrace_scope
17389b92b1d3SBarry Smith```
17399b92b1d3SBarry Smith
17409b92b1d3SBarry Smithto get start in debugger working.
17419b92b1d3SBarry Smith
17429b92b1d3SBarry SmithIf `-start_in_debugger` does not work on your OS, for a uniprocessor job, just
17439b92b1d3SBarry Smithtry the debugger directly, for example: `gdb ex1`. You can also use [TotalView](https://totalview.io/products/totalview) which is a good graphical parallel debugger.
17449b92b1d3SBarry Smith
17459b92b1d3SBarry Smith### How do I see where my code is hanging?
17469b92b1d3SBarry Smith
17479b92b1d3SBarry SmithYou can use the `-start_in_debugger` option to start all processes in the debugger (each
17489b92b1d3SBarry Smithwill come up in its own xterm) or run in [TotalView](https://totalview.io/products/totalview). Then use `cont` (for continue) in each
17499b92b1d3SBarry Smithxterm. Once you are sure that the program is hanging, hit control-c in each xterm and then
17509b92b1d3SBarry Smithuse 'where' to print a stack trace for each process.
17519b92b1d3SBarry Smith
17529b92b1d3SBarry Smith### How can I inspect PETSc vector and matrix values when in the debugger?
17539b92b1d3SBarry Smith
17549b92b1d3SBarry SmithI will illustrate this with `gdb`, but it should be similar on other debuggers. You can
17559b92b1d3SBarry Smithlook at local `Vec` values directly by obtaining the array. For a `Vec` v, we can
17569b92b1d3SBarry Smithprint all local values using:
17579b92b1d3SBarry Smith
17589b92b1d3SBarry Smith```console
17599b92b1d3SBarry Smith(gdb) p ((Vec_Seq*) v->data)->array[0]@v->map.n
17609b92b1d3SBarry Smith```
17619b92b1d3SBarry Smith
17629b92b1d3SBarry SmithHowever, this becomes much more complicated for a matrix. Therefore, it is advisable to use the default viewer to look at the object. For a `Vec` v and a `Mat` m, this would be:
17639b92b1d3SBarry Smith
17649b92b1d3SBarry Smith```console
17659b92b1d3SBarry Smith(gdb) call VecView(v, 0)
17669b92b1d3SBarry Smith(gdb) call MatView(m, 0)
17679b92b1d3SBarry Smith```
17689b92b1d3SBarry Smith
17699b92b1d3SBarry Smithor with a communicator other than `MPI_COMM_WORLD`:
17709b92b1d3SBarry Smith
17719b92b1d3SBarry Smith```console
17729b92b1d3SBarry Smith(gdb) call MatView(m, PETSC_VIEWER_STDOUT_(m->comm))
17739b92b1d3SBarry Smith```
17749b92b1d3SBarry Smith
17759b92b1d3SBarry SmithTotalview 8.8.0+ has a new feature that allows libraries to provide their own code to
17769b92b1d3SBarry Smithdisplay objects in the debugger. Thus in theory each PETSc object, `Vec`, `Mat` etc
17779b92b1d3SBarry Smithcould have custom code to print values in the object. We have only done this for the most
17789b92b1d3SBarry Smithelementary display of `Vec` and `Mat`. See the routine `TV_display_type()` in
17799b92b1d3SBarry Smith`src/vec/vec/interface/vector.c` for an example of how these may be written. Contact us
17809b92b1d3SBarry Smithif you would like to add more.
17819b92b1d3SBarry Smith
17829b92b1d3SBarry Smith### How can I find the cause of floating point exceptions like not-a-number (NaN) or infinity?
17839b92b1d3SBarry Smith
17849b92b1d3SBarry SmithThe best way to locate floating point exceptions is to use a debugger. On supported
17859b92b1d3SBarry Smitharchitectures (including Linux and glibc-based systems), just run in a debugger and pass
17869b92b1d3SBarry Smith`-fp_trap` to the PETSc application. This will activate signaling exceptions and the
17879b92b1d3SBarry Smithdebugger will break on the line that first divides by zero or otherwise generates an
17889b92b1d3SBarry Smithexceptions.
17899b92b1d3SBarry Smith
17909b92b1d3SBarry SmithWithout a debugger, running with `-fp_trap` in debug mode will only identify the
17919b92b1d3SBarry Smithfunction in which the error occurred, but not the line or the type of exception. If
17929b92b1d3SBarry Smith`-fp_trap` is not supported on your architecture, consult the documentation for your
17939b92b1d3SBarry Smithdebugger since there is likely a way to have it catch exceptions.
17949b92b1d3SBarry Smith
17959b92b1d3SBarry Smith(error_libimf)=
17969b92b1d3SBarry Smith
17979b92b1d3SBarry Smith### Error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory
17989b92b1d3SBarry Smith
17999b92b1d3SBarry SmithThe Intel compilers use shared libraries (like libimf) that cannot be found, by default, at run
18009b92b1d3SBarry Smithtime. When using the Intel compilers (and running the resulting code) you must make sure
18019b92b1d3SBarry Smiththat the proper Intel initialization scripts are run. This is usually done by adding some
18029b92b1d3SBarry Smithcode into your `.cshrc`, `.bashrc`, `.profile` etc file. Sometimes on batch file
18039b92b1d3SBarry Smithsystems that do now access your initialization files (like .cshrc) you must include the
18049b92b1d3SBarry Smithinitialization calls in your batch file submission.
18059b92b1d3SBarry Smith
18069b92b1d3SBarry SmithFor example, on my Mac using `csh` I have the following in my `.cshrc` file:
18079b92b1d3SBarry Smith
18089b92b1d3SBarry Smith```csh
18099b92b1d3SBarry Smithsource /opt/intel/cc/10.1.012/bin/iccvars.csh
18109b92b1d3SBarry Smithsource /opt/intel/fc/10.1.012/bin/ifortvars.csh
18119b92b1d3SBarry Smithsource /opt/intel/idb/10.1.012/bin/idbvars.csh
18129b92b1d3SBarry Smith```
18139b92b1d3SBarry Smith
18149b92b1d3SBarry SmithAnd in my `.profile` I have
18159b92b1d3SBarry Smith
18169b92b1d3SBarry Smith```csh
18179b92b1d3SBarry Smithsource /opt/intel/cc/10.1.012/bin/iccvars.sh
18189b92b1d3SBarry Smithsource /opt/intel/fc/10.1.012/bin/ifortvars.sh
18199b92b1d3SBarry Smithsource /opt/intel/idb/10.1.012/bin/idbvars.sh
18209b92b1d3SBarry Smith```
18219b92b1d3SBarry Smith
18229b92b1d3SBarry Smith(object_type_not_set)=
18239b92b1d3SBarry Smith
18249b92b1d3SBarry Smith### What does "Object Type Not Set: Argument # N" Mean?
18259b92b1d3SBarry Smith
18269b92b1d3SBarry SmithMany operations on PETSc objects require that the specific type of the object be set before the operations is performed. You must call `XXXSetType()` or `XXXSetFromOptions()` before you make the offending call. For example
18279b92b1d3SBarry Smith
18289b92b1d3SBarry Smith```
18299b92b1d3SBarry SmithMat A;
18309b92b1d3SBarry Smith
18319b92b1d3SBarry SmithPetscCall(MatCreate(PETSC_COMM_WORLD, &A));
18329b92b1d3SBarry SmithPetscCall(MatSetValues(A,....));
18339b92b1d3SBarry Smith```
18349b92b1d3SBarry Smith
18359b92b1d3SBarry Smithwill not work. You must add `MatSetType()` or `MatSetFromOptions()` before the call to `MatSetValues()`. I.e.
18369b92b1d3SBarry Smith
18379b92b1d3SBarry Smith```
18389b92b1d3SBarry SmithMat A;
18399b92b1d3SBarry Smith
18409b92b1d3SBarry SmithPetscCall(MatCreate(PETSC_COMM_WORLD, &A));
18419b92b1d3SBarry Smith
18429b92b1d3SBarry SmithPetscCall(MatSetType(A, MATAIJ));
18439b92b1d3SBarry Smith/* Will override MatSetType() */
18449b92b1d3SBarry SmithPetscCall(MatSetFromOptions());
18459b92b1d3SBarry Smith
18469b92b1d3SBarry SmithPetscCall(MatSetValues(A,....));
18479b92b1d3SBarry Smith```
18489b92b1d3SBarry Smith
18499b92b1d3SBarry Smith(split_ownership)=
18509b92b1d3SBarry Smith
18519b92b1d3SBarry Smith### What does error detected in `PetscSplitOwnership()` about "sum of local lengths ...": mean?
18529b92b1d3SBarry Smith
18539b92b1d3SBarry SmithIn a previous call to `VecSetSizes()`, `MatSetSizes()`, `VecCreateXXX()` or
18549b92b1d3SBarry Smith`MatCreateXXX()` you passed in local and global sizes that do not make sense for the
18559b92b1d3SBarry Smithcorrect number of processors. For example if you pass in a local size of 2 and a global
18569b92b1d3SBarry Smithsize of 100 and run on two processors, this cannot work since the sum of the local sizes
18579b92b1d3SBarry Smithis 4, not 100.
18589b92b1d3SBarry Smith
18599b92b1d3SBarry Smith(valgrind)=
18609b92b1d3SBarry Smith
18619b92b1d3SBarry Smith### What does "corrupt argument" or "caught signal" Or "SEGV" Or "segmentation violation" Or "bus error" mean? Can I use Valgrind or CUDA-Memcheck to debug memory corruption issues?
18629b92b1d3SBarry Smith
18639b92b1d3SBarry SmithSometimes it can mean an argument to a function is invalid. In Fortran this may be caused
18649b92b1d3SBarry Smithby forgetting to list an argument in the call, especially the final `PetscErrorCode`.
18659b92b1d3SBarry Smith
18669b92b1d3SBarry SmithOtherwise it is usually caused by memory corruption; that is somewhere the code is writing
18679b92b1d3SBarry Smithout of array bounds. To track this down rerun the debug version of the code with the
18689b92b1d3SBarry Smithoption `-malloc_debug`. Occasionally the code may crash only with the optimized version,
18699b92b1d3SBarry Smithin that case run the optimized version with `-malloc_debug`. If you determine the
18709b92b1d3SBarry Smithproblem is from memory corruption you can put the macro CHKMEMQ in the code near the crash
18719b92b1d3SBarry Smithto determine exactly what line is causing the problem.
18729b92b1d3SBarry Smith
18739b92b1d3SBarry SmithIf `-malloc_debug` does not help: on NVIDIA CUDA systems you can use <https://docs.nvidia.com/cuda/cuda-memcheck/index.html>
18749b92b1d3SBarry Smith
18759b92b1d3SBarry SmithIf `-malloc_debug` does not help: on GNU/Linux (not macOS machines) - you can
18769b92b1d3SBarry Smithuse [valgrind](http://valgrind.org). Follow the below instructions:
18779b92b1d3SBarry Smith
18789b92b1d3SBarry Smith1. `configure` PETSc with `--download-mpich --with-debugging` (you can use other MPI implementations but most produce spurious Valgrind messages)
18799b92b1d3SBarry Smith
18809b92b1d3SBarry Smith2. Compile your application code with this build of PETSc.
18819b92b1d3SBarry Smith
18829b92b1d3SBarry Smith3. Run with Valgrind.
18839b92b1d3SBarry Smith
18849b92b1d3SBarry Smith   ```console
18859b92b1d3SBarry Smith   $ $PETSC_DIR/lib/petsc/bin/petscmpiexec -valgrind -n NPROC PETSCPROGRAMNAME PROGRAMOPTIONS
18869b92b1d3SBarry Smith   ```
18879b92b1d3SBarry Smith
18889b92b1d3SBarry Smith   or
18899b92b1d3SBarry Smith
18909b92b1d3SBarry Smith   ```console
18919b92b1d3SBarry Smith   $ mpiexec -n NPROC valgrind --tool=memcheck -q --num-callers=20 \
18929b92b1d3SBarry Smith   --suppressions=$PETSC_DIR/share/petsc/suppressions/valgrind \
18939b92b1d3SBarry Smith   --log-file=valgrind.log.%p PETSCPROGRAMNAME -malloc off PROGRAMOPTIONS
18949b92b1d3SBarry Smith   ```
18959b92b1d3SBarry Smith
18969b92b1d3SBarry Smith:::{note}
18979b92b1d3SBarry Smith- option `--with-debugging` enables valgrind to give stack trace with additional
18989b92b1d3SBarry Smith  source-file:line-number info.
18999b92b1d3SBarry Smith- option `--download-mpich` is valgrind clean, other MPI builds are not valgrind clean.
19009b92b1d3SBarry Smith- when `--download-mpich` is used - mpiexec will be in `$PETSC_ARCH/bin`
19019b92b1d3SBarry Smith- `--log-file=valgrind.log.%p` option tells valgrind to store the output from each
19029b92b1d3SBarry Smith  process in a different file \[as %p i.e PID, is different for each MPI process.
19039b92b1d3SBarry Smith- `memcheck` will not find certain array access that violate static array
19049b92b1d3SBarry Smith  declarations so if memcheck runs clean you can try the `--tool=exp-ptrcheck`
19059b92b1d3SBarry Smith  instead.
19069b92b1d3SBarry Smith:::
19079b92b1d3SBarry Smith
19089b92b1d3SBarry SmithYou might also consider using <http://drmemory.org> which has support for GNU/Linux, Apple
19099b92b1d3SBarry SmithMac OS and Microsoft Windows machines. (Note we haven't tried this ourselves).
19109b92b1d3SBarry Smith
19119b92b1d3SBarry Smith(zeropivot)=
19129b92b1d3SBarry Smith
19139b92b1d3SBarry Smith### What does "detected zero pivot in LU factorization" mean?
19149b92b1d3SBarry Smith
19159b92b1d3SBarry SmithA zero pivot in LU, ILU, Cholesky, or ICC sparse factorization does not always mean that
19169b92b1d3SBarry Smiththe matrix is singular. You can use
19179b92b1d3SBarry Smith
19189b92b1d3SBarry Smith```text
19199b92b1d3SBarry Smith-pc_factor_shift_type nonzero -pc_factor_shift_amount [amount]
19209b92b1d3SBarry Smith```
19219b92b1d3SBarry Smith
19229b92b1d3SBarry Smithor
19239b92b1d3SBarry Smith
19249b92b1d3SBarry Smith```text
19259b92b1d3SBarry Smith-pc_factor_shift_type positive_definite -[level]_pc_factor_shift_type nonzero
19269b92b1d3SBarry Smith -pc_factor_shift_amount [amount]
19279b92b1d3SBarry Smith```
19289b92b1d3SBarry Smith
19299b92b1d3SBarry Smithor
19309b92b1d3SBarry Smith
19319b92b1d3SBarry Smith```text
19329b92b1d3SBarry Smith-[level]_pc_factor_shift_type positive_definite
19339b92b1d3SBarry Smith```
19349b92b1d3SBarry Smith
19359b92b1d3SBarry Smithto prevent the zero pivot. [level] is "sub" when lu, ilu, cholesky, or icc are employed in
19369b92b1d3SBarry Smitheach individual block of the bjacobi or ASM preconditioner. [level] is "mg_levels" or
19379b92b1d3SBarry Smith"mg_coarse" when lu, ilu, cholesky, or icc are used inside multigrid smoothers or to the
19389b92b1d3SBarry Smithcoarse grid solver. See `PCFactorSetShiftType()`, `PCFactorSetShiftAmount()`.
19399b92b1d3SBarry Smith
19409b92b1d3SBarry SmithThis error can also happen if your matrix is singular, see `MatSetNullSpace()` for how
19419b92b1d3SBarry Smithto handle this. If this error occurs in the zeroth row of the matrix, it is likely you
19429b92b1d3SBarry Smithhave an error in the code that generates the matrix.
19439b92b1d3SBarry Smith
19449b92b1d3SBarry Smith### You create draw windows or `PETSCVIEWERDRAW` windows or use options `-ksp_monitor draw::draw_lg` or `-snes_monitor draw::draw_lg` and the program seems to run OK but windows never open
19459b92b1d3SBarry Smith
19469b92b1d3SBarry SmithThe libraries were compiled without support for X windows. Make sure that `configure`
19479b92b1d3SBarry Smithwas run with the option `--with-x`.
19489b92b1d3SBarry Smith
19499b92b1d3SBarry Smith### The program seems to use more and more memory as it runs, even though you don't think you are allocating more memory
19509b92b1d3SBarry Smith
19519b92b1d3SBarry SmithSome of the following may be occurring:
19529b92b1d3SBarry Smith
19539b92b1d3SBarry Smith- You are creating new PETSc objects but never freeing them.
19549b92b1d3SBarry Smith- There is a memory leak in PETSc or your code.
19559b92b1d3SBarry Smith- Something much more subtle: (if you are using Fortran). When you declare a large array
19569b92b1d3SBarry Smith  in Fortran, the operating system does not allocate all the memory pages for that array
19579b92b1d3SBarry Smith  until you start using the different locations in the array. Thus, in a code, if at each
19589b92b1d3SBarry Smith  step you start using later values in the array your virtual memory usage will "continue"
19599b92b1d3SBarry Smith  to increase as measured by `ps` or `top`.
19609b92b1d3SBarry Smith- You are running with the `-log`, `-log_mpe`, or `-log_all` option. With these
19619b92b1d3SBarry Smith  options, a great deal of logging information is stored in memory until the conclusion of
19629b92b1d3SBarry Smith  the run.
19639b92b1d3SBarry Smith- You are linking with the MPI profiling libraries; these cause logging of all MPI
19649b92b1d3SBarry Smith  activities. Another symptom is at the conclusion of the run it may print some message
19659b92b1d3SBarry Smith  about writing log files.
19669b92b1d3SBarry Smith
19679b92b1d3SBarry SmithThe following may help:
19689b92b1d3SBarry Smith
19699b92b1d3SBarry Smith- Run with the `-malloc_debug` option and `-malloc_view`. Or use `PetscMallocDump()`
19709b92b1d3SBarry Smith  and `PetscMallocView()` sprinkled about your code to track memory that is allocated
19719b92b1d3SBarry Smith  and not later freed. Use the commands `PetscMallocGetCurrentUsage()` and
19729b92b1d3SBarry Smith  `PetscMemoryGetCurrentUsage()` to monitor memory allocated and
19739b92b1d3SBarry Smith  `PetscMallocGetMaximumUsage()` and `PetscMemoryGetMaximumUsage()` for total memory
19749b92b1d3SBarry Smith  used as the code progresses.
19759b92b1d3SBarry Smith- This is just the way Unix works and is harmless.
19769b92b1d3SBarry Smith- Do not use the `-log`, `-log_mpe`, or `-log_all` option, or use
19779b92b1d3SBarry Smith  `PetscLogEventDeactivate()` or `PetscLogEventDeactivateClass()` to turn off logging of
19789b92b1d3SBarry Smith  specific events.
19799b92b1d3SBarry Smith- Make sure you do not link with the MPI profiling libraries.
19809b92b1d3SBarry Smith
19819b92b1d3SBarry Smith### When calling `MatPartitioningApply()` you get a message "Error! Key 16615 Not Found"
19829b92b1d3SBarry Smith
19839b92b1d3SBarry SmithThe graph of the matrix you are using is not symmetric. You must use symmetric matrices
19849b92b1d3SBarry Smithfor partitioning.
19859b92b1d3SBarry Smith
19869b92b1d3SBarry Smith### With GMRES at restart the second residual norm printed does not match the first
19879b92b1d3SBarry Smith
19889b92b1d3SBarry SmithI.e.
19899b92b1d3SBarry Smith
19909b92b1d3SBarry Smith```text
19919b92b1d3SBarry Smith26 KSP Residual norm 3.421544615851e-04
19929b92b1d3SBarry Smith27 KSP Residual norm 2.973675659493e-04
19939b92b1d3SBarry Smith28 KSP Residual norm 2.588642948270e-04
19949b92b1d3SBarry Smith29 KSP Residual norm 2.268190747349e-04
19959b92b1d3SBarry Smith30 KSP Residual norm 1.977245964368e-04
19969b92b1d3SBarry Smith30 KSP Residual norm 1.994426291979e-04 <----- At restart the residual norm is printed a second time
19979b92b1d3SBarry Smith```
19989b92b1d3SBarry Smith
19999b92b1d3SBarry SmithThis is actually not surprising! GMRES computes the norm of the residual at each iteration
20009b92b1d3SBarry Smithvia a recurrence relation between the norms of the residuals at the previous iterations
20019b92b1d3SBarry Smithand quantities computed at the current iteration. It does not compute it via directly
20029b92b1d3SBarry Smith$|| b - A x^{n} ||$.
20039b92b1d3SBarry Smith
20049b92b1d3SBarry SmithSometimes, especially with an ill-conditioned matrix, or computation of the matrix-vector
20059b92b1d3SBarry Smithproduct via differencing, the residual norms computed by GMRES start to "drift" from the
20069b92b1d3SBarry Smithcorrect values. At the restart, we compute the residual norm directly, hence the "strange
20079b92b1d3SBarry Smithstuff," the difference printed. The drifting, if it remains small, is harmless (doesn't
20089b92b1d3SBarry Smithaffect the accuracy of the solution that GMRES computes).
20099b92b1d3SBarry Smith
20109b92b1d3SBarry SmithIf you use a more powerful preconditioner the drift will often be smaller and less
20119b92b1d3SBarry Smithnoticeable. Of if you are running matrix-free you may need to tune the matrix-free
20129b92b1d3SBarry Smithparameters.
20139b92b1d3SBarry Smith
20149b92b1d3SBarry Smith### Why do some Krylov methods seem to print two residual norms per iteration?
20159b92b1d3SBarry Smith
20169b92b1d3SBarry SmithI.e.
20179b92b1d3SBarry Smith
20189b92b1d3SBarry Smith```text
20199b92b1d3SBarry Smith1198 KSP Residual norm 1.366052062216e-04
20209b92b1d3SBarry Smith1198 KSP Residual norm 1.931875025549e-04
20219b92b1d3SBarry Smith1199 KSP Residual norm 1.366026406067e-04
20229b92b1d3SBarry Smith1199 KSP Residual norm 1.931819426344e-04
20239b92b1d3SBarry Smith```
20249b92b1d3SBarry Smith
20259b92b1d3SBarry SmithSome Krylov methods, for example `KSPTFQMR`, actually have a "sub-iteration" of size 2
20269b92b1d3SBarry Smithinside the loop. Each of the two substeps has its own matrix vector product and
20279b92b1d3SBarry Smithapplication of the preconditioner and updates the residual approximations. This is why you
20289b92b1d3SBarry Smithget this "funny" output where it looks like there are two residual norms per
20299b92b1d3SBarry Smithiteration. You can also think of it as twice as many iterations.
20309b92b1d3SBarry Smith
20319b92b1d3SBarry Smith### Unable to locate PETSc dynamic library `libpetsc`
20329b92b1d3SBarry Smith
20339b92b1d3SBarry SmithWhen using DYNAMIC libraries - the libraries cannot be moved after they are
20349b92b1d3SBarry Smithinstalled. This could also happen on clusters - where the paths are different on the (run)
20359b92b1d3SBarry Smithnodes - than on the (compile) front-end. **Do not use dynamic libraries & shared
20369b92b1d3SBarry Smithlibraries**. Run `configure` with
20379b92b1d3SBarry Smith`--with-shared-libraries=0 --with-dynamic-loading=0`.
20389b92b1d3SBarry Smith
20399b92b1d3SBarry Smith:::{important}
2040f0b74427SPierre JolivetThis option has been removed in PETSc v3.5
20419b92b1d3SBarry Smith:::
20429b92b1d3SBarry Smith
20439b92b1d3SBarry Smith### How do I determine what update to PETSc broke my code?
20449b92b1d3SBarry Smith
20459b92b1d3SBarry SmithIf at some point (in PETSc code history) you had a working code - but the latest PETSc
20469b92b1d3SBarry Smithcode broke it, its possible to determine the PETSc code change that might have caused this
20479b92b1d3SBarry Smithbehavior. This is achieved by:
20489b92b1d3SBarry Smith
20499b92b1d3SBarry Smith- Using Git to access PETSc sources
20509b92b1d3SBarry Smith- Knowing the Git commit for the known working version of PETSc
20519b92b1d3SBarry Smith- Knowing the Git commit for the known broken version of PETSc
20529b92b1d3SBarry Smith- Using the [bisect](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html)
20539b92b1d3SBarry Smith  functionality of Git
20549b92b1d3SBarry Smith
20559b92b1d3SBarry SmithGit bisect can be done as follows:
20569b92b1d3SBarry Smith
20579b92b1d3SBarry Smith1. Get PETSc development (main branch in git) sources
20589b92b1d3SBarry Smith
20599b92b1d3SBarry Smith   ```console
20609b92b1d3SBarry Smith   $ git clone https://gitlab.com/petsc/petsc.git
20619b92b1d3SBarry Smith   ```
20629b92b1d3SBarry Smith
20639b92b1d3SBarry Smith2. Find the good and bad markers to start the bisection process. This can be done either
20649b92b1d3SBarry Smith   by checking `git log` or `gitk` or <https://gitlab.com/petsc/petsc> or the web
20659b92b1d3SBarry Smith   history of petsc-release clones. Lets say the known bad commit is 21af4baa815c and
20669b92b1d3SBarry Smith   known good commit is 5ae5ab319844.
20679b92b1d3SBarry Smith
20689b92b1d3SBarry Smith3. Start the bisection process with these known revisions. Build PETSc, and test your code
20699b92b1d3SBarry Smith   to confirm known good/bad behavior:
20709b92b1d3SBarry Smith
20719b92b1d3SBarry Smith   ```console
20729b92b1d3SBarry Smith   $ git bisect start 21af4baa815c 5ae5ab319844
20739b92b1d3SBarry Smith   ```
20749b92b1d3SBarry Smith
20759b92b1d3SBarry Smith   build/test, perhaps discover that this new state is bad
20769b92b1d3SBarry Smith
20779b92b1d3SBarry Smith   ```console
20789b92b1d3SBarry Smith   $ git bisect bad
20799b92b1d3SBarry Smith   ```
20809b92b1d3SBarry Smith
20819b92b1d3SBarry Smith   build/test, perhaps discover that this state is good
20829b92b1d3SBarry Smith
20839b92b1d3SBarry Smith   ```console
20849b92b1d3SBarry Smith   $ git bisect good
20859b92b1d3SBarry Smith   ```
20869b92b1d3SBarry Smith
20879b92b1d3SBarry Smith   Now until done - keep bisecting, building PETSc, and testing your code with it and
20889b92b1d3SBarry Smith   determine if the code is working or not. After something like 5-15 iterations, `git
20899b92b1d3SBarry Smith   bisect` will pin-point the exact code change that resulted in the difference in
20909b92b1d3SBarry Smith   application behavior.
20919b92b1d3SBarry Smith
20929b92b1d3SBarry Smith:::{tip}
20939b92b1d3SBarry SmithSee [git-bisect(1)](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html) and the
20949b92b1d3SBarry Smith[debugging section of the Git Book](https://git-scm.com/book/en/Git-Tools-Debugging-with-Git) for more debugging tips.
20959b92b1d3SBarry Smith:::
20969b92b1d3SBarry Smith
20979b92b1d3SBarry Smith### How to fix the error "PMIX Error: error in file gds_ds12_lock_pthread.c"?
20989b92b1d3SBarry Smith
20999b92b1d3SBarry SmithThis seems to be an error when using Open MPI and OpenBLAS with threads (or perhaps other
21009b92b1d3SBarry Smithpackages that use threads).
21019b92b1d3SBarry Smith
21029b92b1d3SBarry Smith```console
21039b92b1d3SBarry Smith$ export PMIX_MCA_gds=hash
21049b92b1d3SBarry Smith```
21059b92b1d3SBarry Smith
21069b92b1d3SBarry SmithShould resolve the problem.
21079b92b1d3SBarry Smith
21089b92b1d3SBarry Smith______________________________________________________________________
21099b92b1d3SBarry Smith
21109b92b1d3SBarry Smith(doc_faq_sharedlibs)=
21119b92b1d3SBarry Smith
21129b92b1d3SBarry Smith## Shared Libraries
21139b92b1d3SBarry Smith
21149b92b1d3SBarry Smith### Can I install PETSc libraries as shared libraries?
21159b92b1d3SBarry Smith
21169b92b1d3SBarry SmithYes. Use
21179b92b1d3SBarry Smith
21189b92b1d3SBarry Smith```console
21199b92b1d3SBarry Smith$ ./configure --with-shared-libraries
21209b92b1d3SBarry Smith```
21219b92b1d3SBarry Smith
21229b92b1d3SBarry Smith### Why should I use shared libraries?
21239b92b1d3SBarry Smith
21249b92b1d3SBarry SmithWhen you link to shared libraries, the function symbols from the shared libraries are not
21259b92b1d3SBarry Smithcopied in the executable. This way the size of the executable is considerably smaller than
21269b92b1d3SBarry Smithwhen using regular libraries. This helps in a couple of ways:
21279b92b1d3SBarry Smith
21289b92b1d3SBarry Smith- Saves disk space when more than one executable is created
21299b92b1d3SBarry Smith- Improves the compile time immensely, because the compiler has to write a much smaller
21309b92b1d3SBarry Smith  file (executable) to the disk.
21319b92b1d3SBarry Smith
21329b92b1d3SBarry Smith### How do I link to the PETSc shared libraries?
21339b92b1d3SBarry Smith
21349b92b1d3SBarry SmithBy default, the compiler should pick up the shared libraries instead of the regular
21359b92b1d3SBarry Smithones. Nothing special should be done for this.
21369b92b1d3SBarry Smith
21379b92b1d3SBarry Smith### What if I want to link to the regular `.a` library files?
21389b92b1d3SBarry Smith
21399b92b1d3SBarry SmithYou must run `configure` with the option `--with-shared-libraries=0` (you can use a
21409b92b1d3SBarry Smithdifferent `$PETSC_ARCH` for this build so you can easily switch between the two).
21419b92b1d3SBarry Smith
21429b92b1d3SBarry Smith### What do I do if I want to move my executable to a different machine?
21439b92b1d3SBarry Smith
21449b92b1d3SBarry SmithYou would also need to have access to the shared libraries on this new machine. The other
21459b92b1d3SBarry Smithalternative is to build the executable without shared libraries by first deleting the
21469b92b1d3SBarry Smithshared libraries, and then creating the executable.
21479b92b1d3SBarry Smith
21489b92b1d3SBarry Smith```{bibliography} /petsc.bib
21499b92b1d3SBarry Smith:filter: docname in docnames
21509b92b1d3SBarry Smith```
2151