(doc_faq)=

# FAQ

```{contents} Table Of Contents
:backlinks: top
:local: true
```

______________________________________________________________________

## General

### How can I subscribe to the PETSc mailing lists?

See the mailing list {ref}`documentation <doc_mail>`

### Any useful books on numerical computing?

[Bueler, PETSc for Partial Differential Equations: Numerical Solutions in C and Python](https://my.siam.org/Store/Product/viewproduct/?ProductId=32850137)

[Oliveira and Stewart, Writing Scientific Software: A Guide to Good Style](https://www.cambridge.org/core/books/writing-scientific-software/23206704175AF868E43FE3FB399C2F53)

(doc_faq_general_parallel)=

### What kind of parallel computers or clusters are needed to use PETSc? Or why do I get little speedup?

:::{important}
PETSc can be used with any kind of parallel system that supports MPI, but for any decent
performance one needs:

- A fast, **low-latency** interconnect; any Ethernet (even 10 GigE) simply cannot provide
  the needed performance.
- High per-core **memory** performance. Each core needs to have its **own** memory
  bandwidth of at least 2 gigabytes/second.
  Most modern computers are not bottlenecked by how fast they can perform
  calculations; rather, they are usually restricted by how quickly they can get their
  data.
:::

To obtain good performance it is important that you know your machine! That is, how many
compute nodes (generally, how many motherboards), how many memory sockets per node, how
many cores per memory socket, and how much memory bandwidth each has.

If you do not know this and can run MPI programs with mpiexec (that is, you do not have a
batch system), run the following from `$PETSC_DIR`:

```console
$ make streams [NPMAX=maximum_number_of_mpi_processes_you_plan_to_use]
```

This will provide a summary of the bandwidth with different numbers of MPI
processes and the potential speedups. See {any}`ch_streams` for a detailed discussion.

If you have a batch system:

```console
$ cd $PETSC_DIR/src/benchmarks/streams
$ make MPIVersion
# submit MPIVersion to the batch system a number of times with 1, 2, 3, etc. MPI
# processes, collecting all of the output from the runs into the single file
# scaling.log, then copy scaling.log into the src/benchmarks/streams directory
$ ./process.py createfile ; ./process.py
```

Even if you have enough memory bandwidth, if the OS switches processes between cores,
performance can degrade.
Smart process-to-core/socket binding (this just means locking a
process to a particular core or memory socket) may help you. For example, consider using
fewer processes than cores and binding processes to separate sockets so that each process
uses a different memory bus:

- [MPICH2 binding with the Hydra process manager](https://github.com/pmodels/mpich/blob/main/doc/wiki/how_to/Using_the_Hydra_Process_Manager.md#process-core-binding)

  ```console
  $ mpiexec.hydra -n 4 --binding cpu:sockets
  ```

- [Open MPI binding](https://www.open-mpi.org/faq/?category=tuning#using-paffinity)

  ```console
  $ mpiexec -n 4 --map-by socket --bind-to socket --report-bindings
  ```

- `taskset`, part of the [util-linux](https://github.com/karelzak/util-linux) package

  Check `man taskset` for details. Make sure to set affinity for **your** program,
  **not** for the `mpiexec` program.

- `numactl`

  In addition to task affinity, this tool also allows changing the default memory affinity
  policy. On Linux, the default policy is to attempt to find memory on the same memory bus
  that serves the core that a thread is running on when the memory is faulted
  (not when `malloc()` is called). If local memory is not available, it is found
  elsewhere, possibly leading to serious memory imbalances.

  The option `--localalloc` allocates memory on the local NUMA node, similar to the
  `numa_alloc_local()` function in the `libnuma` library. The option
  `--cpunodebind=nodes` binds the process to a given NUMA node (note that this can be
  larger or smaller than a CPU (socket); a NUMA node usually has multiple cores).

  The option `--physcpubind=cpus` binds the process to a given processor core (numbered
  according to `/proc/cpuinfo`, therefore including logical cores if Hyper-threading is
  enabled).

  With Open MPI, you can use knowledge of the NUMA hierarchy and core numbering on your
  machine to calculate the correct NUMA node or processor number given the environment
  variable `OMPI_COMM_WORLD_LOCAL_RANK`. In most cases, it is easier to make mpiexec or
  a resource manager set affinities.

The software [Open-MX](http://open-mx.gforge.inria.fr) may provide faster communication on
Ethernet systems. We have not tried it, but it claims to dramatically reduce latency
and increase bandwidth on Linux systems. You must first install this software and then
install MPICH or Open MPI to use it.

### What kind of license is PETSc released under?

See the licensing {ref}`documentation <doc_license>`

### Why is PETSc written in C, instead of Fortran or C++?

When this decision was made, in the early 1990s, C enabled us to build data structures
for storing sparse matrices, solver information,
etc. in ways that Fortran simply did not allow. ANSI C was a complete standard that all
modern C compilers supported, so the language was identical on all machines. C++ was still
evolving, and compilers on different machines were not identical. Using C function pointers
to provide data encapsulation and polymorphism allowed us to get many of the advantages of
C++ without using such a large and complicated language. It would have been natural and
reasonable to have coded PETSc in C++; we opted to use C instead.

### Does all the PETSc error checking and logging reduce PETSc's efficiency?

No.

(doc_faq_maintenance_strats)=

### How do such a small group of people manage to write and maintain such a large and marvelous package as PETSc?

1. **We work very efficiently**.

   - We use powerful editors and programming environments.
   - Our manual pages are generated automatically from formatted comments in the code,
     thus alleviating the need for creating and maintaining manual pages.
   - We employ continuous integration testing of the entire PETSc library on many different
     machine architectures. This process **significantly** protects (no bug-catching
     process is perfect) against inadvertently introducing bugs with new additions.
     Every new feature **must** pass our suite of thousands of tests, as well as formal
     code review, before it may be included.

2. **We are very careful in our design (and are constantly revising our design)**

   - PETSc as a package should be easy to use, write, and maintain. Our mantra is to write
     code like everyone is using it.

3. **We are willing to do the grunt work**

   - PETSc is regularly checked to make sure that all code conforms to our interface
     design. We will never keep a bad design decision simply because changing it would
     require a lot of editing; we do a lot of editing.

4. **We constantly seek out and experiment with new design ideas**

   - We retain the useful ones and discard the rest. All of these decisions are based not
     just on performance, but also on **practicality**.

5. **Function and variable names must adhere to strict guidelines**

   - Even the rules about capitalization are designed to make it easy to figure out the
     name of a particular object or routine. Our memories are terrible, so careful,
     consistent naming puts less stress on our limited human RAM.

6. **The PETSc directory tree is designed to make it easy to move throughout the
   entire package**

7. **We have a rich, robust, and fast bug reporting system**

   - <mailto:petsc-maint@mcs.anl.gov> is always checked, and we pride ourselves on responding
     quickly and accurately. Email is very lightweight, and our bug reporting system retains
     an archive of all reported problems and fixes, so it is easy to re-find fixes to
     previously discovered problems.

8. **We contain the complexity of PETSc by using powerful object-oriented programming
   techniques**

   - Data encapsulation serves to abstract complex data formats or movement to
     human-readable format. This is why your program cannot, for example, look directly
     at what is inside the object `Mat`.
   - Polymorphism makes changing program behavior as easy as possible, and further
     abstracts the *intent* of your program from what is *written* in code. You call
     `MatMult()` regardless of whether your matrix is dense, sparse, parallel or
     sequential; you don't call a different routine for each format.

9. **We try to provide the functionality requested by our users**

### For complex numbers will I get better performance with C++?

To use PETSc with complex numbers you may use the following `configure` options:
`--with-scalar-type=complex` and either `--with-clanguage=c++` or (the default)
`--with-clanguage=c`. In our experience they deliver very similar performance
(speed), but if you are concerned, just try both and see which is faster.
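
For example, a minimal `configure` invocation for complex scalars with the default C
language option might look like the following (a sketch only; a real build will typically
need additional options for compilers, MPI, and external packages):

```console
$ ./configure --with-scalar-type=complex --with-clanguage=c
```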

### How come when I run the same program on the same number of processes I get a "different" answer?

Inner products and norms in PETSc are computed using the `MPI_Allreduce()` command. For
different runs, the order in which values arrive at a given process (via MPI) can
differ, thus the order in which some floating point arithmetic operations are
performed will be different. Since floating point arithmetic is not
associative, the computed quantity may be slightly different.

Over a run the many slight differences in the inner products and norms will affect all the
computed results. It is important to realize that none of the computed answers are any
less right or wrong (in fact the sequential computation is no more right than the parallel
ones). All answers are equal, but some are more equal than others.

The discussion above assumes that the exact same algorithm is being used for the different
numbers of processes. When the algorithm is different for different numbers of processes
(almost all preconditioner algorithms except Jacobi are different for different numbers of
processes) then one expects to see (and does) a greater difference in results for
different numbers of processes. In some cases (for example the block Jacobi preconditioner) it
may be that the algorithm works for some numbers of processes and does not work for others.

### How come when I run the same linear solver on a different number of processes it takes a different number of iterations?

The convergence of many of the preconditioners in PETSc, including the default parallel
preconditioner block Jacobi, depends on the number of processes. The more processes, the
(slightly) slower the convergence. This is the nature of iterative solvers: more
parallelism means that more "older" information is used in the solution process, hence
slower convergence.

(doc_faq_gpuhowto)=

### Can PETSc use GPUs to speed up computations?

:::{seealso}
See the GPU development {ref}`roadmap <doc_gpu_roadmap>` for the latest information
regarding the state of PETSc GPU integration.

See the GPU install {ref}`documentation <doc_config_accel>` for up-to-date information on
installing PETSc to use GPUs.
:::

Quick summary of usage with CUDA:

- The `VecType` `VECSEQCUDA`, `VECMPICUDA`, or `VECCUDA` may be used with
  `VecSetType()` or `-vec_type seqcuda`, `mpicuda`, or `cuda` when
  `VecSetFromOptions()` is used.
- The `MatType` `MATSEQAIJCUSPARSE`, `MATMPIAIJCUSPARSE`, or `MATAIJCUSPARSE`
  may be used with `MatSetType()` or `-mat_type seqaijcusparse`, `mpiaijcusparse`, or
  `aijcusparse` when `MatSetFromOptions()` is used.
- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type
  cuda` and `-dm_mat_type aijcusparse`.

Quick summary of usage with OpenCL (provided by the ViennaCL library):

- The `VecType` `VECSEQVIENNACL`, `VECMPIVIENNACL`, or `VECVIENNACL` may be used
  with `VecSetType()` or `-vec_type seqviennacl`, `mpiviennacl`, or `viennacl`
  when `VecSetFromOptions()` is used.
- The `MatType` `MATSEQAIJVIENNACL`, `MATMPIAIJVIENNACL`, or `MATAIJVIENNACL`
  may be used with `MatSetType()` or `-mat_type seqaijviennacl`, `mpiaijviennacl`, or
  `aijviennacl` when `MatSetFromOptions()` is used.
- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type
  viennacl` and `-dm_mat_type aijviennacl`.

General hints:

- Since debugging on GPUs is difficult, it is useful to develop your code with the default
  vectors and then switch to the GPU via the command line options for production runs.
- All of the Krylov methods except `KSPIBCGS` run on the GPU.
- Parts of most preconditioners run directly on the GPU. After setup, `PCGAMG` runs
  fully on GPUs, without any memory copies between the CPU and GPU.

Some GPU systems (for example many laptops) only run with single precision; thus, PETSc
must be built with the `configure` option `--with-precision=single`.
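
As a concrete illustration of the command-line approach, a `DM`-based example such as
`src/snes/tutorials/ex19` can be switched to the GPU without any code changes (this
sketch assumes PETSc was configured with CUDA support and that your `mpiexec` matches the
MPI PETSc was built with):

```console
$ cd $PETSC_DIR/src/snes/tutorials
$ make ex19
$ mpiexec -n 2 ./ex19 -dm_vec_type cuda -dm_mat_type aijcusparse -snes_monitor
```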

(doc_faq_extendedprecision)=

### Can I run PETSc with extended precision?

Yes, with gcc and gfortran. `configure` PETSc using the
options `--with-precision=__float128` and `--download-f2cblaslapack`.

:::{admonition} Warning
:class: yellow

External packages are not guaranteed to work in this mode!
:::

### Why doesn't PETSc use QD to implement support for extended precision?

We tried really hard but could not. The problem is that the QD C++ classes, though they
try to emulate the built-in `double` type, are not native types and cannot
"just be used" in a general piece of numerical source code. Rather, the code has to be
rewritten to live within the limitations of the QD classes. However, PETSc can be built to
use quad precision, as detailed {ref}`here <doc_faq_extendedprecision>`.

### How do I cite PETSc?

Use {any}`these citations <doc_index_citing_petsc>`.

______________________________________________________________________

## Installation

### How do I begin using PETSc if the software has already been completely built and installed by someone else?

Assuming that the PETSc libraries have been successfully built for a particular
architecture and level of optimization, a new user must merely:

1. Set `PETSC_DIR` to the full path of the PETSc home
   directory. This will be the location of the `configure` script, and is usually called
   "petsc" or some variation of that (for example, /home/username/petsc).
2. Set `PETSC_ARCH`, which indicates the configuration on which PETSc will be
   used. Note that `$PETSC_ARCH` is simply a name the installer used when installing
   the libraries. There will exist a directory within `$PETSC_DIR` that is named after
   its corresponding `$PETSC_ARCH`. There may be several on a single system, for
   example "linux-c-debug" for the debug version compiled by a C compiler or
   "linux-c-opt" for the optimized version.

:::{admonition} Still Stuck?
See the {ref}`quick-start tutorial <tut_install>` for a step-by-step guide on
installing PETSc, in case you have missed a step.

See the users manual section on {ref}`getting started <sec_getting_started>`.
:::

### The PETSc distribution is SO Large. How can I reduce my disk space usage?

1. The PETSc users manual is provided in PDF format at `$PETSC_DIR/manual.pdf`. You
   can delete this.
2. The PETSc test suite contains sample output for many of the examples.
   These are
   contained in the PETSc directories `$PETSC_DIR/src/*/tutorials/output` and
   `$PETSC_DIR/src/*/tests/output`. Once you have run the test examples, you may remove
   all of these directories to save some disk space. You can locate the largest with,
   e.g., `find . -name output -type d | xargs du -sh | sort -hr` on a Unix-based system.
3. The debugging versions of the libraries are larger than the optimized versions. In a
   pinch you can work with the optimized version, although we bid you good luck
   finding bugs, as it is much easier with the debug version.

### I want to use PETSc only for uniprocessor programs. Must I still install and use a version of MPI?

No, run `configure` with the option `--with-mpi=0`

### Can I install PETSc to not use X windows (either under Unix or Microsoft Windows with GCC)?

Yes. Run `configure` with the additional flag `--with-x=0`

### Why do you use MPI?

MPI is the message-passing standard. Because it is a standard, it will not change
frequently over time; thus, we do not have to change PETSc every time the provider of the
message-passing system decides to make an interface change. MPI was carefully designed by
experts from industry, academia, and government labs to provide the highest quality
performance and capability.

For example, the careful design of communicators in MPI allows the easy nesting of
different libraries; no other message-passing system provides this support. All of the
major parallel computer vendors were involved in the design of MPI and have committed to
providing quality implementations.

In addition, since MPI is a standard, several different groups have already provided
complete, free implementations. Thus, one does not have to rely on the technical skills of
any one particular group to provide the message-passing libraries. Today, MPI is the only
practical, portable approach to writing efficient parallel numerical software.

(invalid_mpi_compilers)=

### What do I do if my MPI compiler wrappers are invalid?

Most MPI implementations provide compiler wrappers (such as `mpicc`) which give the
include and link options necessary to use that version of MPI with the underlying
compilers. Configuration will fail if these wrappers are either absent or broken in the
MPI pointed to by `--with-mpi-dir`. You can rerun `configure` with the additional option
`--with-mpi-compilers=0`, which will try to auto-detect working compilers; however,
these compilers may be incompatible with the particular MPI build.
If this fix does not
work, run with `--with-cc=[your_c_compiler]`, where you know `your_c_compiler` works
with this particular MPI, and likewise for C++ (`--with-cxx=[your_cxx_compiler]`) and
Fortran (`--with-fc=[your_fortran_compiler]`).

(bit_indices)=

### When should/can I use the `configure` option `--with-64-bit-indices`?

By default the type that PETSc uses to index into arrays and keep sizes of arrays is a
`PetscInt`, defined to be a 32-bit `int`. If your problem:

- Involves more than 2^31 - 1 unknowns (around 2 billion), or
- Has a matrix that might contain more than 2^31 - 1 nonzeros on a single process,

then you need to use this option. Otherwise you will get strange crashes.

This option can be used when you are using either 32-bit or 64-bit pointers. You do not
need to use this option if you are using 64-bit pointers unless one of the two conditions
above holds.

### What if I get an internal compiler error?

You can rebuild the offending file individually with a lower optimization level. **Then
make sure to complain to the compiler vendor and file a bug report**.
For example, if the
compiler chokes on `src/mat/impls/baij/seq/baijsolvtrannat.c` you can run the following
from `$PETSC_DIR`:

```console
$ make -f gmakefile PCC_FLAGS="-O1" $PETSC_ARCH/obj/src/mat/impls/baij/seq/baijsolvtrannat.o
$ make all
```

### How do I enable Python bindings (petsc4py) with PETSc?

1. Install [Cython](https://cython.org/).
2. `configure` PETSc with the `--with-petsc4py=1` option.
3. Set `PYTHONPATH=$PETSC_DIR/$PETSC_ARCH/lib`.

(macos_gfortran)=

### What Fortran compiler do you recommend on macOS?

We recommend using [homebrew](https://brew.sh/) to install [gfortran](https://gcc.gnu.org/wiki/GFortran); see {any}`doc_macos_install`

### How can I find the URL locations of the packages you install using `--download-PACKAGE`?

```console
$ grep "self.download " $PETSC_DIR/config/BuildSystem/config/packages/*.py
```

### How to fix the problem: PETSc was configured with one MPICH (or Open MPI) `mpi.h` version but now appears to be compiling using a different MPICH (or Open MPI) `mpi.h` version

This generally happens for one of two reasons:

- You previously ran `configure` with the option `--download-mpich` (or `--download-openmpi`)
  but later ran `configure` to use a version of MPI already installed on the
  machine. Solution:

  ```console
  $ rm -rf $PETSC_DIR/$PETSC_ARCH
  $ ./configure --your-args
  ```

(mpi_network_misconfigure)=

### What does it mean when `make check` hangs or errors on `PetscOptionsInsertFile()`?

For example:

```none
Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes
See https://petsc.org/release/faq/
[0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #3 PetscInitialize() line 828 in /Users/barrysmith/Src/PETSc/src/sys/objects/pinit.c
```

or

```none
$ make check
Running check examples to verify correct installation
Using PETSC_DIR=/Users/barrysmith/Src/petsc and PETSC_ARCH=arch-fix-mpiexec-hang-2-ranks
C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
PROGRAM SEEMS TO BE HANGING HERE
```

This usually occurs when network settings are misconfigured (perhaps due to a VPN),
resulting in a failure or hang in the system call `gethostbyname()`.

- Verify you are using the correct `mpiexec` for the MPI you have linked PETSc with.

- If you have a VPN enabled on your machine, try turning it off and then running `make check` to
  verify that it is not the VPN playing poorly with MPI.

- If ``ping `hostname` `` (`/sbin/ping` on macOS) fails or hangs, run:

  ```none
  echo 127.0.0.1 `hostname` | sudo tee -a /etc/hosts
  ```

  and try `make check` again.

- Try completely disconnecting your machine from the network and see if `make check` then works.

- Try the PETSc `configure` option `--download-mpich-device=ch3:nemesis` with `--download-mpich`.

______________________________________________________________________

## Usage

### How can I redirect PETSc's `stdout` and `stderr` when programming with a GUI interface in Windows Developer Studio or to C++ streams?

To overload just the error messages, write your own `MyPrintError()` function that does
whatever you want (including pop-up windows etc.) and use it as below.

```c
extern "C" {
  int PASCAL WinMain(HINSTANCE, HINSTANCE, LPSTR, int);
};

#include <petscsys.h>
#include <mpi.h>

const char help[] = "Set up from main";

int MyPrintError(const char error[], ...)
{
  printf("%s", error);
  return 0;
}

int main(int ac, char *av[])
{
  char      buf[256];
  HINSTANCE inst;

  inst = (HINSTANCE)GetModuleHandle(NULL);
  PetscErrorPrintf = MyPrintError;

  buf[0] = 0;
  for (int i = 1; i < ac; ++i) {
    strcat(buf, av[i]);
    strcat(buf, " ");
  }

  PetscCall(PetscInitialize(&ac, &av, NULL, help));

  return WinMain(inst, NULL, buf, SW_SHOWNORMAL);
}
```

Place this file in the project and compile with these preprocessor definitions:

```
WIN32
_DEBUG
_CONSOLE
_MBCS
USE_PETSC_LOG
USE_PETSC_BOPT_g
USE_PETSC_STACK
_AFXDLL
```

And these link options:

```
/nologo
/subsystem:console
/incremental:yes
/debug
/machine:I386
/nodefaultlib:"libcmtd.lib"
/nodefaultlib:"libcd.lib"
/nodefaultlib:"mvcrt.lib"
/pdbtype:sept
```

:::{note}
The above is compiled and linked as if it were a console program. The linker will search
for a `main`, and from there `WinMain` will start. This works with MFC templates and
derived classes too.

When writing a Windows console application you do not need to do anything; `stdout`
and `stderr` are automatically output to the console window.
:::

To change where all PETSc `stdout` and `stderr` output goes (you can also reassign
`PetscVFPrintf()` to handle `stdout` and `stderr` any way you like), write the
following function:

```
PetscErrorCode mypetscvfprintf(FILE *fd, const char format[], va_list Argp)
{
  PetscFunctionBegin;
  if (fd != stdout && fd != stderr) { /* handle regular files */
    PetscCall(PetscVFPrintfDefault(fd, format, Argp));
  } else {
    char buff[1024]; /* Make sure to assign a large enough buffer */
    int  length;

    PetscCall(PetscVSNPrintf(buff, 1024, format, &length, Argp));

    /* now send buff to whatever stream or destination you want */
  }
  PetscFunctionReturn(PETSC_SUCCESS);
}
```

Then assign `PetscVFPrintf = mypetscvfprintf` before `PetscInitialize()` in your main
program.

### I want to use Hypre boomerAMG without GMRES but when I run `-pc_type hypre -pc_hypre_type boomeramg -ksp_type preonly` I don't get a very accurate answer!

You should run with `-ksp_type richardson` to have PETSc run several V or W
cycles. `-ksp_type preonly` causes boomerAMG to use only one V/W cycle. You can control
how many cycles are used in a single application of the boomerAMG preconditioner with
`-pc_hypre_boomeramg_max_iter <it>` (the default is 1). You can also control the
tolerance boomerAMG uses to decide whether to stop before `max_iter` with
`-pc_hypre_boomeramg_tol <tol>` (the default is 1.e-7). Run with `-ksp_view` to see
all the hypre options used and `-help | grep boomeramg` to see all the command line
options.

### How do I use PETSc for Domain Decomposition?

PETSc includes Additive Schwarz methods in the suite of preconditioners under the umbrella
of `PCASM`. These may be activated with the runtime option `-pc_type asm`. Various
other options may be set, including the degree of overlap (`-pc_asm_overlap <number>`),
the type of restriction/extension (`-pc_asm_type [basic,restrict,interpolate,none]`),
and several others. You may see the available ASM options by using `-pc_type asm
-help`. See the procedural interfaces in the manual pages, for example `PCASMSetType()`,
and check the index of the users manual for `PCASMCreateSubdomains()`.

PETSc also contains a domain-decomposition-inspired wirebasket or face-based two-level
method where the coarse mesh to fine mesh interpolation is defined by solving specific
local subdomain problems. It currently only works for 3D scalar problems on structured
grids created with the PETSc `DMDA`. See the manual page for `PCEXOTIC` and
`src/ksp/ksp/tutorials/ex45.c` for an example.

PETSc also contains a balancing Neumann-Neumann type preconditioner; see the manual page
for `PCBDDC`. This requires that matrices be constructed with `MatCreateIS()` via the finite
element method. See `src/ksp/ksp/tests/ex59.c` for an example.

### You have `MATAIJ` and `MATBAIJ` matrix formats, and `MATSBAIJ` for symmetric storage, how come no `MATSAIJ`?

Just for historical reasons; the `MATSBAIJ` format with blocksize one is just as efficient as
a `MATSAIJ` would be.

### Can I Create BAIJ matrices with different size blocks for different block rows?

No. The `MATBAIJ` format only supports a single fixed block size for the entire
matrix. But the `MATAIJ` format automatically searches for matching rows and thus still
takes advantage of the natural blocks in your matrix to obtain good performance.

:::{note}
If you use `MATAIJ` you cannot use `MatSetValuesBlocked()`.
:::

### How do I access the values of a remote parallel PETSc Vec?

1. On each process create a local `Vec` large enough to hold all the values it wishes to
   access.
2. Create a `VecScatter` that scatters from the parallel `Vec` into the local `Vec`.
3. Use `VecGetArray()` to access the values in the local `Vec`.

For example, assume we have distributed a vector `vecGlobal` of size $N$ over
$R$ ranks so that each rank holds $N/R = m$ values (assume for simplicity that
$N$ is evenly divisible by $R$). We want each rank $r$ to gather the
first $n$ (with $n \leq m$) values from its immediately superior neighbor
$r+1$ (the final rank will retrieve from rank 0).

```
Vec         vecLocal;
IS          isLocal, isGlobal;
VecScatter  ctx;
PetscScalar *arr;
PetscInt    N, firstGlobalIndex;
MPI_Comm    comm;
PetscMPIInt r, R;

/* Create sequential local vector, big enough to hold the local portion */
PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, &vecLocal));

/* Create IS to describe where we want to scatter to */
PetscCall(ISCreateStride(PETSC_COMM_SELF, n, 0, 1, &isLocal));

/* Compute the global indices */
PetscCall(VecGetSize(vecGlobal, &N));
PetscCall(PetscObjectGetComm((PetscObject)vecGlobal, &comm));
PetscCallMPI(MPI_Comm_rank(comm, &r));
PetscCallMPI(MPI_Comm_size(comm, &R));
firstGlobalIndex = r == R - 1 ?
0 : (N/R)*(r+1);

/* Create IS that describes where we want to scatter from */
PetscCall(ISCreateStride(comm, n, firstGlobalIndex, 1, &isGlobal));

/* Create the VecScatter context */
PetscCall(VecScatterCreate(vecGlobal, isGlobal, vecLocal, isLocal, &ctx));

/* Gather the values */
PetscCall(VecScatterBegin(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD));
PetscCall(VecScatterEnd(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD));

/* Retrieve and do work */
PetscCall(VecGetArray(vecLocal, &arr));
/* Work */
PetscCall(VecRestoreArray(vecLocal, &arr));

/* Don't forget to clean up */
PetscCall(ISDestroy(&isLocal));
PetscCall(ISDestroy(&isGlobal));
PetscCall(VecScatterDestroy(&ctx));
PetscCall(VecDestroy(&vecLocal));
```

(doc_faq_usage_alltoone)=

### How do I collect to a single processor all the values from a parallel PETSc Vec?

1. Create the `VecScatter` context that will do the communication:

   ```
   Vec        in_par, out_seq;
   VecScatter ctx;

   PetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq));
   ```

2.
Initiate the communication (this may be repeated if you wish):

   ```
   PetscCall(VecScatterBegin(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD));
   PetscCall(VecScatterEnd(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD));
   /* May destroy the context now if no additional scatters are needed, otherwise reuse it */
   PetscCall(VecScatterDestroy(&ctx));
   ```

Note that this simply concatenates values in the parallel ordering of the vector (determined by the
`MPI_Comm_rank` of the owning process). If you are using a `Vec` from
`DMCreateGlobalVector()` you likely want to first call `DMDAGlobalToNaturalBegin()`
followed by `DMDAGlobalToNaturalEnd()` to scatter the original `Vec` into the natural
ordering in a new global `Vec` before calling `VecScatterBegin()`/`VecScatterEnd()`
to scatter the natural `Vec` onto all processes.

### How do I collect all the values from a parallel PETSc Vec on the 0th rank?

See the FAQ entry on collecting to {ref}`an arbitrary processor <doc_faq_usage_alltoone>`, but
replace

```
PetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq));
```

with

```
PetscCall(VecScatterCreateToZero(in_par, &ctx, &out_seq));
```

:::{note}
The same ordering considerations as discussed in the aforementioned entry also apply
here.
:::

### How can I read in or write out a sparse matrix in Matrix Market, Harwell-Boeing, Slapc or other ASCII format?

If you can read or write your matrix using Python or MATLAB/Octave, `PetscBinaryIO`
modules are provided at `$PETSC_DIR/lib/petsc/bin` for each language that can assist
with reading and writing. If you just want to convert a `MatrixMarket` file, you can use:

```console
$ python -m $PETSC_DIR/lib/petsc/bin/PetscBinaryIO convert matrix.mtx
```

to produce `matrix.petsc`.

You can also call the script directly or import it from your Python code. There is also a
`PETScBinaryIO.jl` Julia package.

For other formats, either adapt one of the above libraries or see the examples in
`$PETSC_DIR/src/mat/tests`, specifically `ex72.c` or `ex78.c`.
You will likely need
to modify the code slightly to match your required ASCII format.

:::{note}
Never read or write an ASCII matrix file in parallel.

Instead read it in sequentially with a standalone code based on `ex72.c` or `ex78.c`,
then save the matrix with the binary viewer `PetscViewerBinaryOpen()` and load the
matrix in parallel in your "real" PETSc program with `MatLoad()`.

For writing, save with the binary viewer and then load with the sequential code to store
it as ASCII.
:::

### Does TSSetFromOptions(), SNESSetFromOptions(), or KSPSetFromOptions() reset all the parameters I previously set, or why do they not seem to work?

If `XXSetFromOptions()` is used (with `-xxx_type aaaa`) to change the type of the
object then all parameters associated with the previous type are removed. Otherwise it
does not reset parameters.

`TS/SNES/KSPSetXXX()` commands that set properties for a particular type of object (such
as `KSPGMRESSetRestart()`) ONLY work if the object is ALREADY of that type. For example,
with

```
KSP ksp;

PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
PetscCall(KSPGMRESSetRestart(ksp, 10));
```

the restart will be ignored since the type has not yet been set to `KSPGMRES`.
To have
those values take effect you should do one of the following:

- Allow setting the type from the command line; if it is not given on the command line then the
  default type is automatically set.

  ```
  /* Create generic object */
  XXXCreate(.., &obj);
  /* Must set all settings here, or use the defaults */
  XXXSetFromOptions(obj);
  ```

- Hardwire the type in the code, but allow the user to override it via a subsequent
  `XXXSetFromOptions()` call. This essentially allows the user to customize what the
  "default" type of the object is.

  ```
  /* Create generic object */
  XXXCreate(.., &obj);
  /* Set type directly */
  XXXSetYYYYY(obj, ...);
  /* Can always change to a different type */
  XXXSetFromOptions(obj);
  ```

### How do I compile and link my own PETSc application codes and can I use my own `makefile` or rules for compiling code, rather than PETSc's?

See the {ref}`section <sec_writing_application_codes>` of the users manual on writing
application codes with PETSc.

### Can I use CMake to build my own project that depends on PETSc?

See the {ref}`section <sec_writing_application_codes>` of the users manual on writing
application codes with PETSc.
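As a rough sketch, one common approach is to locate PETSc through pkg-config (this assumes your install provides a `PETSc.pc` file, typically under `$PETSC_DIR/$PETSC_ARCH/lib/pkgconfig`; adjust names and paths to your installation):

```cmake
cmake_minimum_required(VERSION 3.16)
project(myapp C)

# Point PKG_CONFIG_PATH at $PETSC_DIR/$PETSC_ARCH/lib/pkgconfig before
# running cmake so pkg-config can find PETSc.pc.
find_package(PkgConfig REQUIRED)
pkg_check_modules(PETSC REQUIRED IMPORTED_TARGET PETSc)

add_executable(myapp main.c)
target_link_libraries(myapp PkgConfig::PETSc)
```

The `IMPORTED_TARGET` keyword makes CMake expose the include paths and libraries from the `.pc` file as the single target `PkgConfig::PETSc`.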

### How can I put carriage returns in `PetscPrintf()` statements from Fortran?

You can use the same notation as in C: just put a `\n` in the string. Note that no other C
format instruction is supported.

Or you can use the Fortran concatenation operator `//` and `char(10)`; for example,
`'some string'//char(10)//'another string'` puts the second string on the next line.

### How can I implement callbacks using C++ class methods?

Declare the class method static. Static methods do not have a `this` pointer, but the
`void*` context parameter will usually be cast to a pointer to the class where it can
serve the same function.

:::{admonition} Remember
All PETSc callbacks return `PetscErrorCode`.
:::

### Everyone knows that when you code Newton's Method you should compute the function and its Jacobian at the same time. How can one do this in PETSc?

The update in Newton's method is computed as

$$
u^{n+1} = u^n - \lambda \left[ J(u^n) \right]^{-1} F(u^n)
$$

The reason PETSc doesn't default to computing both the function and Jacobian at the same
time is:

- In order to do the line search, $F \left(u^n - \lambda \, \text{step} \right)$ may
  need to be computed for several $\lambda$.
The Jacobian is not needed for each of
  those, and one does not know in advance which will be the final $\lambda$ until
  after the function value is computed, so many extra Jacobians may be computed.
- In the final step, if $|| F(u^p)||$ satisfies the convergence criteria then a
  Jacobian need not be computed.

You are free to have your `FormFunction()` compute as much of the Jacobian at that point
as you like, keep the information in the user context (the final argument to
`FormFunction()` and `FormJacobian()`), and then retrieve the information in your
`FormJacobian()` function.

### Computing the Jacobian or preconditioner is time consuming. Is there any way to compute it less often?

PETSc has a variety of ways of lagging the computation of the Jacobian or the
preconditioner. They are documented in the manual page for `SNESComputeJacobian()`
and in the {ref}`users manual <ch_snes>`:

- `-snes_lag_jacobian` (`SNESSetLagJacobian()`): how often the Jacobian is rebuilt (use -1 to
  never rebuild, use -2 to rebuild the next time requested and then never again).
- `-snes_lag_jacobian_persists` (`SNESSetLagJacobianPersists()`): forces lagging of the Jacobian
  through multiple `SNES` solves, the same as passing -2 to `-snes_lag_jacobian`. By
  default, each new `SNES` solve triggers a recomputation of the Jacobian.
- `-snes_lag_preconditioner` (`SNESSetLagPreconditioner()`): how often the preconditioner is
  rebuilt. Note: if you are lagging the Jacobian, the system will know that the matrix
  has not changed and will not recompute the (same) preconditioner.
- `-snes_lag_preconditioner_persists` (`SNESSetLagPreconditionerPersists()`): the
  preconditioner lags through multiple `SNES` solves.

:::{note}
These are often (but need not be) used in combination with
`-snes_mf_operator`, which applies the fresh Jacobian matrix-free for every
matrix-vector product. Otherwise the out-of-date matrix-vector product, computed with
the lagged Jacobian, will be used.
:::

By using `KSPMonitorSet()` and/or `SNESMonitorSet()` one can provide code that monitors the
convergence rate and automatically triggers an update of the Jacobian or preconditioner
based on decreasing convergence of the iterative method. For example, if the number of `SNES`
iterations doubles one might trigger a new computation of the Jacobian. Experimentation is
the only general-purpose way to determine which approach is best for your problem.

:::{important}
It is also vital to experiment on your true problem at the scale at which you will be
solving it, since the performance benefits depend on the exact problem and the problem
size!
:::

### How can I use Newton's Method Jacobian free? Can I difference a different function than provided with `SNESSetFunction()`?

The simplest way is with the option `-snes_mf`; this will use finite differencing of the
function provided to `SNESComputeFunction()` to approximate the action of the Jacobian.

:::{important}
Since no matrix representation of the Jacobian is provided, the `-pc_type` used with
this option must be `-pc_type none`. You may provide a custom preconditioner with
`SNESGetKSP()`, `KSPGetPC()`, and `PCSetType()` and use `PCSHELL`.
:::

The option `-snes_mf_operator` will apply the Jacobian matrix-free (in the
Krylov solvers) but will use whatever matrix you provided with `SNESSetJacobian()`
(assuming you set one) to compute the preconditioner.

To write the code (rather than use the options above), use `MatCreateSNESMF()` and pass
the resulting matrix object to `SNESSetJacobian()`.

For purely matrix-free operation (like `-snes_mf`), pass the matrix object for both matrix
arguments and pass the function `MatMFFDComputeJacobian()`.

To provide your own approximate Jacobian matrix to compute the preconditioner (like
`-snes_mf_operator`), pass this other matrix as the second matrix argument to
`SNESSetJacobian()`. Make sure your provided `computejacobian()` function calls
`MatAssemblyBegin()` and `MatAssemblyEnd()` separately on **BOTH** matrix arguments
to this function. See `src/snes/tests/ex7.c`.

To difference a different function than that passed to `SNESSetFunction()` to compute the
matrix-free Jacobian multiply, call `MatMFFDSetFunction()` to set that other function. See
`src/snes/tests/ex7.c`.

(doc_faq_usage_condnum)=

### How can I determine the condition number of a matrix?

For small matrices, the condition number can be reliably computed using

```text
-pc_type svd -pc_svd_monitor
```

For larger matrices, you can run with

```text
-pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000
```

to get approximations to the condition number of the operator. This will generally be
accurate for the largest singular values, but may overestimate the smallest singular value
unless the method has converged. Make sure to avoid restarts. To estimate the condition
number of the preconditioned operator, use `-pc_type somepc` in the last command.

You can use [SLEPc](https://slepc.upv.es) for highly scalable, efficient, and high-quality eigenvalue computations.

### How can I compute the inverse of a matrix in PETSc?

:::{admonition} Are you sure?
:class: yellow

It is very expensive to compute the inverse of a matrix and very rarely needed in
practice. We highly recommend avoiding algorithms that need it.
:::

The inverse of a matrix (dense or sparse) is essentially always dense, so begin by
creating a dense matrix B and filling it with the identity matrix (ones along the diagonal);
also create a dense matrix X of the same size that will hold the solution. Then factor the
matrix you wish to invert with `MatLUFactor()` or `MatCholeskyFactor()`; call the
result A. Then call `MatMatSolve(A,B,X)` to compute the inverse into X. See also the section
on {any}`Schur complements <how_can_i_compute_the_schur_complement>`.

(how_can_i_compute_the_schur_complement)=

### How can I compute the Schur complement in PETSc?

:::{admonition} Are you sure?
:class: yellow

It is very expensive to compute the Schur complement of a matrix and very rarely needed
in practice. We highly recommend avoiding algorithms that need it.
:::

The Schur complement of the matrix $M \in \mathbb{R}^{\left(p+q \right) \times
\left(p + q \right)}$

$$
M = \begin{bmatrix}
A & B \\
C & D
\end{bmatrix}
$$

where

$$
A \in \mathbb{R}^{p \times p}, \quad B \in \mathbb{R}^{p \times q}, \quad C \in \mathbb{R}^{q \times p}, \quad D \in \mathbb{R}^{q \times q}
$$

is given by

$$
S_D := A - BD^{-1}C \\
S_A := D - CA^{-1}B
$$

Like the inverse, the Schur complement of a matrix (dense or sparse) is essentially always
dense, so assuming you wish to calculate $S_A = D - C \underbrace{
\overbrace{(A^{-1})}^{U} B}_{V}$ begin by:

1. Forming a dense matrix $B$.
2. Also creating another dense matrix $V$ of the same size.
3. Then either factoring the matrix $A$ directly with `MatLUFactor()` or
   `MatCholeskyFactor()`, or using `MatGetFactor()` followed by
   `MatLUFactorSymbolic()` followed by `MatLUFactorNumeric()` if you wish to use an
   external solver package like SuperLU_Dist. Call the result $U$.
4. Then calling `MatMatSolve(U,B,V)`.
5. Then calling `MatMatMult(C,V,MAT_INITIAL_MATRIX,1.0,&S)`.
6.
Now call `MatAXPY(S,-1.0,D,MAT_SUBSET_NONZERO)`. 10399b92b1d3SBarry Smith7. Followed by `MatScale(S,-1.0)`. 10409b92b1d3SBarry Smith 10419b92b1d3SBarry SmithFor computing Schur complements like this it does not make sense to use the `KSP` 10429b92b1d3SBarry Smithiterative solvers since for solving many moderate size problems using a direct 10439b92b1d3SBarry Smithfactorization is much faster than iterative solvers. As you can see, this requires a great 10449b92b1d3SBarry Smithdeal of work space and computation so is best avoided. 10459b92b1d3SBarry Smith 10469b92b1d3SBarry SmithHowever, it is not necessary to assemble the Schur complement $S$ in order to solve 10479b92b1d3SBarry Smithsystems with it. Use `MatCreateSchurComplement(A,A_pre,B,C,D,&S)` to create a 10489b92b1d3SBarry Smithmatrix that applies the action of $S$ (using `A_pre` to solve with `A`), but 10499b92b1d3SBarry Smithdoes not assemble. 10509b92b1d3SBarry Smith 10519b92b1d3SBarry SmithAlternatively, if you already have a block matrix `M = [A, B; C, D]` (in some 10529b92b1d3SBarry Smithordering), then you can create index sets (`IS`) `isa` and `isb` to address each 10539b92b1d3SBarry Smithblock, then use `MatGetSchurComplement()` to create the Schur complement and/or an 10549b92b1d3SBarry Smithapproximation suitable for preconditioning. 10559b92b1d3SBarry Smith 10569b92b1d3SBarry SmithSince $S$ is generally dense, standard preconditioning methods cannot typically be 10579b92b1d3SBarry Smithapplied directly to Schur complements. 
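
Putting the explicit-assembly steps above together, a C sketch (assuming `A`, `B`, `C`, and `D` already exist, with `B` dense, as in the numbered list) might look like:

```
Mat           V, S;
IS            rowperm, colperm;
MatFactorInfo info;

PetscCall(MatFactorInfoInitialize(&info));
PetscCall(MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, &V));  /* dense workspace, same size as B */
PetscCall(MatGetOrdering(A, MATORDERINGNATURAL, &rowperm, &colperm));
PetscCall(MatLUFactor(A, rowperm, colperm, &info));      /* A now holds the factors (the "U" above) */
PetscCall(MatMatSolve(A, B, V));                         /* V = A^{-1} B */
PetscCall(MatMatMult(C, V, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &S)); /* S = C A^{-1} B */
PetscCall(MatAXPY(S, -1.0, D, SUBSET_NONZERO_PATTERN)); /* S = C A^{-1} B - D */
PetscCall(MatScale(S, -1.0));                           /* S = D - C A^{-1} B = S_A */
```
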
There are many approaches to preconditioning Schur
complements, including using the `SIMPLE` approximation

$$
D - C \text{diag}(A)^{-1} B
$$

to create a sparse matrix that approximates the Schur complement (this is returned by
default for the optional "preconditioning" matrix in `MatGetSchurComplement()`).

An alternative is to interpret the matrices as differential operators and apply
approximate commutator arguments to find a spectrally equivalent operation that can be
applied efficiently (see the "PCD" preconditioners {cite}`elman_silvester_wathen_2014`). A
variant of this is the least squares commutator, which is closely related to the
Moore-Penrose pseudoinverse, and is available in `PCLSC`, which operates on matrices of
type `MATSCHURCOMPLEMENT`.

### Do you have examples of doing unstructured grid Finite Element Method (FEM) with PETSc?

There are at least three ways to write finite element codes using PETSc:

1. Use `DMPLEX`, which is a high level approach to manage your mesh and
   discretization. See the {ref}`tutorials sections <tut_stokes>` for further information,
   or see `src/snes/tutorials/ex62.c`.
2. Use packages such as [deal.ii](https://www.dealii.org), [libMesh](https://libmesh.github.io), or
   [Firedrake](https://www.firedrakeproject.org), which use PETSc for the solvers.
3. Manage the grid data structure yourself and use PETSc `PetscSF`, `IS`, and `VecScatter` to
   perform the required ghost point communication. See
   `src/snes/tutorials/ex10d/ex10.c`.

### DMDA decomposes the domain differently than the MPI_Cart_create() command. How can one use them together?

The `MPI_Cart_create()` first divides the mesh along the z direction, then the y, then
the x. `DMDA` divides along the x, then y, then z. Thus, for example, rank 1 of the
processes will be in a different part of the mesh for the two schemes. To resolve this you
can create a new MPI communicator that you pass to `DMDACreate()` that renumbers the
process ranks so that each physical process shares the same part of the mesh with both the
`DMDA` and the `MPI_Cart_create()` orderings.
The code to determine the new numbering was
provided by Rolf Kuiper:

```
// the numbers of processors per direction are (int) x_procs, y_procs, z_procs respectively
// (no parallelization in direction 'dir' means dir_procs = 1)

MPI_Comm    NewComm;
int         x, y, z;
PetscMPIInt MPI_rank, NewRank;

// get rank from MPI ordering:
PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &MPI_rank));

// calculate coordinates of cpus in MPI ordering:
x = MPI_rank / (z_procs*y_procs);
y = (MPI_rank % (z_procs*y_procs)) / z_procs;
z = (MPI_rank % (z_procs*y_procs)) % z_procs;

// set new rank according to PETSc ordering:
NewRank = z*y_procs*x_procs + y*x_procs + x;

// create communicator with new ranks according to PETSc ordering
PetscCallMPI(MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &NewComm));

// override the default communicator (was MPI_COMM_WORLD as default)
PETSC_COMM_WORLD = NewComm;
```

### When solving a system with Dirichlet boundary conditions I can use MatZeroRows() to eliminate the Dirichlet rows but this results in a non-symmetric system. How can I apply Dirichlet boundary conditions but keep the matrix symmetric?

- For nonsymmetric systems put the appropriate boundary solutions in the x vector and use
  `MatZeroRows()` followed by `KSPSetOperators()`.
- For symmetric problems use `MatZeroRowsColumns()`.
- If you have many Dirichlet locations you can use `MatZeroRows()` (**not**
  `MatZeroRowsColumns()`) and `-ksp_type preonly -pc_type redistribute` (see
  `PCREDISTRIBUTE`) and PETSc will repartition the parallel matrix for load
  balancing. In this case the new matrix solved remains symmetric even though
  `MatZeroRows()` is used.

An alternative approach is, when assembling the matrix (generating values and passing
them to the matrix), to never include locations for the Dirichlet grid points in the
vector and matrix, instead taking them into account as you put the other values into the
load (right-hand side) vector.

### How can I get PETSc vectors and matrices to MATLAB or vice versa?

There are numerous ways to work with PETSc and MATLAB. All but the first approach
require PETSc to be configured with `--with-matlab`.

1. To save PETSc `Mat` and `Vec` to files that can be read from MATLAB use the
   `PetscViewerBinaryOpen()` viewer and `VecView()` or `MatView()` to save objects
   for MATLAB and `VecLoad()` and `MatLoad()` to get the objects that MATLAB has
   saved. See `share/petsc/matlab/PetscBinaryRead.m` and
   `share/petsc/matlab/PetscBinaryWrite.m` for loading and saving the objects in MATLAB.
2. Use the [MATLAB Engine](https://www.mathworks.com/help/matlab/calling-matlab-engine-from-c-programs-1.html),
   which allows PETSc to automatically call MATLAB to perform some specific computations. This
   does not allow MATLAB to be used interactively by the user. See
   `PetscMatlabEngine`.
3. You can open a socket connection between MATLAB and PETSc to allow sending objects back
   and forth between an interactive MATLAB session and a running PETSc program. See
   `PetscViewerSocketOpen()` for access from the PETSc side and
   `share/petsc/matlab/PetscReadBinary.m` for access from the MATLAB side.
4. You can save PETSc `Vec` (**not** `Mat`) with the `PetscViewerMatlabOpen()`
   viewer, which saves `.mat` files that can then be loaded into MATLAB using the `load()` command.

### How do I get started with Cython so that I can extend petsc4py?

1. Learn how to [build a Cython module](http://docs.cython.org/src/quickstart/build.html).
2. Go through the simple [example](https://stackoverflow.com/questions/3046305/simple-wrapping-of-c-code-with-cython). Note
   also the next comment that shows how to create numpy arrays in Cython and pass them
   back.
3. Check out [this page](http://docs.cython.org/src/tutorial/numpy.html), which tells
   you how to get fast indexing.
4. Have a look at the petsc4py [array source](http://code.google.com/p/petsc4py/source/browse/src/PETSc/arraynpy.pxi).
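
For step 1, a minimal build script might look like the following sketch; the module name `mymod` is hypothetical, and Cython and numpy must be installed:

```python
# setup.py -- minimal Cython build sketch; "mymod.pyx" is a hypothetical module name
from setuptools import setup
from Cython.Build import cythonize
import numpy

setup(
    name="mymod",
    ext_modules=cythonize("mymod.pyx"),
    include_dirs=[numpy.get_include()],  # needed when the module uses numpy arrays
)
```

Build in place with `python setup.py build_ext --inplace`.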

### How do I compute a custom norm for KSP to use as a convergence test or for monitoring?

You need to call `KSPBuildResidual()` on your `KSP` object and then compute the
appropriate norm on the resulting residual. Note that depending on the
`KSPSetNormType()` of the method you may not return the same norm as provided by the
method. See also `KSPSetPCSide()`.

### If I have a sequential program can I use a PETSc parallel solver?

:::{important}
Do not expect to get great speedups! Much of the speedup gained by using parallel
solvers comes from building the underlying matrices and vectors in parallel to begin
with. You should see some reduction in the time for the linear solvers.
:::

Yes, you must set up PETSc with MPI (even though you will not use MPI) with at least the
following options:

```console
$ ./configure --download-superlu_dist --download-parmetis --download-metis --with-openmp
```

Your compiler must support OpenMP. To have the linear solver run in parallel run your
program with

```console
$ OMP_NUM_THREADS=n ./myprog -pc_type lu -pc_factor_mat_solver_type superlu_dist
```

where `n` is the number of threads; it should be less than or equal to the number of cores
available.

:::{note}
If your code is MPI parallel you can also use these same options to have SuperLU_dist
utilize multiple threads per MPI process for the direct solver. Make sure that the
`$OMP_NUM_THREADS` you use per MPI process is less than or equal to the number of
cores available for each MPI process. For example, if your compute nodes have 6 cores
and you use 2 MPI processes per node then set `$OMP_NUM_THREADS` to 2 or 3.
:::

Another approach that allows using a PETSc parallel solver is to use `PCMPI`.

### TS or SNES produces infeasible (out of domain) solutions or states. How can I prevent this?

For `TS` call `TSSetFunctionDomainError()`. For both `TS` and `SNES` call
`SNESSetFunctionDomainError()` when the solver passes an infeasible (out of domain)
solution or state to your routines.

If it occurs for DAEs, it is important to ensure the algebraic constraints are well
satisfied, which can prevent "breakdown" later. Thus, one can try using a tight tolerance
for `SNES`, using a direct linear solver (`PCType` of `PCLU`) when possible, and reducing the timestep (or
tightening `TS` tolerances for adaptive time stepping).

### Can PETSc work with Hermitian matrices?

PETSc's support of Hermitian matrices is limited.
Many operations and solvers work
for symmetric (`MATSBAIJ`) matrices and operations on transpose matrices, but there is
little direct support for Hermitian matrices and Hermitian transpose (complex conjugate
transpose) operations. There is `KSPSolveTranspose()` for solving the transpose of a
linear system but no `KSPSolveHermitian()`.

For creating known Hermitian matrices:

- `MatCreateNormalHermitian()`
- `MatCreateHermitianTranspose()`

For determining or setting Hermitian status on existing matrices:

- `MatIsHermitian()`
- `MatIsHermitianKnown()`
- `MatIsStructurallySymmetric()`
- `MatIsSymmetricKnown()`
- `MatIsSymmetric()`
- `MatSetOption()` (use with `MAT_SYMMETRIC` or `MAT_HERMITIAN` to assert to PETSc
  that either is the case).

For performing matrix operations on known Hermitian matrices (note that regular `Mat`
functions such as `MatMult()` will of course also work on Hermitian matrices):

- `MatMultHermitianTranspose()`
- `MatMultHermitianTransposeAdd()` (very limited support)

### How can I assemble a bunch of similar matrices?

You can first add the values common to all the matrices, then use `MatStoreValues()` to
stash the common values.
Each iteration you call `MatRetrieveValues()`, then set the
unique values in a matrix and assemble.

### Can one resize or change the size of PETSc matrices or vectors?

No, once the vector or matrix sizes have been set and the matrices or vectors are fully
usable one cannot change the size of the matrices or vectors or the number of processes they
live on. One may create new vectors and copy, for example using `VecScatterCreate()`,
the old values from the previous vector.

### How can one compute the nullspace of a sparse matrix with MUMPS?

Assuming you have an existing matrix $A$ whose nullspace $V$ you want to find:

```
Mat      F, work, V;
PetscInt N, rows;

/* Determine factorability */
PetscCall(MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F));
PetscCall(MatGetLocalSize(A, &rows, NULL));

/* Set MUMPS options, see MUMPS documentation for more information */
PetscCall(MatMumpsSetIcntl(F, 24, 1));
PetscCall(MatMumpsSetIcntl(F, 25, 1));

/* Perform factorization */
PetscCall(MatLUFactorSymbolic(F, A, NULL, NULL, NULL));
PetscCall(MatLUFactorNumeric(F, A, NULL));

/* This is the dimension of the null space */
PetscCall(MatMumpsGetInfog(F, 28, &N));

/* This will contain the null space in the columns; comm is the communicator of A */
PetscCall(MatCreateDense(comm, rows, PETSC_DECIDE, PETSC_DETERMINE, N, NULL, &V));

PetscCall(MatDuplicate(V, MAT_DO_NOT_COPY_VALUES, &work));
PetscCall(MatMatSolve(F, work, V));
```

______________________________________________________________________

## Execution

### PETSc executables are SO big and take SO long to link

:::{note}
See {ref}`shared libraries section <doc_faq_sharedlibs>` for more information.
:::

We find this annoying as well. On most machines PETSc can use shared libraries, so
executables should be much smaller; run `configure` with the additional option
`--with-shared-libraries` (this is the default). Also, if you have room, compiling and
linking PETSc on your machine's `/tmp` disk or similar local disk, rather than over the
network, will be much faster.

### How does PETSc's `-help` option work? Why is it different for different programs?

There are 2 ways in which one interacts with the options database:

- `PetscOptionsGetXXX()` where `XXX` is some type or data structure (for example
  `PetscOptionsGetBool()` or `PetscOptionsGetScalarArray()`). This is a classic
  "getter" function, which queries the command line options for a matching option name,
  and returns the specified value.
- `PetscOptionsXXX()` where `XXX` is some type or data structure (for example
  `PetscOptionsBool()` or `PetscOptionsScalarArray()`). This is a so-called "provider"
  function. It first records the option name in an internal list of previously encountered
  options before calling `PetscOptionsGetXXX()` to query the status of said option.

While users generally use the first option, developers will *always* use the second
(provider) variant of functions. Thus, as the program runs, it will build up a list of
encountered option names which are then printed **in the order of their appearance on the
root rank**. Different programs may take different paths through PETSc source code, so
they will encounter different providers, and therefore have different `-help` output.

### PETSc has so many options for my program that it is hard to keep them straight

Running the PETSc program with the option `-help` will print out many of the options. To
print the options that have been specified within a program, employ `-options_left` to
print any options that the user specified but were not actually used by the program and
all options used; this is helpful for detecting typo errors. The PETSc website has a search option,
in the upper right hand corner, that quickly finds answers to most PETSc questions.

### PETSc automatically handles many of the details in parallel PDE solvers. How can I understand what is really happening within my program?

You can use the option `-info` to get more details about the solution process. The
option `-log_view` provides details about the distribution of time spent in the various
phases of the solution process. You can run with `-ts_view` or `-snes_view` or
`-ksp_view` to see what solver options are being used. Run with `-ts_monitor`,
`-snes_monitor`, or `-ksp_monitor` to watch convergence of the
methods. `-snes_converged_reason` and `-ksp_converged_reason` will indicate why and if
the solvers have converged.

### Assembling large sparse matrices takes a long time. What can I do to make this process faster? Or MatSetValues() is so slow; what can I do to speed it up?

You probably need to do preallocation, as explained in {any}`sec_matsparse`.
See also the {ref}`performance chapter <ch_performance>` of the users manual.

For GPUs (and even CPUs) you can use `MatSetPreallocationCOO()` and `MatSetValuesCOO()` for more rapid assembly.

### How can I generate performance summaries with PETSc?

Use the runtime option `-log_view`, which outputs a comprehensive timing, memory consumption, and communications digest
for your program. See the {ref}`profiling chapter <ch_profiling>` of the users
manual for information on interpreting the summary data.
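
Beyond the overall summary, you can attribute time to phases of your own code by registering logging stages, which then appear as separate sections in the `-log_view` output (a sketch; the stage name is arbitrary):

```
PetscLogStage stage;

PetscCall(PetscLogStageRegister("Matrix Assembly", &stage));
PetscCall(PetscLogStagePush(stage));
/* ... work reported under the "Matrix Assembly" stage in -log_view ... */
PetscCall(PetscLogStagePop());
```
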

### How do I know the amount of time spent on each level of the multigrid solver/preconditioner?

Run with `-log_view` and `-pc_mg_log`.

### Where do I get the input matrices for the examples?

Some examples use `$DATAFILESPATH/matrices/medium` and other files. These test matrices
in PETSc binary format can be found in the [datafiles repository](https://gitlab.com/petsc/datafiles).

### When I dump some matrices and vectors to binary, I seem to be generating some empty files with `.info` extensions. What's the deal with these?

PETSc binary viewers put some additional information into `.info` files, like the matrix
block size. It is harmless, but if you *really* don't like it you can use
`-viewer_binary_skip_info` or `PetscViewerBinarySkipInfo()`.

:::{note}
You need to call `PetscViewerBinarySkipInfo()` before
`PetscViewerFileSetName()`. In other words you **cannot** use
`PetscViewerBinaryOpen()` directly.
:::

### Why is my parallel solver slower than my sequential solver, or I have poor speed-up?

This can happen for many reasons:

1. Make sure it is truly the time in `KSPSolve()` that is slower (by running the code
   with `-log_view`). Often the slower time is in generating the matrix or some other
   operation.
2. There must be enough work for each process to outweigh the communication time. We
   recommend an absolute minimum of about 10,000 unknowns per process; better is 20,000 or
   more. This is even more true when using multiple GPUs, where you need to have millions
   of unknowns per GPU.
3. Make sure the {ref}`communication speed of the parallel computer
   <doc_faq_general_parallel>` is good enough for parallel solvers.
4. Check the number of solver iterations with the parallel solver against the sequential
   solver. Most preconditioners require more iterations when used on more processes; this
   is particularly true for block Jacobi (the default parallel preconditioner). You can
   try `-pc_type asm` (`PCASM`), whose iterations scale a bit better with more
   processes. You may also consider multigrid preconditioners like `PCMG` or BoomerAMG
   in `PCHYPRE`.

(doc_faq_pipelined)=

### What steps are necessary to make the pipelined solvers execute efficiently?

Pipelined solvers like `KSPPGMRES`, `KSPPIPECG`, `KSPPIPECR`, and `KSPGROPPCG` may
require special MPI configuration to effectively overlap reductions with computation. In
general, this requires an MPI-3 implementation, an implementation that supports multiple
threads, and use of a "progress thread". Consult the documentation from your vendor or
computing facility for more details.

- Cray MPT 5.6 supports MPI-3; enable thread support by setting `$MPICH_MAX_THREAD_SAFETY` to "multiple",
  plus either `$MPICH_ASYNC_PROGRESS` or
  `$MPICH_NEMESIS_ASYNC_PROGRESS` for a progress thread. E.g.

  ```console
  $ export MPICH_MAX_THREAD_SAFETY=multiple
  $ export MPICH_ASYNC_PROGRESS=1
  $ export MPICH_NEMESIS_ASYNC_PROGRESS=1
  ```

- MPICH version 3.0 and later implements the MPI-3 standard and the default
  configuration supports use of threads. Use of a progress thread is configured by
  setting the environment variable `$MPICH_ASYNC_PROGRESS`. E.g.

  ```console
  $ export MPICH_ASYNC_PROGRESS=1
  ```

### When using PETSc in single precision mode (`--with-precision=single` when running `configure`) are the operations done in single or double precision?

PETSc does **NOT** do any explicit conversion of single precision to double before
performing computations; it depends on the hardware and compiler what happens. For
example, the compiler could choose to put the single precision numbers into the usual
double precision registers and then use the usual double precision floating point unit. Or
it could use SSE2 instructions that work directly on the single precision numbers. It is a
bit of a mystery what decisions get made sometimes. There may be compiler flags in some
circumstances that can affect this.

### Why is Newton's method (SNES) not converging, or converges slowly?

Newton's method may not converge for many reasons; here are some of the most common:

- The Jacobian is wrong (or correct in sequential but not in parallel).
- The linear system is {ref}`not solved <doc_faq_execution_kspconv>` or is not solved
  accurately enough.
- The Jacobian system has a singularity that the linear solver is not handling.
- There is a bug in the function evaluation routine.
- The function is not continuous or does not have continuous first derivatives (e.g. phase
  change or TVD limiters).
- The equations may not have a solution (e.g. limit cycle instead of a steady state) or
  there may be a "hill" between the initial guess and the steady state (e.g. reactants
  must ignite and burn before reaching a steady state, but the steady-state residual will
  be larger during combustion).

Here are some of the ways to help debug lack of convergence of Newton:

- Run on one processor to see if the problem is only in parallel.

- Run with `-info` to get more detailed information on the solution process.

- Run with the options

  ```text
  -snes_monitor -ksp_monitor_true_residual -snes_converged_reason -ksp_converged_reason
  ```

  - If the linear solve does not converge, check if the Jacobian is correct, then see
    {ref}`this question <doc_faq_execution_kspconv>`.
  - If the preconditioned residual converges, but the true residual does not, the
    preconditioner may be singular.
  - If the linear solve converges well, but the line search fails, the Jacobian may be
    incorrect.

- Run with `-pc_type lu` or `-pc_type svd` to see if the problem is a poor linear
  solver.

- Run with `-mat_view` or `-mat_view draw` to see if the Jacobian looks reasonable.

- Run with `-snes_test_jacobian -snes_test_jacobian_view` to see if the Jacobian you are
  using is wrong. Compare the output when you add `-mat_fd_type ds` to see if the result
  is sensitive to the choice of differencing parameter.

- Run with `-snes_mf_operator -pc_type lu` to see if the Jacobian you are using is
  wrong. If the problem is too large for a direct solve, try

  ```text
  -snes_mf_operator -pc_type ksp -ksp_ksp_rtol 1e-12
  ```

  Compare the output when you add `-mat_mffd_type ds` to see if the result is sensitive
  to the choice of differencing parameter.
- Run with `-snes_linesearch_monitor` to see if the line search is failing (this is
  usually a sign of a bad Jacobian). Use `-info` in PETSc 3.1 and older versions,
  `-snes_ls_monitor` in PETSc 3.2, and `-snes_linesearch_monitor` in PETSc 3.3 and
  later.

Here are some ways to help the Newton process if everything above checks out:

- Run with grid sequencing (`-snes_grid_sequence` is all you need if working with a
  `DM`) to generate a better initial guess on your finer mesh.

- Run with quad precision, i.e.

  ```console
  $ ./configure --with-precision=__float128 --download-f2cblaslapack
  ```

  :::{note}
  Quad precision requires PETSc 3.2 or later and recent versions of the GNU compilers.
  :::

- Change the units (nondimensionalization), boundary condition scaling, or formulation so
  that the Jacobian is better conditioned. See the [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and
  Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127).

- Mollify features in the function that do not have continuous first derivatives (often
  occurs when there are "if" statements in the residual evaluation, e.g. phase change or
  TVD limiters). Use a variational inequality solver (`SNESVINEWTONRSLS`) if the
  discontinuities are of fundamental importance.

- Try a trust region method (`-snes_type newtontr`; you may have to adjust the parameters).

- Run with some continuation parameter from a point where you know the solution; see
  `TSPSEUDO` for steady states.

- There are homotopy solver packages like PHCpack that can find all possible solutions
  (and tell you that they have found them all), but those are not scalable and cannot
  solve anything but small problems.

(doc_faq_execution_kspconv)=

### Why is the linear solver (KSP) not converging, or converges slowly?

:::{tip}
Always run with `-ksp_converged_reason -ksp_monitor_true_residual` when trying to
learn why a method is not converging!
:::

Common reasons for KSP not converging are:

- A symmetric method is being used for a non-symmetric problem.

- The equations are singular by accident (e.g. forgot to impose boundary
  conditions). Check this for a small problem using `-pc_type svd -pc_svd_monitor`.

- The equations are intentionally singular (e.g. constant null space), but the Krylov
  method was not informed; see `MatSetNullSpace()`. Always inform your local Krylov
  subspace solver of any change of singularity. Failure to do so will result in the
  immediate revocation of your computing and keyboard operator licenses, as well as
  a stern talking-to by the nearest Krylov Subspace Method representative.

- The equations are intentionally singular and `MatSetNullSpace()` was used, but the
  right-hand side is not consistent. You may have to call `MatNullSpaceRemove()` on the
  right-hand side before calling `KSPSolve()`. See `MatSetTransposeNullSpace()`.

- The equations are indefinite so that standard preconditioners don't work. Usually you
  will know this from the physics, but you can check with

  ```text
  -ksp_compute_eigenvalues -ksp_gmres_restart 1000 -pc_type none
  ```

  For simple saddle point problems, try

  ```text
  -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_detect_saddle_point
  ```

  For more difficult problems, read the literature to find robust methods and ask
  <mailto:petsc-users@mcs.anl.gov> or <mailto:petsc-maint@mcs.anl.gov> if you want advice
  about how to implement them.

- If the method converges in the preconditioned residual, but not in the true residual,
  the preconditioner is likely singular or nearly so. This is common for saddle point
  problems (e.g. incompressible flow) or strongly nonsymmetric operators (e.g. low-Mach
  hyperbolic problems with large time steps).
- The preconditioner is too weak or is unstable. See if `-pc_type asm -sub_pc_type lu`
  improves the convergence rate. If GMRES is losing too much progress in the restart, see
  if longer restarts help: `-ksp_gmres_restart 300`. If a transpose is available, try
  `-ksp_type bcgs` or other methods that do not require a restart.

  :::{note}
  Unfortunately convergence with these methods is frequently erratic.
  :::

- The preconditioner is nonlinear (e.g. a nested iterative solve); try `-ksp_type
  fgmres` or `-ksp_type gcr`.

- You are using geometric multigrid, but some equations (often boundary conditions) are
  not scaled compatibly between levels. Try `-pc_mg_galerkin` to algebraically
  construct a correctly scaled coarse operator, or make sure that all the equations are
  scaled in the same way if you want to use rediscretized coarse levels.

- The matrix is very ill-conditioned. Check the {ref}`condition number <doc_faq_usage_condnum>`.

  - Try to improve it by choosing the relative scaling of components/boundary conditions.
  - Try `-ksp_diagonal_scale -ksp_diagonal_scale_fix`.
  - Perhaps change the formulation of the problem to produce more friendly algebraic
    equations.

- Change the units (nondimensionalization), boundary condition scaling, or formulation so
  that the Jacobian is better conditioned. See the [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and
  Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127).

- Classical Gram-Schmidt is becoming unstable; try `-ksp_gmres_modifiedgramschmidt` or
  use a method that orthogonalizes differently, e.g. `-ksp_type gcr`.

### I get the error message: Actual argument at (1) to assumed-type dummy is of derived type with type-bound or FINAL procedures

Use the following code snippet:

```fortran
module context_module
#include <petsc/finclude/petsc.h>
  use petsc
  implicit none
  private

  type, public :: context_type
    private
    PetscInt :: foo
  contains
    procedure, public :: init => context_init
  end type context_type

contains

  subroutine context_init(self, foo)
    class(context_type), intent(in out) :: self
    PetscInt, intent(in) :: foo
    self%foo = foo
  end subroutine context_init

end module context_module

!------------------------------------------------------------------------

program test_snes
  use, intrinsic :: iso_c_binding
  use petsc
  use context_module
  implicit none

  SNES :: snes
  type(context_type), target :: context
  type(c_ptr) :: contextOut
  PetscErrorCode :: ierr

  call PetscInitialize(PETSC_NULL_CHARACTER, ierr)
  call SNESCreate(PETSC_COMM_WORLD, snes, ierr)
  call context%init(1)

  contextOut = c_loc(context) ! contextOut is a C pointer to context

  call SNESSetConvergenceTest(snes, convergence, contextOut, PETSC_NULL_FUNCTION, ierr)

  call SNESDestroy(snes, ierr)
  call PetscFinalize(ierr)

contains

  subroutine convergence(snes, num_iterations, xnorm, pnorm, fnorm, reason, contextIn, ierr)
    SNES, intent(in) :: snes
    PetscInt, intent(in) :: num_iterations
    PetscReal, intent(in) :: xnorm, pnorm, fnorm
    SNESConvergedReason, intent(out) :: reason
    type(c_ptr), intent(in out) :: contextIn
    type(context_type), pointer :: context
    PetscErrorCode, intent(out) :: ierr

    ! convert the C pointer to a Fortran pointer so context can be used as in the main program
    call c_f_pointer(contextIn, context)
    reason = 0
    ierr = 0
  end subroutine convergence

end program test_snes
```

### In C++ I get a crash on `VecDestroy()` (or some other PETSc object) at the end of the program

This can happen when the destructor for a C++ class is automatically called at the end of
the program after `PetscFinalize()`. Use the following code snippet:

```cpp
int main(int argc, char **argv)
{
  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  {
    /* your variables and your code go here */

    /* all your destructors are called here automatically by C++, so they work correctly */
  }
  PetscCall(PetscFinalize());
  return 0;
}
```

______________________________________________________________________

## Debugging

### What does the message hwloc/linux: Ignoring PCI device with non-16bit domain mean?

This is printed by the hwloc library, which is used by some MPI implementations. It can be
ignored. To prevent the message from being printed, set the environment variable
`HWLOC_HIDE_ERRORS` to 2.
For example,

```console
$ export HWLOC_HIDE_ERRORS=2
```

which can be added to your login profile file, such as `~/.bashrc`.

### How do I turn off PETSc signal handling so I can use the `-C` option on `xlf`?

Immediately after calling `PetscInitialize()`, call `PetscPopSignalHandler()`.

Some Fortran compilers, including the IBM xlf, xlF, etc. compilers, have a compile option
(`-C` for IBM's) that causes all array accesses in Fortran to be checked to ensure they
are in-bounds. This is a great feature, but it does require that the array dimensions be
set explicitly, not with a \*.

### How do I debug if `-start_in_debugger` does not work on my machine?

The script <https://github.com/Azrael3000/tmpi> can be used to run multiple MPI
ranks in the debugger using tmux.

On newer macOS machines, one has to be in the admin group to be able to use the debugger.

On newer Ubuntu Linux machines, one has to disable `ptrace_scope` with

```console
$ sudo echo 0 > /proc/sys/kernel/yama/ptrace_scope
```

to get `-start_in_debugger` working.

If `-start_in_debugger` does not work on your OS, for a uniprocessor job just try the
debugger directly, for example: `gdb ex1`. You can also use [TotalView](https://totalview.io/products/totalview), which is a good graphical parallel debugger.

### How do I see where my code is hanging?

You can use the `-start_in_debugger` option to start all processes in the debugger (each
will come up in its own xterm) or run in [TotalView](https://totalview.io/products/totalview). Then use `cont` (for continue) in each
xterm. Once you are sure that the program is hanging, hit control-c in each xterm and then
use `where` to print a stack trace for each process.

### How can I inspect PETSc vector and matrix values when in the debugger?

I will illustrate this with `gdb`, but it should be similar with other debuggers. You can
look at local `Vec` values directly by obtaining the array. For a `Vec` v, we can
print all local values using:

```console
(gdb) p ((Vec_Seq*) v->data)->array[0]@v->map.n
```

However, this becomes much more complicated for a matrix. Therefore, it is advisable to
use the default viewer to look at the object. For a `Vec` v and a `Mat` m, this would
be:

```console
(gdb) call VecView(v, 0)
(gdb) call MatView(m, 0)
```

or with a communicator other than `MPI_COMM_WORLD`:

```console
(gdb) call MatView(m, PETSC_VIEWER_STDOUT_(m->comm))
```

TotalView 8.8.0+ has a feature that allows libraries to provide their own code to
display objects in the debugger. Thus, in theory, each PETSc object (`Vec`, `Mat`, etc.)
could have custom code to print values in the object. We have only done this for the most
elementary display of `Vec` and `Mat`. See the routine `TV_display_type()` in
`src/vec/vec/interface/vector.c` for an example of how these may be written. Contact us
if you would like to add more.

### How can I find the cause of floating point exceptions like not-a-number (NaN) or infinity?

The best way to locate floating point exceptions is to use a debugger. On supported
architectures (including Linux and glibc-based systems), just run in a debugger and pass
`-fp_trap` to the PETSc application. This will activate signaling exceptions, and the
debugger will break on the line that first divides by zero or otherwise generates an
exception.
Without a debugger, running with `-fp_trap` in debug mode will only identify the
function in which the error occurred, but not the line or the type of exception. If
`-fp_trap` is not supported on your architecture, consult the documentation for your
debugger, since there is likely a way to have it catch exceptions.

(error_libimf)=

### Error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory

The Intel compilers use shared libraries (like libimf) that cannot be found, by default,
at run time. When using the Intel compilers (and running the resulting code), you must
make sure that the proper Intel initialization scripts are run. This is usually done by
adding some code into your `.cshrc`, `.bashrc`, `.profile`, etc. file. Sometimes, on
batch file systems that do not access your initialization files (like `.cshrc`), you must
include the initialization calls in your batch file submission.
For example, on my Mac using `csh` I have the following in my `.cshrc` file:

```csh
source /opt/intel/cc/10.1.012/bin/iccvars.csh
source /opt/intel/fc/10.1.012/bin/ifortvars.csh
source /opt/intel/idb/10.1.012/bin/idbvars.csh
```

And in my `.profile` I have

```sh
source /opt/intel/cc/10.1.012/bin/iccvars.sh
source /opt/intel/fc/10.1.012/bin/ifortvars.sh
source /opt/intel/idb/10.1.012/bin/idbvars.sh
```

(object_type_not_set)=

### What does "Object Type Not Set: Argument # N" mean?

Many operations on PETSc objects require that the specific type of the object be set
before the operation is performed. You must call `XXXSetType()` or
`XXXSetFromOptions()` before you make the offending call. For example,

```c
Mat A;

PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
PetscCall(MatSetValues(A, ...));
```

will not work. You must add `MatSetType()` or `MatSetFromOptions()` before the call to
`MatSetValues()`, i.e.
```c
Mat A;

PetscCall(MatCreate(PETSC_COMM_WORLD, &A));

PetscCall(MatSetType(A, MATAIJ));
/* Will override MatSetType() */
PetscCall(MatSetFromOptions(A));

PetscCall(MatSetValues(A, ...));
```

(split_ownership)=

### What does the error detected in `PetscSplitOwnership()` about "sum of local lengths ..." mean?

In a previous call to `VecSetSizes()`, `MatSetSizes()`, `VecCreateXXX()` or
`MatCreateXXX()` you passed in local and global sizes that do not make sense for the
correct number of processors. For example, if you pass in a local size of 2 and a global
size of 100 and run on two processors, this cannot work since the sum of the local sizes
is 4, not 100.

(valgrind)=

### What does "corrupt argument" or "caught signal" or "SEGV" or "segmentation violation" or "bus error" mean? Can I use Valgrind or CUDA-Memcheck to debug memory corruption issues?

Sometimes it can mean an argument to a function is invalid. In Fortran this may be caused
by forgetting to list an argument in the call, especially the final `PetscErrorCode`.

Otherwise it is usually caused by memory corruption; that is, somewhere the code is
writing out of array bounds. To track this down, rerun the debug version of the code with
the option `-malloc_debug`. Occasionally the code may crash only with the optimized
version; in that case, run the optimized version with `-malloc_debug`. If you determine
the problem is from memory corruption, you can put the macro `CHKMEMQ` in the code near
the crash to determine exactly what line is causing the problem.

If `-malloc_debug` does not help: on NVIDIA CUDA systems you can use
<https://docs.nvidia.com/cuda/cuda-memcheck/index.html>.

If `-malloc_debug` does not help: on GNU/Linux (but not macOS) machines you can use
[Valgrind](http://valgrind.org). Follow the instructions below:

1. `configure` PETSc with `--download-mpich --with-debugging` (you can use other MPI implementations, but most produce spurious Valgrind messages).

2. Compile your application code with this build of PETSc.

3. Run with Valgrind.
   ```console
   $ $PETSC_DIR/lib/petsc/bin/petscmpiexec -valgrind -n NPROC PETSCPROGRAMNAME PROGRAMOPTIONS
   ```

   or

   ```console
   $ mpiexec -n NPROC valgrind --tool=memcheck -q --num-callers=20 \
       --suppressions=$PETSC_DIR/share/petsc/suppressions/valgrind \
       --log-file=valgrind.log.%p PETSCPROGRAMNAME -malloc off PROGRAMOPTIONS
   ```

:::{note}
- The option `--with-debugging` enables Valgrind to give a stack trace with additional
  source-file:line-number information.
- The option `--download-mpich` gives a Valgrind-clean MPI; other MPI builds are not
  Valgrind clean.
- When `--download-mpich` is used, mpiexec will be in `$PETSC_ARCH/bin`.
- The `--log-file=valgrind.log.%p` option tells Valgrind to store the output from each
  process in a different file (as %p, i.e. the PID, is different for each MPI process).
- `memcheck` will not find certain array accesses that violate static array
  declarations, so if memcheck runs clean you can try `--tool=exp-ptrcheck`
  instead.
:::

You might also consider using <http://drmemory.org>, which has support for GNU/Linux,
Apple macOS, and Microsoft Windows machines. (Note: we haven't tried this ourselves.)

(zeropivot)=

### What does "detected zero pivot in LU factorization" mean?
A zero pivot in LU, ILU, Cholesky, or ICC sparse factorization does not always mean that
the matrix is singular. You can use

```text
-pc_factor_shift_type nonzero -pc_factor_shift_amount [amount]
```

or

```text
-pc_factor_shift_type positive_definite -[level]_pc_factor_shift_type nonzero
 -pc_factor_shift_amount [amount]
```

or

```text
-[level]_pc_factor_shift_type positive_definite
```

to prevent the zero pivot. [level] is "sub" when lu, ilu, cholesky, or icc are employed in
each individual block of the bjacobi or ASM preconditioner; [level] is "mg_levels" or
"mg_coarse" when lu, ilu, cholesky, or icc are used inside multigrid smoothers or the
coarse grid solver. See `PCFactorSetShiftType()` and `PCFactorSetShiftAmount()`.

This error can also happen if your matrix is singular; see `MatSetNullSpace()` for how
to handle this. If this error occurs in the zeroth row of the matrix, it is likely you
have an error in the code that generates the matrix.
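For example, if ILU inside a block Jacobi preconditioner reports a zero pivot, the shifted variant might be requested with options like the following (the shift amount here is purely illustrative; the option names are those documented above with [level] set to "sub"):

```text
-pc_type bjacobi -sub_pc_type ilu -sub_pc_factor_shift_type nonzero -sub_pc_factor_shift_amount 1.e-10
```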
### You create draw windows or `PETSCVIEWERDRAW` windows or use options `-ksp_monitor draw::draw_lg` or `-snes_monitor draw::draw_lg` and the program seems to run OK but the windows never open

The libraries were compiled without support for X windows. Make sure that `configure`
was run with the option `--with-x`.

### The program seems to use more and more memory as it runs, even though you don't think you are allocating more memory

Some of the following may be occurring:

- You are creating new PETSc objects but never freeing them.
- There is a memory leak in PETSc or your code.
- Something much more subtle (if you are using Fortran): when you declare a large array
  in Fortran, the operating system does not allocate all the memory pages for that array
  until you start using the different locations in the array. Thus, in a code, if at each
  step you start using later values in the array, your virtual memory usage will
  "continue" to increase as measured by `ps` or `top`.
- You are running with the `-log`, `-log_mpe`, or `-log_all` option. With these
  options, a great deal of logging information is stored in memory until the conclusion
  of the run.
- You are linking with the MPI profiling libraries; these cause logging of all MPI
  activities. Another symptom is that at the conclusion of the run it may print some
  message about writing log files.
The following may help:

- Run with the `-malloc_debug` option and `-malloc_view`. Or use `PetscMallocDump()`
  and `PetscMallocView()` sprinkled about your code to track memory that is allocated
  and not later freed. Use the commands `PetscMallocGetCurrentUsage()` and
  `PetscMemoryGetCurrentUsage()` to monitor memory allocated, and
  `PetscMallocGetMaximumUsage()` and `PetscMemoryGetMaximumUsage()` for total memory
  used as the code progresses.
- This is just the way Unix works and is harmless.
- Do not use the `-log`, `-log_mpe`, or `-log_all` option, or use
  `PetscLogEventDeactivate()` or `PetscLogEventDeactivateClass()` to turn off logging
  of specific events.
- Make sure you do not link with the MPI profiling libraries.

### When calling `MatPartitioningApply()` you get a message "Error! Key 16615 Not Found"

The graph of the matrix you are using is not symmetric. You must use symmetric matrices
for partitioning.

### With GMRES at restart the second residual norm printed does not match the first

I.e.
```text
26 KSP Residual norm 3.421544615851e-04
27 KSP Residual norm 2.973675659493e-04
28 KSP Residual norm 2.588642948270e-04
29 KSP Residual norm 2.268190747349e-04
30 KSP Residual norm 1.977245964368e-04
30 KSP Residual norm 1.994426291979e-04 <----- At restart the residual norm is printed a second time
```

This is actually not surprising! GMRES computes the norm of the residual at each iteration
via a recurrence relation between the norms of the residuals at the previous iterations
and quantities computed at the current iteration. It does not compute it directly via
$|| b - A x^{n} ||$.

Sometimes, especially with an ill-conditioned matrix, or computation of the matrix-vector
product via differencing, the residual norms computed by GMRES start to "drift" from the
correct values. At the restart, we compute the residual norm directly, hence the "strange
stuff," the difference printed. The drifting, if it remains small, is harmless (it doesn't
affect the accuracy of the solution that GMRES computes).

If you use a more powerful preconditioner, the drift will often be smaller and less
noticeable. Or, if you are running matrix-free, you may need to tune the matrix-free
parameters.

### Why do some Krylov methods seem to print two residual norms per iteration?

I.e.

```text
1198 KSP Residual norm 1.366052062216e-04
1198 KSP Residual norm 1.931875025549e-04
1199 KSP Residual norm 1.366026406067e-04
1199 KSP Residual norm 1.931819426344e-04
```

Some Krylov methods, for example `KSPTFQMR`, actually have a "sub-iteration" of size 2
inside the loop. Each of the two substeps has its own matrix-vector product and
application of the preconditioner, and updates the residual approximations. This is why
you get this "funny" output where it looks like there are two residual norms per
iteration. You can also think of it as twice as many iterations.

### Unable to locate PETSc dynamic library `libpetsc`

When using dynamic libraries, the libraries cannot be moved after they are installed.
This can also happen on clusters, where the paths on the (run) compute nodes differ from
those on the (compile) front-end. **Do not use dynamic libraries & shared libraries**.
Run `configure` with
`--with-shared-libraries=0 --with-dynamic-loading=0`.

:::{important}
The `--with-dynamic-loading` option was removed in PETSc v3.5.
:::

### How do I determine what update to PETSc broke my code?
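
In short: bisect the Git history, as detailed in the steps below. If your build and test
can be scripted, `git bisect run` automates the whole loop. A minimal sketch, assuming a
hypothetical script `check.sh` that rebuilds PETSc plus your application and exits 0
when the behavior is good and non-zero when it is bad (the commit hashes are the example
ones used in the steps below):

```console
$ git bisect start 21af4baa815c 5ae5ab319844   # bad commit first, then good
$ git bisect run ./check.sh                    # Git drives the build/test loop
$ git bisect reset                             # return to the original checkout
```

Note that `git bisect run` treats exit codes 1 through 127 (except 125) as "bad"; exit
code 125 tells Git to skip a commit that cannot be built at all.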

If at some point (in the PETSc code history) you had a working code, but the latest
PETSc code broke it, it is possible to determine the PETSc code change that caused this
behavior. This requires:

- Using Git to access the PETSc sources
- Knowing the Git commit for the known working version of PETSc
- Knowing the Git commit for the known broken version of PETSc
- Using the [bisect](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html)
  functionality of Git

Git bisect can be done as follows:

1. Get the PETSc development (`main` branch in Git) sources

   ```console
   $ git clone https://gitlab.com/petsc/petsc.git
   ```

2. Find the good and bad markers to start the bisection process. This can be done either
   by checking `git log`, `gitk`, or <https://gitlab.com/petsc/petsc>, or the web
   history of petsc-release clones. Let's say the known bad commit is 21af4baa815c and
   the known good commit is 5ae5ab319844.

3. Start the bisection process with these known revisions.
   Build PETSc, and test your code to confirm known good/bad behavior:

   ```console
   $ git bisect start 21af4baa815c 5ae5ab319844
   ```

   Build and test, and perhaps discover that this new state is bad:

   ```console
   $ git bisect bad
   ```

   Build and test, and perhaps discover that this state is good:

   ```console
   $ git bisect good
   ```

   Now, until done, keep bisecting, building PETSc, and testing your code with it to
   determine whether the code is working or not. After something like 5-15 iterations,
   `git bisect` will pinpoint the exact code change that caused the difference in
   application behavior.

:::{tip}
See [git-bisect(1)](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html) and the
[debugging section of the Git Book](https://git-scm.com/book/en/Git-Tools-Debugging-with-Git) for more debugging tips.
:::

### How to fix the error "PMIX Error: error in file gds_ds12_lock_pthread.c"?

This seems to be an error when using Open MPI and OpenBLAS with threads (or perhaps
other packages that use threads).

```console
$ export PMIX_MCA_gds=hash
```

This setting should resolve the problem.

______________________________________________________________________

(doc_faq_sharedlibs)=

## Shared Libraries

### Can I install PETSc libraries as shared libraries?

Yes. Use

```console
$ ./configure --with-shared-libraries
```

### Why should I use shared libraries?

When you link to shared libraries, the function symbols from the shared libraries are
not copied into the executable. This makes the executable considerably smaller than when
using regular (static) libraries. This helps in a couple of ways:

- It saves disk space when more than one executable is created.
- It improves link time immensely, because the linker has to write a much smaller file
  (the executable) to disk.

### How do I link to the PETSc shared libraries?

By default, the compiler should pick up the shared libraries instead of the regular
ones. Nothing special needs to be done for this.

### What if I want to link to the regular `.a` library files?
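
For example, you can configure a second, static build alongside a shared one (the arch
name below is an arbitrary illustrative choice):

```console
$ ./configure PETSC_ARCH=arch-static --with-shared-libraries=0
```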

You must run `configure` with the option `--with-shared-libraries=0` (you can use a
different `$PETSC_ARCH` for this build so you can easily switch between the two).

### What do I do if I want to move my executable to a different machine?

You will also need access to the shared libraries on the new machine. The alternative is
to build the executable without shared libraries by first deleting the shared libraries
and then creating the executable.

```{bibliography} /petsc.bib
:filter: docname in docnames
```