1*9b92b1d3SBarry Smith(doc_faq)= 2*9b92b1d3SBarry Smith 3*9b92b1d3SBarry Smith# FAQ 4*9b92b1d3SBarry Smith 5*9b92b1d3SBarry Smith```{contents} Table Of Contents 6*9b92b1d3SBarry Smith:backlinks: top 7*9b92b1d3SBarry Smith:local: true 8*9b92b1d3SBarry Smith``` 9*9b92b1d3SBarry Smith 10*9b92b1d3SBarry Smith______________________________________________________________________ 11*9b92b1d3SBarry Smith 12*9b92b1d3SBarry Smith## General 13*9b92b1d3SBarry Smith 14*9b92b1d3SBarry Smith### How can I subscribe to the PETSc mailing lists? 15*9b92b1d3SBarry Smith 16*9b92b1d3SBarry SmithSee mailing list {ref}`documentation <doc_mail>` 17*9b92b1d3SBarry Smith 18*9b92b1d3SBarry Smith### Any useful books on numerical computing? 19*9b92b1d3SBarry Smith 20*9b92b1d3SBarry Smith[Bueler, PETSc for Partial Differential Equations: Numerical Solutions in C and Python](https://my.siam.org/Store/Product/viewproduct/?ProductId=32850137) 21*9b92b1d3SBarry Smith 22*9b92b1d3SBarry Smith[Oliveira and Stewart, Writing Scientific Software: A Guide to Good Style](https://www.cambridge.org/core/books/writing-scientific-software/23206704175AF868E43FE3FB399C2F53) 23*9b92b1d3SBarry Smith 24*9b92b1d3SBarry Smith(doc_faq_general_parallel)= 25*9b92b1d3SBarry Smith 26*9b92b1d3SBarry Smith### What kind of parallel computers or clusters are needed to use PETSc? Or why do I get little speedup? 27*9b92b1d3SBarry Smith 28*9b92b1d3SBarry Smith:::{important} 29*9b92b1d3SBarry SmithPETSc can be used with any kind of parallel system that supports MPI BUT for any decent 30*9b92b1d3SBarry Smithperformance one needs: 31*9b92b1d3SBarry Smith 32*9b92b1d3SBarry Smith- Fast, **low-latency** interconnect; any ethernet (even 10 GigE) simply cannot provide 33*9b92b1d3SBarry Smith the needed performance. 34*9b92b1d3SBarry Smith- High per-core **memory** performance. Each core needs to 35*9b92b1d3SBarry Smith have its **own** memory bandwidth of at least 2 or more gigabytes/second. Most modern 36*9b92b1d3SBarry Smith computers are not bottlenecked by how fast they can perform 37*9b92b1d3SBarry Smith calculations; rather, they are usually restricted by how quickly they can get their 38*9b92b1d3SBarry Smith data. 39*9b92b1d3SBarry Smith::: 40*9b92b1d3SBarry Smith 41*9b92b1d3SBarry SmithTo obtain good performance it is important that you know your machine! I.e. how many 42*9b92b1d3SBarry Smithcompute nodes (generally, how many motherboards), how many memory sockets per node and how 43*9b92b1d3SBarry Smithmany cores per memory socket and how much memory bandwidth for each. 44*9b92b1d3SBarry Smith 45*9b92b1d3SBarry SmithIf you do not know this and can run MPI programs with mpiexec (that is, you don't have 46*9b92b1d3SBarry Smithbatch system), run the following from `$PETSC_DIR`: 47*9b92b1d3SBarry Smith 48*9b92b1d3SBarry Smith```console 49*9b92b1d3SBarry Smith$ make streams [NPMAX=maximum_number_of_mpi_processes_you_plan_to_use] 50*9b92b1d3SBarry Smith``` 51*9b92b1d3SBarry Smith 52*9b92b1d3SBarry SmithThis will provide a summary of the bandwidth with different number of MPI 53*9b92b1d3SBarry Smithprocesses and potential speedups. See {any}`ch_streams` for a detailed discussion. 54*9b92b1d3SBarry Smith 55*9b92b1d3SBarry SmithIf you have a batch system: 56*9b92b1d3SBarry Smith 57*9b92b1d3SBarry Smith```console 58*9b92b1d3SBarry Smith$ cd $PETSC_DIR/src/benchmarks/streams 59*9b92b1d3SBarry Smith$ make MPIVersion 60*9b92b1d3SBarry Smithsubmit MPIVersion to the batch system a number of times with 1, 2, 3, etc. MPI processes 61*9b92b1d3SBarry Smithcollecting all of the output from the runs into the single file scaling.log. Copy 62*9b92b1d3SBarry Smithscaling.log into the src/benchmarks/streams directory. 63*9b92b1d3SBarry Smith$ ./process.py createfile ; process.py 64*9b92b1d3SBarry Smith``` 65*9b92b1d3SBarry Smith 66*9b92b1d3SBarry SmithEven if you have enough memory bandwidth if the OS switches processes between cores 67*9b92b1d3SBarry Smithperformance can degrade. Smart process to core/socket binding (this just means locking a 68*9b92b1d3SBarry Smithprocess to a particular core or memory socket) may help you. For example, consider using 69*9b92b1d3SBarry Smithfewer processes than cores and binding processes to separate sockets so that each process 70*9b92b1d3SBarry Smithuses a different memory bus: 71*9b92b1d3SBarry Smith 72*9b92b1d3SBarry Smith- [MPICH2 binding with the Hydra process manager](https://github.com/pmodels/mpich/blob/main/doc/wiki/how_to/Using_the_Hydra_Process_Manager.md#process-core-binding] 73*9b92b1d3SBarry Smith 74*9b92b1d3SBarry Smith ```console 75*9b92b1d3SBarry Smith $ mpiexec.hydra -n 4 --binding cpu:sockets 76*9b92b1d3SBarry Smith ``` 77*9b92b1d3SBarry Smith 78*9b92b1d3SBarry Smith- [Open MPI binding](https://www.open-mpi.org/faq/?category=tuning#using-paffinity) 79*9b92b1d3SBarry Smith 80*9b92b1d3SBarry Smith ```console 81*9b92b1d3SBarry Smith $ mpiexec -n 4 --map-by socket --bind-to socket --report-bindings 82*9b92b1d3SBarry Smith ``` 83*9b92b1d3SBarry Smith 84*9b92b1d3SBarry Smith- `taskset`, part of the [util-linux](https://github.com/karelzak/util-linux) package 85*9b92b1d3SBarry Smith 86*9b92b1d3SBarry Smith Check `man taskset` for details. Make sure to set affinity for **your** program, 87*9b92b1d3SBarry Smith **not** for the `mpiexec` program. 88*9b92b1d3SBarry Smith 89*9b92b1d3SBarry Smith- `numactl` 90*9b92b1d3SBarry Smith 91*9b92b1d3SBarry Smith In addition to task affinity, this tool also allows changing the default memory affinity 92*9b92b1d3SBarry Smith policy. On Linux, the default policy is to attempt to find memory on the same memory bus 93*9b92b1d3SBarry Smith that serves the core that a thread is running on when the memory is faulted 94*9b92b1d3SBarry Smith (not when `malloc()` is called). If local memory is not available, it is found 95*9b92b1d3SBarry Smith elsewhere, possibly leading to serious memory imbalances. 96*9b92b1d3SBarry Smith 97*9b92b1d3SBarry Smith The option `--localalloc` allocates memory on the local NUMA node, similar to the 98*9b92b1d3SBarry Smith `numa_alloc_local()` function in the `libnuma` library. The option 99*9b92b1d3SBarry Smith `--cpunodebind=nodes` binds the process to a given NUMA node (note that this can be 100*9b92b1d3SBarry Smith larger or smaller than a CPU (socket); a NUMA node usually has multiple cores). 101*9b92b1d3SBarry Smith 102*9b92b1d3SBarry Smith The option `--physcpubind=cpus` binds the process to a given processor core (numbered 103*9b92b1d3SBarry Smith according to `/proc/cpuinfo`, therefore including logical cores if Hyper-threading is 104*9b92b1d3SBarry Smith enabled). 105*9b92b1d3SBarry Smith 106*9b92b1d3SBarry Smith With Open MPI, you can use knowledge of the NUMA hierarchy and core numbering on your 107*9b92b1d3SBarry Smith machine to calculate the correct NUMA node or processor number given the environment 108*9b92b1d3SBarry Smith variable `OMPI_COMM_WORLD_LOCAL_RANK`. In most cases, it is easier to make mpiexec or 109*9b92b1d3SBarry Smith a resource manager set affinities. 110*9b92b1d3SBarry Smith 111*9b92b1d3SBarry SmithThe software [Open-MX](http://open-mx.gforge.inria.fr) provides faster speed for 112*9b92b1d3SBarry Smithethernet systems, we have not tried it but it claims it can dramatically reduce latency 113*9b92b1d3SBarry Smithand increase bandwidth on Linux system. You must first install this software and then 114*9b92b1d3SBarry Smithinstall MPICH or Open MPI to use it. 115*9b92b1d3SBarry Smith 116*9b92b1d3SBarry Smith### What kind of license is PETSc released under? 117*9b92b1d3SBarry Smith 118*9b92b1d3SBarry SmithSee licensing {ref}`documentation <doc_license>` 119*9b92b1d3SBarry Smith 120*9b92b1d3SBarry Smith### Why is PETSc written in C, instead of Fortran or C++? 121*9b92b1d3SBarry Smith 122*9b92b1d3SBarry SmithWhen this decision was made, in the early 1990s, C enabled us to build data structures 123*9b92b1d3SBarry Smithfor storing sparse matrices, solver information, 124*9b92b1d3SBarry Smithetc. in ways that Fortran simply did not allow. ANSI C was a complete standard that all 125*9b92b1d3SBarry Smithmodern C compilers supported. The language was identical on all machines. C++ was still 126*9b92b1d3SBarry Smithevolving and compilers on different machines were not identical. Using C function pointers 127*9b92b1d3SBarry Smithto provide data encapsulation and polymorphism allowed us to get many of the advantages of 128*9b92b1d3SBarry SmithC++ without using such a large and more complicated language. It would have been natural and 129*9b92b1d3SBarry Smithreasonable to have coded PETSc in C++; we opted to use C instead. 130*9b92b1d3SBarry Smith 131*9b92b1d3SBarry Smith### Does all the PETSc error checking and logging reduce PETSc's efficiency? 132*9b92b1d3SBarry Smith 133*9b92b1d3SBarry SmithNo 134*9b92b1d3SBarry Smith 135*9b92b1d3SBarry Smith(doc_faq_maintenance_strats)= 136*9b92b1d3SBarry Smith 137*9b92b1d3SBarry Smith### How do such a small group of people manage to write and maintain such a large and marvelous package as PETSc? 138*9b92b1d3SBarry Smith 139*9b92b1d3SBarry Smith1. **We work very efficiently**. 140*9b92b1d3SBarry Smith 141*9b92b1d3SBarry Smith - We use powerful editors and programming environments. 142*9b92b1d3SBarry Smith - Our manual pages are generated automatically from formatted comments in the code, 143*9b92b1d3SBarry Smith thus alleviating the need for creating and maintaining manual pages. 144*9b92b1d3SBarry Smith - We employ continuous integration testing of the entire PETSc library on many different 145*9b92b1d3SBarry Smith machine architectures. This process **significantly** protects (no bug-catching 146*9b92b1d3SBarry Smith process is perfect) against inadvertently introducing bugs with new additions. Every 147*9b92b1d3SBarry Smith new feature **must** pass our suite of thousands of tests as well as formal code 148*9b92b1d3SBarry Smith review before it may be included. 149*9b92b1d3SBarry Smith 150*9b92b1d3SBarry Smith2. **We are very careful in our design (and are constantly revising our design)** 151*9b92b1d3SBarry Smith 152*9b92b1d3SBarry Smith - PETSc as a package should be easy to use, write, and maintain. Our mantra is to write 153*9b92b1d3SBarry Smith code like everyone is using it. 154*9b92b1d3SBarry Smith 155*9b92b1d3SBarry Smith3. **We are willing to do the grunt work** 156*9b92b1d3SBarry Smith 157*9b92b1d3SBarry Smith - PETSc is regularly checked to make sure that all code conforms to our interface 158*9b92b1d3SBarry Smith design. We will never keep in a bad design decision simply because changing it will 159*9b92b1d3SBarry Smith require a lot of editing; we do a lot of editing. 160*9b92b1d3SBarry Smith 161*9b92b1d3SBarry Smith4. **We constantly seek out and experiment with new design ideas** 162*9b92b1d3SBarry Smith 163*9b92b1d3SBarry Smith - We retain the useful ones and discard the rest. All of these decisions are based not 164*9b92b1d3SBarry Smith just on performance, but also on **practicality**. 165*9b92b1d3SBarry Smith 166*9b92b1d3SBarry Smith5. **Function and variable names must adhere to strict guidelines** 167*9b92b1d3SBarry Smith 168*9b92b1d3SBarry Smith - Even the rules about capitalization are designed to make it easy to figure out the 169*9b92b1d3SBarry Smith name of a particular object or routine. Our memories are terrible, so careful 170*9b92b1d3SBarry Smith consistent naming puts less stress on our limited human RAM. 171*9b92b1d3SBarry Smith 172*9b92b1d3SBarry Smith6. **The PETSc directory tree is designed to make it easy to move throughout the 173*9b92b1d3SBarry Smith entire package** 174*9b92b1d3SBarry Smith 175*9b92b1d3SBarry Smith7. **We have a rich, robust, and fast bug reporting system** 176*9b92b1d3SBarry Smith 177*9b92b1d3SBarry Smith - <mailto:petsc-maint@mcs.anl.gov> is always checked, and we pride ourselves on responding 178*9b92b1d3SBarry Smith quickly and accurately. Email is very lightweight, and so bug reports system retains 179*9b92b1d3SBarry Smith an archive of all reported problems and fixes, so it is easy to re-find fixes to 180*9b92b1d3SBarry Smith previously discovered problems. 181*9b92b1d3SBarry Smith 182*9b92b1d3SBarry Smith8. **We contain the complexity of PETSc by using powerful object-oriented programming 183*9b92b1d3SBarry Smith techniques** 184*9b92b1d3SBarry Smith 185*9b92b1d3SBarry Smith - Data encapsulation serves to abstract complex data formats or movement to 186*9b92b1d3SBarry Smith human-readable format. This is why your program cannot, for example, look directly 187*9b92b1d3SBarry Smith at what is inside the object `Mat`. 188*9b92b1d3SBarry Smith - Polymorphism makes changing program behavior as easy as possible, and further 189*9b92b1d3SBarry Smith abstracts the *intent* of your program from what is *written* in code. You call 190*9b92b1d3SBarry Smith `MatMult()` regardless of whether your matrix is dense, sparse, parallel or 191*9b92b1d3SBarry Smith sequential; you don't call a different routine for each format. 192*9b92b1d3SBarry Smith 193*9b92b1d3SBarry Smith9. **We try to provide the functionality requested by our users** 194*9b92b1d3SBarry Smith 195*9b92b1d3SBarry Smith### For complex numbers will I get better performance with C++? 196*9b92b1d3SBarry Smith 197*9b92b1d3SBarry SmithTo use PETSc with complex numbers you may use the following `configure` options; 198*9b92b1d3SBarry Smith`--with-scalar-type=complex` and either `--with-clanguage=c++` or (the default) 199*9b92b1d3SBarry Smith`--with-clanguage=c`. In our experience they will deliver very similar performance 200*9b92b1d3SBarry Smith(speed), but if one is concerned they should just try both and see if one is faster. 201*9b92b1d3SBarry Smith 202*9b92b1d3SBarry Smith### How come when I run the same program on the same number of processes I get a "different" answer? 203*9b92b1d3SBarry Smith 204*9b92b1d3SBarry SmithInner products and norms in PETSc are computed using the `MPI_Allreduce()` command. For 205*9b92b1d3SBarry Smithdifferent runs the order at which values arrive at a given process (via MPI) can be in a 206*9b92b1d3SBarry Smithdifferent order, thus the order in which some floating point arithmetic operations are 207*9b92b1d3SBarry Smithperformed will be different. Since floating point arithmetic is not 208*9b92b1d3SBarry Smithassociative, the computed quantity may be slightly different. 209*9b92b1d3SBarry Smith 210*9b92b1d3SBarry SmithOver a run the many slight differences in the inner products and norms will effect all the 211*9b92b1d3SBarry Smithcomputed results. It is important to realize that none of the computed answers are any 212*9b92b1d3SBarry Smithless right or wrong (in fact the sequential computation is no more right then the parallel 213*9b92b1d3SBarry Smithones). All answers are equal, but some are more equal than others. 214*9b92b1d3SBarry Smith 215*9b92b1d3SBarry SmithThe discussion above assumes that the exact same algorithm is being used on the different 216*9b92b1d3SBarry Smithnumber of processes. When the algorithm is different for the different number of processes 217*9b92b1d3SBarry Smith(almost all preconditioner algorithms except Jacobi are different for different number of 218*9b92b1d3SBarry Smithprocesses) then one expects to see (and does) a greater difference in results for 219*9b92b1d3SBarry Smithdifferent numbers of processes. In some cases (for example block Jacobi preconditioner) it 220*9b92b1d3SBarry Smithmay be that the algorithm works for some number of processes and does not work for others. 221*9b92b1d3SBarry Smith 222*9b92b1d3SBarry Smith### How come when I run the same linear solver on a different number of processes it takes a different number of iterations? 223*9b92b1d3SBarry Smith 224*9b92b1d3SBarry SmithThe convergence of many of the preconditioners in PETSc including the default parallel 225*9b92b1d3SBarry Smithpreconditioner block Jacobi depends on the number of processes. The more processes the 226*9b92b1d3SBarry Smith(slightly) slower convergence it has. This is the nature of iterative solvers, the more 227*9b92b1d3SBarry Smithparallelism means the more "older" information is used in the solution process hence 228*9b92b1d3SBarry Smithslower convergence. 229*9b92b1d3SBarry Smith 230*9b92b1d3SBarry Smith(doc_faq_gpuhowto)= 231*9b92b1d3SBarry Smith 232*9b92b1d3SBarry Smith### Can PETSc use GPUs to speed up computations? 233*9b92b1d3SBarry Smith 234*9b92b1d3SBarry Smith:::{seealso} 235*9b92b1d3SBarry SmithSee GPU development {ref}`roadmap <doc_gpu_roadmap>` for the latest information 236*9b92b1d3SBarry Smithregarding the state of PETSc GPU integration. 237*9b92b1d3SBarry Smith 238*9b92b1d3SBarry SmithSee GPU install {ref}`documentation <doc_config_accel>` for up-to-date information on 239*9b92b1d3SBarry Smithinstalling PETSc to use GPU's. 240*9b92b1d3SBarry Smith::: 241*9b92b1d3SBarry Smith 242*9b92b1d3SBarry SmithQuick summary of usage with CUDA: 243*9b92b1d3SBarry Smith 244*9b92b1d3SBarry Smith- The `VecType` `VECSEQCUDA`, `VECMPICUDA`, or `VECCUDA` may be used with 245*9b92b1d3SBarry Smith `VecSetType()` or `-vec_type seqcuda`, `mpicuda`, or `cuda` when 246*9b92b1d3SBarry Smith `VecSetFromOptions()` is used. 247*9b92b1d3SBarry Smith- The `MatType` `MATSEQAIJCUSPARSE`, `MATMPIAIJCUSPARSE`, or `MATAIJCUSPARSE` 248*9b92b1d3SBarry Smith may be used with `MatSetType()` or `-mat_type seqaijcusparse`, `mpiaijcusparse`, or 249*9b92b1d3SBarry Smith `aijcusparse` when `MatSetFromOptions()` is used. 250*9b92b1d3SBarry Smith- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type 251*9b92b1d3SBarry Smith cuda` and `-dm_mat_type aijcusparse`. 252*9b92b1d3SBarry Smith 253*9b92b1d3SBarry SmithQuick summary of usage with OpenCL (provided by the ViennaCL library): 254*9b92b1d3SBarry Smith 255*9b92b1d3SBarry Smith- The `VecType` `VECSEQVIENNACL`, `VECMPIVIENNACL`, or `VECVIENNACL` may be used 256*9b92b1d3SBarry Smith with `VecSetType()` or `-vec_type seqviennacl`, `mpiviennacl`, or `viennacl` 257*9b92b1d3SBarry Smith when `VecSetFromOptions()` is used. 258*9b92b1d3SBarry Smith- The `MatType` `MATSEQAIJVIENNACL`, `MATMPIAIJVIENNACL`, or `MATAIJVIENNACL` 259*9b92b1d3SBarry Smith may be used with `MatSetType()` or `-mat_type seqaijviennacl`, `mpiaijviennacl`, or 260*9b92b1d3SBarry Smith `aijviennacl` when `MatSetFromOptions()` is used. 261*9b92b1d3SBarry Smith- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type 262*9b92b1d3SBarry Smith viennacl` and `-dm_mat_type aijviennacl`. 263*9b92b1d3SBarry Smith 264*9b92b1d3SBarry SmithGeneral hints: 265*9b92b1d3SBarry Smith 266*9b92b1d3SBarry Smith- It is useful to develop your code with the default vectors and then run production runs 267*9b92b1d3SBarry Smith with the command line options to use the GPU since debugging on GPUs is difficult. 268*9b92b1d3SBarry Smith- All of the Krylov methods except `KSPIBCGS` run on the GPU. 269*9b92b1d3SBarry Smith- Parts of most preconditioners run directly on the GPU. After setup, `PCGAMG` runs 270*9b92b1d3SBarry Smith fully on GPUs, without any memory copies between the CPU and GPU. 271*9b92b1d3SBarry Smith 272*9b92b1d3SBarry SmithSome GPU systems (for example many laptops) only run with single precision; thus, PETSc 273*9b92b1d3SBarry Smithmust be built with the `configure` option `--with-precision=single`. 274*9b92b1d3SBarry Smith 275*9b92b1d3SBarry Smith(doc_faq_extendedprecision)= 276*9b92b1d3SBarry Smith 277*9b92b1d3SBarry Smith### Can I run PETSc with extended precision? 278*9b92b1d3SBarry Smith 279*9b92b1d3SBarry SmithYes, with gcc and gfortran. `configure` PETSc using the 280*9b92b1d3SBarry Smithoptions `--with-precision=__float128` and `--download-f2cblaslapack`. 281*9b92b1d3SBarry Smith 282*9b92b1d3SBarry Smith:::{admonition} Warning 283*9b92b1d3SBarry Smith:class: yellow 284*9b92b1d3SBarry Smith 285*9b92b1d3SBarry SmithExternal packages are not guaranteed to work in this mode! 286*9b92b1d3SBarry Smith::: 287*9b92b1d3SBarry Smith 288*9b92b1d3SBarry Smith### Why doesn't PETSc use Qd to implement support for extended precision? 289*9b92b1d3SBarry Smith 290*9b92b1d3SBarry SmithWe tried really hard but could not. The problem is that the QD c++ classes, though they 291*9b92b1d3SBarry Smithtry to, implement the built-in data types of `double` are not native types and cannot 292*9b92b1d3SBarry Smith"just be used" in a general piece of numerical source code. Ratherm the code has to 293*9b92b1d3SBarry Smithrewritten to live within the limitations of QD classes. However PETSc can be built to use 294*9b92b1d3SBarry Smithquad precision, as detailed {ref}`here <doc_faq_extendedprecision>`. 295*9b92b1d3SBarry Smith 296*9b92b1d3SBarry Smith### How do I cite PETSc? 297*9b92b1d3SBarry Smith 298*9b92b1d3SBarry SmithUse {any}`these citations <doc_index_citing_petsc>`. 299*9b92b1d3SBarry Smith 300*9b92b1d3SBarry Smith______________________________________________________________________ 301*9b92b1d3SBarry Smith 302*9b92b1d3SBarry Smith## Installation 303*9b92b1d3SBarry Smith 304*9b92b1d3SBarry Smith### How do I begin using PETSc if the software has already been completely built and installed by someone else? 305*9b92b1d3SBarry Smith 306*9b92b1d3SBarry SmithAssuming that the PETSc libraries have been successfully built for a particular 307*9b92b1d3SBarry Smitharchitecture and level of optimization, a new user must merely: 308*9b92b1d3SBarry Smith 309*9b92b1d3SBarry Smith1. Set `PETSC_DIR` to the full path of the PETSc home 310*9b92b1d3SBarry Smith directory. This will be the location of the `configure` script, and usually called 311*9b92b1d3SBarry Smith "petsc" or some variation of that (for example, /home/username/petsc). 312*9b92b1d3SBarry Smith2. Set `PETSC_ARCH`, which indicates the configuration on which PETSc will be 313*9b92b1d3SBarry Smith used. Note that `$PETSC_ARCH` is simply a name the installer used when installing 314*9b92b1d3SBarry Smith the libraries. There will exist a directory within `$PETSC_DIR` that is named after 315*9b92b1d3SBarry Smith its corresponding `$PETSC_ARCH`. There many be several on a single system, for 316*9b92b1d3SBarry Smith example "linux-c-debug" for the debug versions compiled by a C compiler or 317*9b92b1d3SBarry Smith "linux-c-opt" for the optimized version. 318*9b92b1d3SBarry Smith 319*9b92b1d3SBarry Smith:::{admonition} Still Stuck? 320*9b92b1d3SBarry SmithSee the {ref}`quick-start tutorial <tut_install>` for a step-by-step guide on 321*9b92b1d3SBarry Smithinstalling PETSc, in case you have missed a step. 322*9b92b1d3SBarry Smith 323*9b92b1d3SBarry SmithSee the users manual section on {ref}`getting started <sec-getting-started>`. 324*9b92b1d3SBarry Smith::: 325*9b92b1d3SBarry Smith 326*9b92b1d3SBarry Smith### The PETSc distribution is SO Large. How can I reduce my disk space usage? 327*9b92b1d3SBarry Smith 328*9b92b1d3SBarry Smith1. The PETSc users manual is provided in PDF format at `$PETSC_DIR/manual.pdf`. You 329*9b92b1d3SBarry Smith can delete this. 330*9b92b1d3SBarry Smith2. The PETSc test suite contains sample output for many of the examples. These are 331*9b92b1d3SBarry Smith contained in the PETSc directories `$PETSC_DIR/src/*/tutorials/output` and 332*9b92b1d3SBarry Smith `$PETSC_DIR/src/*/tests/output`. Once you have run the test examples, you may remove 333*9b92b1d3SBarry Smith all of these directories to save some disk space. You can locate the largest with 334*9b92b1d3SBarry Smith e.g. `find . -name output -type d | xargs du -sh | sort -hr` on a Unix-based system. 335*9b92b1d3SBarry Smith3. The debugging versions of the libraries are larger than the optimized versions. In a 336*9b92b1d3SBarry Smith pinch you can work with the optimized version, although we bid you good luck in 337*9b92b1d3SBarry Smith finding bugs as it is much easier with the debug version. 338*9b92b1d3SBarry Smith 339*9b92b1d3SBarry Smith### I want to use PETSc only for uniprocessor programs. Must I still install and use a version of MPI? 340*9b92b1d3SBarry Smith 341*9b92b1d3SBarry SmithNo, run `configure` with the option `--with-mpi=0` 342*9b92b1d3SBarry Smith 343*9b92b1d3SBarry Smith### Can I install PETSc to not use X windows (either under Unix or Microsoft Windows with GCC)? 344*9b92b1d3SBarry Smith 345*9b92b1d3SBarry SmithYes. Run `configure` with the additional flag `--with-x=0` 346*9b92b1d3SBarry Smith 347*9b92b1d3SBarry Smith### Why do you use MPI? 348*9b92b1d3SBarry Smith 349*9b92b1d3SBarry SmithMPI is the message-passing standard. Because it is a standard, it will not frequently change over 350*9b92b1d3SBarry Smithtime; thus, we do not have to change PETSc every time the provider of the message-passing 351*9b92b1d3SBarry Smithsystem decides to make an interface change. MPI was carefully designed by experts from 352*9b92b1d3SBarry Smithindustry, academia, and government labs to provide the highest quality performance and 353*9b92b1d3SBarry Smithcapability. 354*9b92b1d3SBarry Smith 355*9b92b1d3SBarry SmithFor example, the careful design of communicators in MPI allows the easy nesting of 356*9b92b1d3SBarry Smithdifferent libraries; no other message-passing system provides this support. All of the 357*9b92b1d3SBarry Smithmajor parallel computer vendors were involved in the design of MPI and have committed to 358*9b92b1d3SBarry Smithproviding quality implementations. 359*9b92b1d3SBarry Smith 360*9b92b1d3SBarry SmithIn addition, since MPI is a standard, several different groups have already provided 361*9b92b1d3SBarry Smithcomplete free implementations. Thus, one does not have to rely on the technical skills of 362*9b92b1d3SBarry Smithone particular group to provide the message-passing libraries. Today, MPI is the only 363*9b92b1d3SBarry Smithpractical, portable approach to writing efficient parallel numerical software. 364*9b92b1d3SBarry Smith 365*9b92b1d3SBarry Smith(invalid_mpi_compilers)= 366*9b92b1d3SBarry Smith 367*9b92b1d3SBarry Smith### What do I do if my MPI compiler wrappers are invalid? 368*9b92b1d3SBarry Smith 369*9b92b1d3SBarry SmithMost MPI implementations provide compiler wrappers (such as `mpicc`) which give the 370*9b92b1d3SBarry Smithinclude and link options necessary to use that version of MPI to the underlying compilers 371*9b92b1d3SBarry Smith. Configuration will fail if these wrappers are either absent or broken in the MPI pointed to by 372*9b92b1d3SBarry Smith`--with-mpi-dir`. You can rerun the configure with the additional option 373*9b92b1d3SBarry Smith`--with-mpi-compilers=0`, which will try to auto-detect working compilers; however, 374*9b92b1d3SBarry Smiththese compilers may be incompatible with the particular MPI build. If this fix does not 375*9b92b1d3SBarry Smithwork, run with `--with-cc=[your_c_compiler]` where you know `your_c_compiler` works 376*9b92b1d3SBarry Smithwith this particular MPI, and likewise for C++ (`--with-cxx=[your_cxx_compiler]`) and Fortran 377*9b92b1d3SBarry Smith(`--with-fc=[your_fortran_compiler]`). 378*9b92b1d3SBarry Smith 379*9b92b1d3SBarry Smith(bit_indices)= 380*9b92b1d3SBarry Smith 381*9b92b1d3SBarry Smith### When should/can I use the `configure` option `--with-64-bit-indices`? 382*9b92b1d3SBarry Smith 383*9b92b1d3SBarry SmithBy default the type that PETSc uses to index into arrays and keep sizes of arrays is a 384*9b92b1d3SBarry Smith`PetscInt` defined to be a 32-bit `int`. If your problem: 385*9b92b1d3SBarry Smith 386*9b92b1d3SBarry Smith- Involves more than 2^31 - 1 unknowns (around 2 billion). 387*9b92b1d3SBarry Smith- Your matrix might contain more than 2^31 - 1 nonzeros on a single process. 388*9b92b1d3SBarry Smith 389*9b92b1d3SBarry SmithThen you need to use this option. Otherwise you will get strange crashes. 390*9b92b1d3SBarry Smith 391*9b92b1d3SBarry SmithThis option can be used when you are using either 32 or 64-bit pointers. You do not 392*9b92b1d3SBarry Smithneed to use this option if you are using 64-bit pointers unless the two conditions above 393*9b92b1d3SBarry Smithhold. 394*9b92b1d3SBarry Smith 395*9b92b1d3SBarry Smith### What if I get an internal compiler error? 396*9b92b1d3SBarry Smith 397*9b92b1d3SBarry SmithYou can rebuild the offending file individually with a lower optimization level. **Then 398*9b92b1d3SBarry Smithmake sure to complain to the compiler vendor and file a bug report**. For example, if the 399*9b92b1d3SBarry Smithcompiler chokes on `src/mat/impls/baij/seq/baijsolvtrannat.c` you can run the following 400*9b92b1d3SBarry Smithfrom `$PETSC_DIR`: 401*9b92b1d3SBarry Smith 402*9b92b1d3SBarry Smith```console 403*9b92b1d3SBarry Smith$ make -f gmakefile PCC_FLAGS="-O1" $PETSC_ARCH/obj/src/mat/impls/baij/seq/baijsolvtrannat.o 404*9b92b1d3SBarry Smith$ make all 405*9b92b1d3SBarry Smith``` 406*9b92b1d3SBarry Smith 407*9b92b1d3SBarry Smith### How do I enable Python bindings (petsc4py) with PETSc? 408*9b92b1d3SBarry Smith 409*9b92b1d3SBarry Smith1. Install [Cython](https://cython.org/). 410*9b92b1d3SBarry Smith2. `configure` PETSc with the `--with-petsc4py=1` option. 411*9b92b1d3SBarry Smith3. set `PYTHONPATH=$PETSC_DIR/$PETSC_ARCH/lib` 412*9b92b1d3SBarry Smith 413*9b92b1d3SBarry Smith(macos_gfortran)= 414*9b92b1d3SBarry Smith 415*9b92b1d3SBarry Smith### What Fortran compiler do you recommend on macOS? 416*9b92b1d3SBarry Smith 417*9b92b1d3SBarry SmithWe recommend using [homebrew](https://brew.sh/) to install [gfortran](https://gcc.gnu.org/wiki/GFortran), see {any}`doc_macos_install` 418*9b92b1d3SBarry Smith 419*9b92b1d3SBarry Smith### How can I find the URL locations of the packages you install using `--download-PACKAGE`? 420*9b92b1d3SBarry Smith 421*9b92b1d3SBarry Smith```console 422*9b92b1d3SBarry Smith$ grep "self.download " $PETSC_DIR/config/BuildSystem/config/packages/*.py 423*9b92b1d3SBarry Smith``` 424*9b92b1d3SBarry Smith 425*9b92b1d3SBarry Smith### How to fix the problem: PETSc was configured with one MPICH (or Open MPI) `mpi.h` version but now appears to be compiling using a different MPICH (or Open MPI) `mpi.h` version 426*9b92b1d3SBarry Smith 427*9b92b1d3SBarry SmithThis happens for generally one of two reasons: 428*9b92b1d3SBarry Smith 429*9b92b1d3SBarry Smith- You previously ran `configure` with the option `--download-mpich` (or `--download-openmpi`) 430*9b92b1d3SBarry Smith but later ran `configure` to use a version of MPI already installed on the 431*9b92b1d3SBarry Smith machine. Solution: 432*9b92b1d3SBarry Smith 433*9b92b1d3SBarry Smith ```console 434*9b92b1d3SBarry Smith $ rm -rf $PETSC_DIR/$PETSC_ARCH 435*9b92b1d3SBarry Smith $ ./configure --your-args 436*9b92b1d3SBarry Smith ``` 437*9b92b1d3SBarry Smith 438*9b92b1d3SBarry Smith(mpi_network_misconfigure)= 439*9b92b1d3SBarry Smith 440*9b92b1d3SBarry Smith### What does it mean when `make check` hangs or errors on `PetscOptionsInsertFile()`? 441*9b92b1d3SBarry Smith 442*9b92b1d3SBarry SmithFor example: 443*9b92b1d3SBarry Smith 444*9b92b1d3SBarry Smith```none 445*9b92b1d3SBarry SmithPossible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes 446*9b92b1d3SBarry SmithSee https://petsc.org/release/faq/ 447*9b92b1d3SBarry Smith[0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c 448*9b92b1d3SBarry Smith[0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c 449*9b92b1d3SBarry Smith[0]PETSC ERROR: #3 PetscInitialize() line 828 in /Users/barrysmith/Src/PETSc/src/sys/objects/pinit.c 450*9b92b1d3SBarry Smith``` 451*9b92b1d3SBarry Smith 452*9b92b1d3SBarry Smithor 453*9b92b1d3SBarry Smith 454*9b92b1d3SBarry Smith```none 455*9b92b1d3SBarry Smith$ make check 456*9b92b1d3SBarry SmithRunning check examples to verify correct installation 457*9b92b1d3SBarry SmithUsing PETSC_DIR=/Users/barrysmith/Src/petsc and PETSC_ARCH=arch-fix-mpiexec-hang-2-ranks 458*9b92b1d3SBarry SmithC/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process 459*9b92b1d3SBarry SmithPROGRAM SEEMS TO BE HANGING HERE 460*9b92b1d3SBarry Smith``` 461*9b92b1d3SBarry Smith 462*9b92b1d3SBarry SmithThis usually occurs when network settings are misconfigured (perhaps due to VPN) resulting in a failure or hang in system call `gethostbyname()`. 463*9b92b1d3SBarry Smith 464*9b92b1d3SBarry Smith- Verify you are using the correct `mpiexec` for the MPI you have linked PETSc with. 465*9b92b1d3SBarry Smith 466*9b92b1d3SBarry Smith- If you have a VPN enabled on your machine, try turning it off and then running `make check` to 467*9b92b1d3SBarry Smith verify that it is not the VPN playing poorly with MPI. 468*9b92b1d3SBarry Smith 469*9b92b1d3SBarry Smith- If ``` ping `hostname` `` ( ```/sbin/ping\`\` on macOS) fails or hangs do: 470*9b92b1d3SBarry Smith 471*9b92b1d3SBarry Smith ```none 472*9b92b1d3SBarry Smith echo 127.0.0.1 `hostname` | sudo tee -a /etc/hosts 473*9b92b1d3SBarry Smith ``` 474*9b92b1d3SBarry Smith 475*9b92b1d3SBarry Smith and try `make check` again. 476*9b92b1d3SBarry Smith 477*9b92b1d3SBarry Smith- Try completely disconnecting your machine from the network and see if `make check` then works 478*9b92b1d3SBarry Smith 479*9b92b1d3SBarry Smith- Try the PETSc `configure` option `--download-mpich-device=ch3:nemesis` with `--download-mpich`. 480*9b92b1d3SBarry Smith 481*9b92b1d3SBarry Smith______________________________________________________________________ 482*9b92b1d3SBarry Smith 483*9b92b1d3SBarry Smith## Usage 484*9b92b1d3SBarry Smith 485*9b92b1d3SBarry Smith### How can I redirect PETSc's `stdout` and `stderr` when programming with a GUI interface in Windows Developer Studio or to C++ streams? 486*9b92b1d3SBarry Smith 487*9b92b1d3SBarry SmithTo overload just the error messages write your own `MyPrintError()` function that does 488*9b92b1d3SBarry Smithwhatever you want (including pop up windows etc) and use it like below. 489*9b92b1d3SBarry Smith 490*9b92b1d3SBarry Smith```c 491*9b92b1d3SBarry Smithextern "C" { 492*9b92b1d3SBarry Smith int PASCAL WinMain(HINSTANCE,HINSTANCE,LPSTR,int); 493*9b92b1d3SBarry Smith}; 494*9b92b1d3SBarry Smith 495*9b92b1d3SBarry Smith#include <petscsys.h> 496*9b92b1d3SBarry Smith#include <mpi.h> 497*9b92b1d3SBarry Smith 498*9b92b1d3SBarry Smithconst char help[] = "Set up from main"; 499*9b92b1d3SBarry Smith 500*9b92b1d3SBarry Smithint MyPrintError(const char error[], ...) 501*9b92b1d3SBarry Smith{ 502*9b92b1d3SBarry Smith printf("%s", error); 503*9b92b1d3SBarry Smith return 0; 504*9b92b1d3SBarry Smith} 505*9b92b1d3SBarry Smith 506*9b92b1d3SBarry Smithint main(int ac, char *av[]) 507*9b92b1d3SBarry Smith{ 508*9b92b1d3SBarry Smith char buf[256]; 509*9b92b1d3SBarry Smith HINSTANCE inst; 510*9b92b1d3SBarry Smith PetscErrorCode ierr; 511*9b92b1d3SBarry Smith 512*9b92b1d3SBarry Smith inst = (HINSTANCE)GetModuleHandle(NULL); 513*9b92b1d3SBarry Smith PetscErrorPrintf = MyPrintError; 514*9b92b1d3SBarry Smith 515*9b92b1d3SBarry Smith buf[0] = 0; 516*9b92b1d3SBarry Smith for (int i = 1; i < ac; ++i) { 517*9b92b1d3SBarry Smith strcat(buf, av[i]); 518*9b92b1d3SBarry Smith strcat(buf, " "); 519*9b92b1d3SBarry Smith } 520*9b92b1d3SBarry Smith 521*9b92b1d3SBarry Smith PetscCall(PetscInitialize(&ac, &av, NULL, help)); 522*9b92b1d3SBarry Smith 523*9b92b1d3SBarry Smith return WinMain(inst, NULL, buf, SW_SHOWNORMAL); 524*9b92b1d3SBarry Smith} 525*9b92b1d3SBarry Smith``` 526*9b92b1d3SBarry Smith 527*9b92b1d3SBarry SmithPlace this file in the project and compile with this preprocessor definitions: 528*9b92b1d3SBarry Smith 529*9b92b1d3SBarry Smith``` 530*9b92b1d3SBarry SmithWIN32 531*9b92b1d3SBarry Smith_DEBUG 532*9b92b1d3SBarry Smith_CONSOLE 533*9b92b1d3SBarry Smith_MBCS 534*9b92b1d3SBarry SmithUSE_PETSC_LOG 535*9b92b1d3SBarry SmithUSE_PETSC_BOPT_g 536*9b92b1d3SBarry SmithUSE_PETSC_STACK 537*9b92b1d3SBarry Smith_AFXDLL 538*9b92b1d3SBarry Smith``` 539*9b92b1d3SBarry Smith 540*9b92b1d3SBarry SmithAnd these link options: 541*9b92b1d3SBarry Smith 542*9b92b1d3SBarry Smith``` 543*9b92b1d3SBarry Smith/nologo 544*9b92b1d3SBarry Smith/subsystem:console 545*9b92b1d3SBarry Smith/incremental:yes 546*9b92b1d3SBarry Smith/debug 547*9b92b1d3SBarry Smith/machine:I386 548*9b92b1d3SBarry Smith/nodefaultlib:"libcmtd.lib" 549*9b92b1d3SBarry Smith/nodefaultlib:"libcd.lib" 550*9b92b1d3SBarry Smith/nodefaultlib:"mvcrt.lib" 551*9b92b1d3SBarry Smith/pdbtype:sept 552*9b92b1d3SBarry Smith``` 553*9b92b1d3SBarry Smith 554*9b92b1d3SBarry Smith:::{note} 555*9b92b1d3SBarry SmithThe above is compiled and linked as if it was a console program. The linker will search 556*9b92b1d3SBarry Smithfor a main, and then from it the `WinMain` will start. This works with MFC templates and 557*9b92b1d3SBarry Smithderived classes too. 558*9b92b1d3SBarry Smith 559*9b92b1d3SBarry SmithWhen writing a Window's console application you do not need to do anything, the `stdout` 560*9b92b1d3SBarry Smithand `stderr` is automatically output to the console window. 561*9b92b1d3SBarry Smith::: 562*9b92b1d3SBarry Smith 563*9b92b1d3SBarry SmithTo change where all PETSc `stdout` and `stderr` go, (you can also reassign 564*9b92b1d3SBarry Smith`PetscVFPrintf()` to handle `stdout` and `stderr` any way you like) write the 565*9b92b1d3SBarry Smithfollowing function: 566*9b92b1d3SBarry Smith 567*9b92b1d3SBarry Smith``` 568*9b92b1d3SBarry SmithPetscErrorCode mypetscvfprintf(FILE *fd, const char format[], va_list Argp) 569*9b92b1d3SBarry Smith{ 570*9b92b1d3SBarry Smith PetscFunctionBegin; 571*9b92b1d3SBarry Smith if (fd != stdout && fd != stderr) { /* handle regular files */ 572*9b92b1d3SBarry Smith PetscCall(PetscVFPrintfDefault(fd, format, Argp)); 573*9b92b1d3SBarry Smith } else { 574*9b92b1d3SBarry Smith char buff[1024]; /* Make sure to assign a large enough buffer */ 575*9b92b1d3SBarry Smith int length; 576*9b92b1d3SBarry Smith 577*9b92b1d3SBarry Smith PetscCall(PetscVSNPrintf(buff, 1024, format, &length, Argp)); 578*9b92b1d3SBarry Smith 579*9b92b1d3SBarry Smith /* now send buff to whatever stream or whatever you want */ 580*9b92b1d3SBarry Smith } 581*9b92b1d3SBarry Smith PetscFunctionReturn(PETSC_SUCCESS); 582*9b92b1d3SBarry Smith} 583*9b92b1d3SBarry Smith``` 584*9b92b1d3SBarry Smith 585*9b92b1d3SBarry SmithThen assign `PetscVFPrintf = mypetscprintf` before `PetscInitialize()` in your main 586*9b92b1d3SBarry Smithprogram. 587*9b92b1d3SBarry Smith 588*9b92b1d3SBarry Smith### I want to use Hypre boomerAMG without GMRES but when I run `-pc_type hypre -pc_hypre_type boomeramg -ksp_type preonly` I don't get a very accurate answer! 589*9b92b1d3SBarry Smith 590*9b92b1d3SBarry SmithYou should run with `-ksp_type richardson` to have PETSc run several V or W 591*9b92b1d3SBarry Smithcycles. `-ksp_type preonly` causes boomerAMG to use only one V/W cycle. You can control 592*9b92b1d3SBarry Smithhow many cycles are used in a single application of the boomerAMG preconditioner with 593*9b92b1d3SBarry Smith`-pc_hypre_boomeramg_max_iter <it>` (the default is 1). You can also control the 594*9b92b1d3SBarry Smithtolerance boomerAMG uses to decide if to stop before `max_iter` with 595*9b92b1d3SBarry Smith`-pc_hypre_boomeramg_tol <tol>` (the default is 1.e-7). Run with `-ksp_view` to see 596*9b92b1d3SBarry Smithall the hypre options used and `-help | grep boomeramg` to see all the command line 597*9b92b1d3SBarry Smithoptions. 598*9b92b1d3SBarry Smith 599*9b92b1d3SBarry Smith### How do I use PETSc for Domain Decomposition? 600*9b92b1d3SBarry Smith 601*9b92b1d3SBarry SmithPETSc includes Additive Schwarz methods in the suite of preconditioners under the umbrella 602*9b92b1d3SBarry Smithof `PCASM`. These may be activated with the runtime option `-pc_type asm`. Various 603*9b92b1d3SBarry Smithother options may be set, including the degree of overlap `-pc_asm_overlap <number>` the 604*9b92b1d3SBarry Smithtype of restriction/extension `-pc_asm_type [basic,restrict,interpolate,none]` sets ASM 605*9b92b1d3SBarry Smithtype and several others. You may see the available ASM options by using `-pc_type asm 606*9b92b1d3SBarry Smith-help`. See the procedural interfaces in the manual pages, for example `PCASMType()` 607*9b92b1d3SBarry Smithand check the index of the users manual for `PCASMCreateSubdomains()`. 608*9b92b1d3SBarry Smith 609*9b92b1d3SBarry SmithPETSc also contains a domain decomposition inspired wirebasket or face based two level 610*9b92b1d3SBarry Smithmethod where the coarse mesh to fine mesh interpolation is defined by solving specific 611*9b92b1d3SBarry Smithlocal subdomain problems. It currently only works for 3D scalar problems on structured 612*9b92b1d3SBarry Smithgrids created with PETSc `DMDA`. See the manual page for `PCEXOTIC` and 613*9b92b1d3SBarry Smith`src/ksp/ksp/tutorials/ex45.c` for an example. 614*9b92b1d3SBarry Smith 615*9b92b1d3SBarry SmithPETSc also contains a balancing Neumann-Neumann type preconditioner, see the manual page 616*9b92b1d3SBarry Smithfor `PCBDDC`. This requires matrices be constructed with `MatCreateIS()` via the finite 617*9b92b1d3SBarry Smithelement method. See `src/ksp/ksp/tests/ex59.c` for an example. 618*9b92b1d3SBarry Smith 619*9b92b1d3SBarry Smith### You have `MATAIJ` and `MATBAIJ` matrix formats, and `MATSBAIJ` for symmetric storage, how come no `MATSAIJ`? 620*9b92b1d3SBarry Smith 621*9b92b1d3SBarry SmithJust for historical reasons; the `MATSBAIJ` format with blocksize one is just as efficient as 622*9b92b1d3SBarry Smitha `MATSAIJ` would be. 623*9b92b1d3SBarry Smith 624*9b92b1d3SBarry Smith### Can I Create BAIJ matrices with different size blocks for different block rows? 625*9b92b1d3SBarry Smith 626*9b92b1d3SBarry SmithNo. The `MATBAIJ` format only supports a single fixed block size on the entire 627*9b92b1d3SBarry Smithmatrix. But the `MATAIJ` format automatically searches for matching rows and thus still 628*9b92b1d3SBarry Smithtakes advantage of the natural blocks in your matrix to obtain good performance. 629*9b92b1d3SBarry Smith 630*9b92b1d3SBarry Smith:::{note} 631*9b92b1d3SBarry SmithIf you use `MATAIJ` you cannot use the `MatSetValuesBlocked()`. 632*9b92b1d3SBarry Smith::: 633*9b92b1d3SBarry Smith 634*9b92b1d3SBarry Smith### How do I access the values of a remote parallel PETSc Vec? 635*9b92b1d3SBarry Smith 636*9b92b1d3SBarry Smith1. On each process create a local `Vec` large enough to hold all the values it wishes to 637*9b92b1d3SBarry Smith access. 638*9b92b1d3SBarry Smith2. Create a `VecScatter` that scatters from the parallel `Vec` into the local `Vec`. 639*9b92b1d3SBarry Smith3. Use `VecGetArray()` to access the values in the local `Vec`. 640*9b92b1d3SBarry Smith 641*9b92b1d3SBarry SmithFor example, assuming we have distributed a vector `vecGlobal` of size $N$ to 642*9b92b1d3SBarry Smith$R$ ranks and each remote rank holds $N/R = m$ values (similarly assume that 643*9b92b1d3SBarry Smith$N$ is cleanly divisible by $R$). We want each rank $r$ to gather the 644*9b92b1d3SBarry Smithfirst $n$ (also assume $n \leq m$) values from its immediately superior neighbor 645*9b92b1d3SBarry Smith$r+1$ (final rank will retrieve from rank 0). 646*9b92b1d3SBarry Smith 647*9b92b1d3SBarry Smith``` 648*9b92b1d3SBarry SmithVec vecLocal; 649*9b92b1d3SBarry SmithIS isLocal, isGlobal; 650*9b92b1d3SBarry SmithVecScatter ctx; 651*9b92b1d3SBarry SmithPetscScalar *arr; 652*9b92b1d3SBarry SmithPetscInt N, firstGlobalIndex; 653*9b92b1d3SBarry SmithMPI_Comm comm; 654*9b92b1d3SBarry SmithPetscMPIInt r, R; 655*9b92b1d3SBarry Smith 656*9b92b1d3SBarry Smith/* Create sequential local vector, big enough to hold local portion */ 657*9b92b1d3SBarry SmithPetscCall(VecCreateSeq(PETSC_COMM_SELF, n, &vecLocal)); 658*9b92b1d3SBarry Smith 659*9b92b1d3SBarry Smith/* Create IS to describe where we want to scatter to */ 660*9b92b1d3SBarry SmithPetscCall(ISCreateStride(PETSC_COMM_SELF, n, 0, 1, &isLocal)); 661*9b92b1d3SBarry Smith 662*9b92b1d3SBarry Smith/* Compute the global indices */ 663*9b92b1d3SBarry SmithPetscCall(VecGetSize(vecGlobal, &N)); 664*9b92b1d3SBarry SmithPetscCall(PetscObjectGetComm((PetscObject) vecGlobal, &comm)); 665*9b92b1d3SBarry SmithPetscCallMPI(MPI_Comm_rank(comm, &r)); 666*9b92b1d3SBarry SmithPetscCallMPI(MPI_Comm_size(comm, &R)); 667*9b92b1d3SBarry SmithfirstGlobalIndex = r == R-1 ? 0 : (N/R)*(r+1); 668*9b92b1d3SBarry Smith 669*9b92b1d3SBarry Smith/* Create IS that describes where we want to scatter from */ 670*9b92b1d3SBarry SmithPetscCall(ISCreateStride(comm, n, firstGlobalIndex, 1, &isGlobal)); 671*9b92b1d3SBarry Smith 672*9b92b1d3SBarry Smith/* Create the VecScatter context */ 673*9b92b1d3SBarry SmithPetscCall(VecScatterCreate(vecGlobal, isGlobal, vecLocal, isLocal, &ctx)); 674*9b92b1d3SBarry Smith 675*9b92b1d3SBarry Smith/* Gather the values */ 676*9b92b1d3SBarry SmithPetscCall(VecScatterBegin(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD)); 677*9b92b1d3SBarry SmithPetscCall(VecScatterEnd(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD)); 678*9b92b1d3SBarry Smith 679*9b92b1d3SBarry Smith/* Retrieve and do work */ 680*9b92b1d3SBarry SmithPetscCall(VecGetArray(vecLocal, &arr)); 681*9b92b1d3SBarry Smith/* Work */ 682*9b92b1d3SBarry SmithPetscCall(VecRestoreArray(vecLocal, &arr)); 683*9b92b1d3SBarry Smith 684*9b92b1d3SBarry Smith/* Don't forget to clean up */ 685*9b92b1d3SBarry SmithPetscCall(ISDestroy(&isLocal)); 686*9b92b1d3SBarry SmithPetscCall(ISDestroy(&isGlobal)); 687*9b92b1d3SBarry SmithPetscCall(VecScatterDestroy(&ctx)); 688*9b92b1d3SBarry SmithPetscCall(VecDestroy(&vecLocal)); 689*9b92b1d3SBarry Smith``` 690*9b92b1d3SBarry Smith 691*9b92b1d3SBarry Smith(doc_faq_usage_alltoone)= 692*9b92b1d3SBarry Smith 693*9b92b1d3SBarry Smith### How do I collect to a single processor all the values from a parallel PETSc Vec? 694*9b92b1d3SBarry Smith 695*9b92b1d3SBarry Smith1. Create the `VecScatter` context that will do the communication: 696*9b92b1d3SBarry Smith 697*9b92b1d3SBarry Smith ``` 698*9b92b1d3SBarry Smith Vec in_par,out_seq; 699*9b92b1d3SBarry Smith VecScatter ctx; 700*9b92b1d3SBarry Smith 701*9b92b1d3SBarry Smith PetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq)); 702*9b92b1d3SBarry Smith ``` 703*9b92b1d3SBarry Smith 704*9b92b1d3SBarry Smith2. Initiate the communication (this may be repeated if you wish): 705*9b92b1d3SBarry Smith 706*9b92b1d3SBarry Smith ``` 707*9b92b1d3SBarry Smith PetscCall(VecScatterBegin(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD)); 708*9b92b1d3SBarry Smith PetscCall(VecScatterEnd(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD)); 709*9b92b1d3SBarry Smith /* May destroy context now if no additional scatters are needed, otherwise reuse it */ 710*9b92b1d3SBarry Smith PetscCall(VecScatterDestroy(&ctx)); 711*9b92b1d3SBarry Smith ``` 712*9b92b1d3SBarry Smith 713*9b92b1d3SBarry SmithNote that this simply concatenates in the parallel ordering of the vector (computed by the 714*9b92b1d3SBarry Smith`MPI_Comm_rank` of the owning process). If you are using a `Vec` from 715*9b92b1d3SBarry Smith`DMCreateGlobalVector()` you likely want to first call `DMDAGlobalToNaturalBegin()` 716*9b92b1d3SBarry Smithfollowed by `DMDAGlobalToNaturalEnd()` to scatter the original `Vec` into the natural 717*9b92b1d3SBarry Smithordering in a new global `Vec` before calling `VecScatterBegin()`/`VecScatterEnd()` 718*9b92b1d3SBarry Smithto scatter the natural `Vec` onto all processes. 719*9b92b1d3SBarry Smith 720*9b92b1d3SBarry Smith### How do I collect all the values from a parallel PETSc Vec on the 0th rank? 721*9b92b1d3SBarry Smith 722*9b92b1d3SBarry SmithSee FAQ entry on collecting to {ref}`an arbitrary processor <doc_faq_usage_alltoone>`, but 723*9b92b1d3SBarry Smithreplace 724*9b92b1d3SBarry Smith 725*9b92b1d3SBarry Smith``` 726*9b92b1d3SBarry SmithPetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq)); 727*9b92b1d3SBarry Smith``` 728*9b92b1d3SBarry Smith 729*9b92b1d3SBarry Smithwith 730*9b92b1d3SBarry Smith 731*9b92b1d3SBarry Smith``` 732*9b92b1d3SBarry SmithPetscCall(VecScatterCreateToZero(in_par, &ctx, &out_seq)); 733*9b92b1d3SBarry Smith``` 734*9b92b1d3SBarry Smith 735*9b92b1d3SBarry Smith:::{note} 736*9b92b1d3SBarry SmithThe same ordering considerations as discussed in the aforementioned entry also apply 737*9b92b1d3SBarry Smithhere. 738*9b92b1d3SBarry Smith::: 739*9b92b1d3SBarry Smith 740*9b92b1d3SBarry Smith### How can I read in or write out a sparse matrix in Matrix Market, Harwell-Boeing, Slapc or other ASCII format? 741*9b92b1d3SBarry Smith 742*9b92b1d3SBarry SmithIf you can read or write your matrix using Python or MATLAB/Octave, `PetscBinaryIO` 743*9b92b1d3SBarry Smithmodules are provided at `$PETSC_DIR/lib/petsc/bin` for each language that can assist 744*9b92b1d3SBarry Smithwith reading and writing. If you just want to convert `MatrixMarket`, you can use: 745*9b92b1d3SBarry Smith 746*9b92b1d3SBarry Smith```console 747*9b92b1d3SBarry Smith$ python -m $PETSC_DIR/lib/petsc/bin/PetscBinaryIO convert matrix.mtx 748*9b92b1d3SBarry Smith``` 749*9b92b1d3SBarry Smith 750*9b92b1d3SBarry SmithTo produce `matrix.petsc`. 751*9b92b1d3SBarry Smith 752*9b92b1d3SBarry SmithYou can also call the script directly or import it from your Python code. There is also a 753*9b92b1d3SBarry Smith`PETScBinaryIO.jl` Julia package. 754*9b92b1d3SBarry Smith 755*9b92b1d3SBarry SmithFor other formats, either adapt one of the above libraries or see the examples in 756*9b92b1d3SBarry Smith`$PETSC_DIR/src/mat/tests`, specifically `ex72.c` or `ex78.c`. You will likely need 757*9b92b1d3SBarry Smithto modify the code slightly to match your required ASCII format. 758*9b92b1d3SBarry Smith 759*9b92b1d3SBarry Smith:::{note} 760*9b92b1d3SBarry SmithNever read or write in parallel an ASCII matrix file. 761*9b92b1d3SBarry Smith 762*9b92b1d3SBarry SmithInstead read in sequentially with a standalone code based on `ex72.c` or `ex78.c` 763*9b92b1d3SBarry Smiththen save the matrix with the binary viewer `PetscViewerBinaryOpen()` and load the 764*9b92b1d3SBarry Smithmatrix in parallel in your "real" PETSc program with `MatLoad()`. 765*9b92b1d3SBarry Smith 766*9b92b1d3SBarry SmithFor writing save with the binary viewer and then load with the sequential code to store 767*9b92b1d3SBarry Smithit as ASCII. 768*9b92b1d3SBarry Smith::: 769*9b92b1d3SBarry Smith 770*9b92b1d3SBarry Smith### Does TSSetFromOptions(), SNESSetFromOptions(), or KSPSetFromOptions() reset all the parameters I previously set or how come do they not seem to work? 771*9b92b1d3SBarry Smith 772*9b92b1d3SBarry SmithIf `XXSetFromOptions()` is used (with `-xxx_type aaaa`) to change the type of the 773*9b92b1d3SBarry Smithobject then all parameters associated with the previous type are removed. Otherwise it 774*9b92b1d3SBarry Smithdoes not reset parameters. 775*9b92b1d3SBarry Smith 776*9b92b1d3SBarry Smith`TS/SNES/KSPSetXXX()` commands that set properties for a particular type of object (such 777*9b92b1d3SBarry Smithas `KSPGMRESSetRestart()`) ONLY work if the object is ALREADY of that type. For example, 778*9b92b1d3SBarry Smithwith 779*9b92b1d3SBarry Smith 780*9b92b1d3SBarry Smith``` 781*9b92b1d3SBarry SmithKSP ksp; 782*9b92b1d3SBarry Smith 783*9b92b1d3SBarry SmithPetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp)); 784*9b92b1d3SBarry SmithPetscCall(KSPGMRESSetRestart(ksp, 10)); 785*9b92b1d3SBarry Smith``` 786*9b92b1d3SBarry Smith 787*9b92b1d3SBarry Smiththe restart will be ignored since the type has not yet been set to `KSPGMRES`. To have 788*9b92b1d3SBarry Smiththose values take effect you should do one of the following: 789*9b92b1d3SBarry Smith 790*9b92b1d3SBarry Smith- Allow setting the type from the command line, if it is not on the command line then the 791*9b92b1d3SBarry Smith default type is automatically set. 792*9b92b1d3SBarry Smith 793*9b92b1d3SBarry Smith``` 794*9b92b1d3SBarry Smith/* Create generic object */ 795*9b92b1d3SBarry SmithXXXCreate(..,&obj); 796*9b92b1d3SBarry Smith/* Must set all settings here, or default */ 797*9b92b1d3SBarry SmithXXXSetFromOptions(obj); 798*9b92b1d3SBarry Smith``` 799*9b92b1d3SBarry Smith 800*9b92b1d3SBarry Smith- Hardwire the type in the code, but allow the user to override it via a subsequent 801*9b92b1d3SBarry Smith `XXXSetFromOptions()` call. This essentially allows the user to customize what the 802*9b92b1d3SBarry Smith "default" type to of the object. 803*9b92b1d3SBarry Smith 804*9b92b1d3SBarry Smith``` 805*9b92b1d3SBarry Smith/* Create generic object */ 806*9b92b1d3SBarry SmithXXXCreate(..,&obj); 807*9b92b1d3SBarry Smith/* Set type directly */ 808*9b92b1d3SBarry SmithXXXSetYYYYY(obj,...); 809*9b92b1d3SBarry Smith/* Can always change to different type */ 810*9b92b1d3SBarry SmithXXXSetFromOptions(obj); 811*9b92b1d3SBarry Smith``` 812*9b92b1d3SBarry Smith 813*9b92b1d3SBarry Smith### How do I compile and link my own PETSc application codes and can I use my own `makefile` or rules for compiling code, rather than PETSc's? 814*9b92b1d3SBarry Smith 815*9b92b1d3SBarry SmithSee the {ref}`section <sec_writing_application_codes>` of the users manual on writing 816*9b92b1d3SBarry Smithapplication codes with PETSc. 817*9b92b1d3SBarry Smith 818*9b92b1d3SBarry Smith### Can I use CMake to build my own project that depends on PETSc? 819*9b92b1d3SBarry Smith 820*9b92b1d3SBarry SmithSee the {ref}`section <sec_writing_application_codes>` of the users manual on writing 821*9b92b1d3SBarry Smithapplication codes with PETSc. 822*9b92b1d3SBarry Smith 823*9b92b1d3SBarry Smith### How can I put carriage returns in `PetscPrintf()` statements from Fortran? 824*9b92b1d3SBarry Smith 825*9b92b1d3SBarry SmithYou can use the same notation as in C, just put a `\n` in the string. Note that no other C 826*9b92b1d3SBarry Smithformat instruction is supported. 827*9b92b1d3SBarry Smith 828*9b92b1d3SBarry SmithOr you can use the Fortran concatenation `//` and `char(10)`; for example `'some 829*9b92b1d3SBarry Smithstring'//char(10)//'another string` on the next line. 830*9b92b1d3SBarry Smith 831*9b92b1d3SBarry Smith### How can I implement callbacks using C++ class methods? 832*9b92b1d3SBarry Smith 833*9b92b1d3SBarry SmithDeclare the class method static. Static methods do not have a `this` pointer, but the 834*9b92b1d3SBarry Smith`void*` context parameter will usually be cast to a pointer to the class where it can 835*9b92b1d3SBarry Smithserve the same function. 836*9b92b1d3SBarry Smith 837*9b92b1d3SBarry Smith:::{admonition} Remember 838*9b92b1d3SBarry SmithAll PETSc callbacks return `PetscErrorCode`. 839*9b92b1d3SBarry Smith::: 840*9b92b1d3SBarry Smith 841*9b92b1d3SBarry Smith### Everyone knows that when you code Newton's Method you should compute the function and its Jacobian at the same time. How can one do this in PETSc? 842*9b92b1d3SBarry Smith 843*9b92b1d3SBarry SmithThe update in Newton's method is computed as 844*9b92b1d3SBarry Smith 845*9b92b1d3SBarry Smith$$ 846*9b92b1d3SBarry Smithu^{n+1} = u^n - \lambda * \left[J(u^n)] * F(u^n) \right]^{\dagger} 847*9b92b1d3SBarry Smith$$ 848*9b92b1d3SBarry Smith 849*9b92b1d3SBarry SmithThe reason PETSc doesn't default to computing both the function and Jacobian at the same 850*9b92b1d3SBarry Smithtime is: 851*9b92b1d3SBarry Smith 852*9b92b1d3SBarry Smith- In order to do the line search $F \left(u^n - \lambda * \text{step} \right)$ may 853*9b92b1d3SBarry Smith need to be computed for several $\lambda$. The Jacobian is not needed for each of 854*9b92b1d3SBarry Smith those and one does not know in advance which will be the final $\lambda$ until 855*9b92b1d3SBarry Smith after the function value is computed, so many extra Jacobians may be computed. 856*9b92b1d3SBarry Smith- In the final step if $|| F(u^p)||$ satisfies the convergence criteria then a 857*9b92b1d3SBarry Smith Jacobian need not be computed. 858*9b92b1d3SBarry Smith 859*9b92b1d3SBarry SmithYou are free to have your `FormFunction()` compute as much of the Jacobian at that point 860*9b92b1d3SBarry Smithas you like, keep the information in the user context (the final argument to 861*9b92b1d3SBarry Smith`FormFunction()` and `FormJacobian()`) and then retrieve the information in your 862*9b92b1d3SBarry Smith`FormJacobian()` function. 863*9b92b1d3SBarry Smith 864*9b92b1d3SBarry Smith### Computing the Jacobian or preconditioner is time consuming. Is there any way to compute it less often? 865*9b92b1d3SBarry Smith 866*9b92b1d3SBarry SmithPETSc has a variety of ways of lagging the computation of the Jacobian or the 867*9b92b1d3SBarry Smithpreconditioner. They are documented in the manual page for `SNESComputeJacobian()` 868*9b92b1d3SBarry Smithand in the {ref}`users manual <ch_snes>`: 869*9b92b1d3SBarry Smith 870*9b92b1d3SBarry Smith-s 871*9b92b1d3SBarry Smith 872*9b92b1d3SBarry Smithnes_lag_jacobian 873*9b92b1d3SBarry Smith 874*9b92b1d3SBarry Smith(`SNESSetLagJacobian()`) How often Jacobian is rebuilt (use -1 to 875*9b92b1d3SBarry Smithnever rebuild, use -2 to rebuild the next time requested and then 876*9b92b1d3SBarry Smithnever again). 877*9b92b1d3SBarry Smith 878*9b92b1d3SBarry Smith-s 879*9b92b1d3SBarry Smith 880*9b92b1d3SBarry Smithnes_lag_jacobian_persists 881*9b92b1d3SBarry Smith 882*9b92b1d3SBarry Smith(`SNESSetLagJacobianPersists()`) Forces lagging of Jacobian 883*9b92b1d3SBarry Smiththrough multiple `SNES` solves , same as passing -2 to 884*9b92b1d3SBarry Smith`-snes_lag_jacobian`. By default, each new `SNES` solve 885*9b92b1d3SBarry Smithnormally triggers a recomputation of the Jacobian. 886*9b92b1d3SBarry Smith 887*9b92b1d3SBarry Smith-s 888*9b92b1d3SBarry Smith 889*9b92b1d3SBarry Smithnes_lag_preconditioner 890*9b92b1d3SBarry Smith 891*9b92b1d3SBarry Smith(`SNESSetLagPreconditioner()`) how often the preconditioner is 892*9b92b1d3SBarry Smithrebuilt. Note: if you are lagging the Jacobian the system will 893*9b92b1d3SBarry Smithknow that the matrix has not changed and will not recompute the 894*9b92b1d3SBarry Smith(same) preconditioner. 895*9b92b1d3SBarry Smith 896*9b92b1d3SBarry Smith-s 897*9b92b1d3SBarry Smith 898*9b92b1d3SBarry Smithnes_lag_preconditioner_persists 899*9b92b1d3SBarry Smith 900*9b92b1d3SBarry Smith(`SNESSetLagPreconditionerPersists()`) Preconditioner 901*9b92b1d3SBarry Smithlags through multiple `SNES` solves 902*9b92b1d3SBarry Smith 903*9b92b1d3SBarry Smith:::{note} 904*9b92b1d3SBarry SmithThese are often (but does not need to be) used in combination with 905*9b92b1d3SBarry Smith`-snes_mf_operator` which applies the fresh Jacobian matrix-free for every 906*9b92b1d3SBarry Smithmatrix-vector product. Otherwise the out-of-date matrix vector product, computed with 907*9b92b1d3SBarry Smiththe lagged Jacobian will be used. 908*9b92b1d3SBarry Smith::: 909*9b92b1d3SBarry Smith 910*9b92b1d3SBarry SmithBy using `KSPMonitorSet()` and/or `SNESMonitorSet()` one can provide code that monitors the 911*9b92b1d3SBarry Smithconvergence rate and automatically triggers an update of the Jacobian or preconditioner 912*9b92b1d3SBarry Smithbased on decreasing convergence of the iterative method. For example if the number of `SNES` 913*9b92b1d3SBarry Smithiterations doubles one might trigger a new computation of the Jacobian. Experimentation is 914*9b92b1d3SBarry Smiththe only general purpose way to determine which approach is best for your problem. 915*9b92b1d3SBarry Smith 916*9b92b1d3SBarry Smith:::{important} 917*9b92b1d3SBarry SmithIt is also vital to experiment on your true problem at the scale you will be solving 918*9b92b1d3SBarry Smiththe problem since the performance benefits depend on the exact problem and the problem 919*9b92b1d3SBarry Smithsize! 920*9b92b1d3SBarry Smith::: 921*9b92b1d3SBarry Smith 922*9b92b1d3SBarry Smith### How can I use Newton's Method Jacobian free? Can I difference a different function than provided with `SNESSetFunction()`? 923*9b92b1d3SBarry Smith 924*9b92b1d3SBarry SmithThe simplest way is with the option `-snes_mf`, this will use finite differencing of the 925*9b92b1d3SBarry Smithfunction provided to `SNESComputeFunction()` to approximate the action of Jacobian. 926*9b92b1d3SBarry Smith 927*9b92b1d3SBarry Smith:::{important} 928*9b92b1d3SBarry SmithSince no matrix-representation of the Jacobian is provided the `-pc_type` used with 929*9b92b1d3SBarry Smiththis option must be `-pc_type none`. You may provide a custom preconditioner with 930*9b92b1d3SBarry Smith`SNESGetKSP()`, `KSPGetPC()`, and `PCSetType()` and use `PCSHELL`. 931*9b92b1d3SBarry Smith::: 932*9b92b1d3SBarry Smith 933*9b92b1d3SBarry SmithThe option `-snes_mf_operator` will use Jacobian free to apply the Jacobian (in the 934*9b92b1d3SBarry SmithKrylov solvers) but will use whatever matrix you provided with `SNESSetJacobian()` 935*9b92b1d3SBarry Smith(assuming you set one) to compute the preconditioner. 936*9b92b1d3SBarry Smith 937*9b92b1d3SBarry SmithTo write the code (rather than use the options above) use `MatCreateSNESMF()` and pass 938*9b92b1d3SBarry Smiththe resulting matrix object to `SNESSetJacobian()`. 939*9b92b1d3SBarry Smith 940*9b92b1d3SBarry SmithFor purely matrix-free (like `-snes_mf`) pass the matrix object for both matrix 941*9b92b1d3SBarry Smitharguments and pass the function `MatMFFDComputeJacobian()`. 942*9b92b1d3SBarry Smith 943*9b92b1d3SBarry SmithTo provide your own approximate Jacobian matrix to compute the preconditioner (like 944*9b92b1d3SBarry Smith`-snes_mf_operator`), pass this other matrix as the second matrix argument to 945*9b92b1d3SBarry Smith`SNESSetJacobian()`. Make sure your provided `computejacobian()` function calls 946*9b92b1d3SBarry Smith`MatAssemblyBegin()` and `MatAssemblyEnd()` separately on **BOTH** matrix arguments 947*9b92b1d3SBarry Smithto this function. See `src/snes/tests/ex7.c`. 948*9b92b1d3SBarry Smith 949*9b92b1d3SBarry SmithTo difference a different function than that passed to `SNESSetJacobian()` to compute the 950*9b92b1d3SBarry Smithmatrix-free Jacobian multiply call `MatMFFDSetFunction()` to set that other function. See 951*9b92b1d3SBarry Smith`src/snes/tests/ex7.c.h`. 952*9b92b1d3SBarry Smith 953*9b92b1d3SBarry Smith(doc_faq_usage_condnum)= 954*9b92b1d3SBarry Smith 955*9b92b1d3SBarry Smith### How can I determine the condition number of a matrix? 956*9b92b1d3SBarry Smith 957*9b92b1d3SBarry SmithFor small matrices, the condition number can be reliably computed using 958*9b92b1d3SBarry Smith 959*9b92b1d3SBarry Smith```text 960*9b92b1d3SBarry Smith-pc_type svd -pc_svd_monitor 961*9b92b1d3SBarry Smith``` 962*9b92b1d3SBarry Smith 963*9b92b1d3SBarry SmithFor larger matrices, you can run with 964*9b92b1d3SBarry Smith 965*9b92b1d3SBarry Smith```text 966*9b92b1d3SBarry Smith-pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000 967*9b92b1d3SBarry Smith``` 968*9b92b1d3SBarry Smith 969*9b92b1d3SBarry Smithto get approximations to the condition number of the operator. This will generally be 970*9b92b1d3SBarry Smithaccurate for the largest singular values, but may overestimate the smallest singular value 971*9b92b1d3SBarry Smithunless the method has converged. Make sure to avoid restarts. To estimate the condition 972*9b92b1d3SBarry Smithnumber of the preconditioned operator, use `-pc_type somepc` in the last command. 973*9b92b1d3SBarry Smith 974*9b92b1d3SBarry SmithYou can use [SLEPc](https://slepc.upv.es) for highly scalable, efficient, and quality eigenvalue computations. 975*9b92b1d3SBarry Smith 976*9b92b1d3SBarry Smith### How can I compute the inverse of a matrix in PETSc? 977*9b92b1d3SBarry Smith 978*9b92b1d3SBarry Smith:::{admonition} Are you sure? 979*9b92b1d3SBarry Smith:class: yellow 980*9b92b1d3SBarry Smith 981*9b92b1d3SBarry SmithIt is very expensive to compute the inverse of a matrix and very rarely needed in 982*9b92b1d3SBarry Smithpractice. We highly recommend avoiding algorithms that need it. 983*9b92b1d3SBarry Smith::: 984*9b92b1d3SBarry Smith 985*9b92b1d3SBarry SmithThe inverse of a matrix (dense or sparse) is essentially always dense, so begin by 986*9b92b1d3SBarry Smithcreating a dense matrix B and fill it with the identity matrix (ones along the diagonal), 987*9b92b1d3SBarry Smithalso create a dense matrix X of the same size that will hold the solution. Then factor the 988*9b92b1d3SBarry Smithmatrix you wish to invert with `MatLUFactor()` or `MatCholeskyFactor()`, call the 989*9b92b1d3SBarry Smithresult A. Then call `MatMatSolve(A,B,X)` to compute the inverse into X. See also section 990*9b92b1d3SBarry Smithon {any}`Schur's complement <how_can_i_compute_the_schur_complement>`. 991*9b92b1d3SBarry Smith 992*9b92b1d3SBarry Smith(how_can_i_compute_the_schur_complement)= 993*9b92b1d3SBarry Smith 994*9b92b1d3SBarry Smith### How can I compute the Schur complement in PETSc? 995*9b92b1d3SBarry Smith 996*9b92b1d3SBarry Smith:::{admonition} Are you sure? 997*9b92b1d3SBarry Smith:class: yellow 998*9b92b1d3SBarry Smith 999*9b92b1d3SBarry SmithIt is very expensive to compute the Schur complement of a matrix and very rarely needed 1000*9b92b1d3SBarry Smithin practice. We highly recommend avoiding algorithms that need it. 1001*9b92b1d3SBarry Smith::: 1002*9b92b1d3SBarry Smith 1003*9b92b1d3SBarry SmithThe Schur complement of the matrix $M \in \mathbb{R}^{\left(p+q \right) \times 1004*9b92b1d3SBarry Smith\left(p + q \right)}$ 1005*9b92b1d3SBarry Smith 1006*9b92b1d3SBarry Smith$$ 1007*9b92b1d3SBarry SmithM = \begin{bmatrix} 1008*9b92b1d3SBarry SmithA & B \\ 1009*9b92b1d3SBarry SmithC & D 1010*9b92b1d3SBarry Smith\end{bmatrix} 1011*9b92b1d3SBarry Smith$$ 1012*9b92b1d3SBarry Smith 1013*9b92b1d3SBarry Smithwhere 1014*9b92b1d3SBarry Smith 1015*9b92b1d3SBarry Smith$$ 1016*9b92b1d3SBarry SmithA \in \mathbb{R}^{p \times p}, \quad B \in \mathbb{R}^{p \times q}, \quad C \in \mathbb{R}^{q \times p}, \quad D \in \mathbb{R}^{q \times q} 1017*9b92b1d3SBarry Smith$$ 1018*9b92b1d3SBarry Smith 1019*9b92b1d3SBarry Smithis given by 1020*9b92b1d3SBarry Smith 1021*9b92b1d3SBarry Smith$$ 1022*9b92b1d3SBarry SmithS_D := A - BD^{-1}C \\ 1023*9b92b1d3SBarry SmithS_A := D - CA^{-1}B 1024*9b92b1d3SBarry Smith$$ 1025*9b92b1d3SBarry Smith 1026*9b92b1d3SBarry SmithLike the inverse, the Schur complement of a matrix (dense or sparse) is essentially always 1027*9b92b1d3SBarry Smithdense, so assuming you wish to calculate $S_A = D - C \underbrace{ 1028*9b92b1d3SBarry Smith\overbrace{(A^{-1})}^{U} B}_{V}$ begin by: 1029*9b92b1d3SBarry Smith 1030*9b92b1d3SBarry Smith1. Forming a dense matrix $B$ 1031*9b92b1d3SBarry Smith2. Also create another dense matrix $V$ of the same size. 1032*9b92b1d3SBarry Smith3. Then either factor the matrix $A$ directly with `MatLUFactor()` or 1033*9b92b1d3SBarry Smith `MatCholeskyFactor()`, or use `MatGetFactor()` followed by 1034*9b92b1d3SBarry Smith `MatLUFactorSymbolic()` followed by `MatLUFactorNumeric()` if you wish to use and 1035*9b92b1d3SBarry Smith external solver package like SuperLU_Dist. Call the result $U$. 1036*9b92b1d3SBarry Smith4. Then call `MatMatSolve(U,B,V)`. 1037*9b92b1d3SBarry Smith5. Then call `MatMatMult(C,V,MAT_INITIAL_MATRIX,1.0,&S)`. 1038*9b92b1d3SBarry Smith6. Now call `MatAXPY(S,-1.0,D,MAT_SUBSET_NONZERO)`. 1039*9b92b1d3SBarry Smith7. Followed by `MatScale(S,-1.0)`. 1040*9b92b1d3SBarry Smith 1041*9b92b1d3SBarry SmithFor computing Schur complements like this it does not make sense to use the `KSP` 1042*9b92b1d3SBarry Smithiterative solvers since for solving many moderate size problems using a direct 1043*9b92b1d3SBarry Smithfactorization is much faster than iterative solvers. As you can see, this requires a great 1044*9b92b1d3SBarry Smithdeal of work space and computation so is best avoided. 1045*9b92b1d3SBarry Smith 1046*9b92b1d3SBarry SmithHowever, it is not necessary to assemble the Schur complement $S$ in order to solve 1047*9b92b1d3SBarry Smithsystems with it. Use `MatCreateSchurComplement(A,A_pre,B,C,D,&S)` to create a 1048*9b92b1d3SBarry Smithmatrix that applies the action of $S$ (using `A_pre` to solve with `A`), but 1049*9b92b1d3SBarry Smithdoes not assemble. 1050*9b92b1d3SBarry Smith 1051*9b92b1d3SBarry SmithAlternatively, if you already have a block matrix `M = [A, B; C, D]` (in some 1052*9b92b1d3SBarry Smithordering), then you can create index sets (`IS`) `isa` and `isb` to address each 1053*9b92b1d3SBarry Smithblock, then use `MatGetSchurComplement()` to create the Schur complement and/or an 1054*9b92b1d3SBarry Smithapproximation suitable for preconditioning. 1055*9b92b1d3SBarry Smith 1056*9b92b1d3SBarry SmithSince $S$ is generally dense, standard preconditioning methods cannot typically be 1057*9b92b1d3SBarry Smithapplied directly to Schur complements. There are many approaches to preconditioning Schur 1058*9b92b1d3SBarry Smithcomplements including using the `SIMPLE` approximation 1059*9b92b1d3SBarry Smith 1060*9b92b1d3SBarry Smith$$ 1061*9b92b1d3SBarry SmithD - C \text{diag}(A)^{-1} B 1062*9b92b1d3SBarry Smith$$ 1063*9b92b1d3SBarry Smith 1064*9b92b1d3SBarry Smithto create a sparse matrix that approximates the Schur complement (this is returned by 1065*9b92b1d3SBarry Smithdefault for the optional "preconditioning" matrix in `MatGetSchurComplement()`). 1066*9b92b1d3SBarry Smith 1067*9b92b1d3SBarry SmithAn alternative is to interpret the matrices as differential operators and apply 1068*9b92b1d3SBarry Smithapproximate commutator arguments to find a spectrally equivalent operation that can be 1069*9b92b1d3SBarry Smithapplied efficiently (see the "PCD" preconditioners {cite}`elman_silvester_wathen_2014`). A 1070*9b92b1d3SBarry Smithvariant of this is the least squares commutator, which is closely related to the 1071*9b92b1d3SBarry SmithMoore-Penrose pseudoinverse, and is available in `PCLSC` which operates on matrices of 1072*9b92b1d3SBarry Smithtype `MATSCHURCOMPLEMENT`. 1073*9b92b1d3SBarry Smith 1074*9b92b1d3SBarry Smith### Do you have examples of doing unstructured grid Finite Element Method (FEM) with PETSc? 1075*9b92b1d3SBarry Smith 1076*9b92b1d3SBarry SmithThere are at least three ways to write finite element codes using PETSc: 1077*9b92b1d3SBarry Smith 1078*9b92b1d3SBarry Smith1. Use `DMPLEX`, which is a high level approach to manage your mesh and 1079*9b92b1d3SBarry Smith discretization. See the {ref}`tutorials sections <tut_stokes>` for further information, 1080*9b92b1d3SBarry Smith or see `src/snes/tutorial/ex62.c`. 1081*9b92b1d3SBarry Smith2. Use packages such as [deal.ii](https://www.dealii.org), [libMesh](https://libmesh.github.io), or 1082*9b92b1d3SBarry Smith [Firedrake](https://www.firedrakeproject.org), which use PETSc for the solvers. 1083*9b92b1d3SBarry Smith3. Manage the grid data structure yourself and use PETSc `PetscSF`, `IS`, and `VecScatter` to 1084*9b92b1d3SBarry Smith communicate the required ghost point communication. See 1085*9b92b1d3SBarry Smith `src/snes/tutorials/ex10d/ex10.c`. 1086*9b92b1d3SBarry Smith 1087*9b92b1d3SBarry Smith### DMDA decomposes the domain differently than the MPI_Cart_create() command. How can one use them together? 1088*9b92b1d3SBarry Smith 1089*9b92b1d3SBarry SmithThe `MPI_Cart_create()` first divides the mesh along the z direction, then the y, then 1090*9b92b1d3SBarry Smiththe x. `DMDA` divides along the x, then y, then z. Thus, for example, rank 1 of the 1091*9b92b1d3SBarry Smithprocesses will be in a different part of the mesh for the two schemes. To resolve this you 1092*9b92b1d3SBarry Smithcan create a new MPI communicator that you pass to `DMDACreate()` that renumbers the 1093*9b92b1d3SBarry Smithprocess ranks so that each physical process shares the same part of the mesh with both the 1094*9b92b1d3SBarry Smith`DMDA` and the `MPI_Cart_create()`. The code to determine the new numbering was 1095*9b92b1d3SBarry Smithprovided by Rolf Kuiper: 1096*9b92b1d3SBarry Smith 1097*9b92b1d3SBarry Smith``` 1098*9b92b1d3SBarry Smith// the numbers of processors per direction are (int) x_procs, y_procs, z_procs respectively 1099*9b92b1d3SBarry Smith// (no parallelization in direction 'dir' means dir_procs = 1) 1100*9b92b1d3SBarry Smith 1101*9b92b1d3SBarry SmithMPI_Comm NewComm; 1102*9b92b1d3SBarry Smithint x, y, z; 1103*9b92b1d3SBarry SmithPetscMPIInt MPI_Rank, NewRank; 1104*9b92b1d3SBarry Smith 1105*9b92b1d3SBarry Smith// get rank from MPI ordering: 1106*9b92b1d3SBarry SmithPetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &MPI_Rank)); 1107*9b92b1d3SBarry Smith 1108*9b92b1d3SBarry Smith// calculate coordinates of cpus in MPI ordering: 1109*9b92b1d3SBarry Smithx = MPI_rank / (z_procs*y_procs); 1110*9b92b1d3SBarry Smithy = (MPI_rank % (z_procs*y_procs)) / z_procs; 1111*9b92b1d3SBarry Smithz = (MPI_rank % (z_procs*y_procs)) % z_procs; 1112*9b92b1d3SBarry Smith 1113*9b92b1d3SBarry Smith// set new rank according to PETSc ordering: 1114*9b92b1d3SBarry SmithNewRank = z*y_procs*x_procs + y*x_procs + x; 1115*9b92b1d3SBarry Smith 1116*9b92b1d3SBarry Smith// create communicator with new ranks according to PETSc ordering 1117*9b92b1d3SBarry SmithPetscCallMPI(MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &NewComm)); 1118*9b92b1d3SBarry Smith 1119*9b92b1d3SBarry Smith// override the default communicator (was MPI_COMM_WORLD as default) 1120*9b92b1d3SBarry SmithPETSC_COMM_WORLD = NewComm; 1121*9b92b1d3SBarry Smith``` 1122*9b92b1d3SBarry Smith 1123*9b92b1d3SBarry Smith### When solving a system with Dirichlet boundary conditions I can use MatZeroRows() to eliminate the Dirichlet rows but this results in a non-Symmetric system. How can I apply Dirichlet boundary conditions but keep the matrix symmetric? 1124*9b92b1d3SBarry Smith 1125*9b92b1d3SBarry Smith- For nonsymmetric systems put the appropriate boundary solutions in the x vector and use 1126*9b92b1d3SBarry Smith `MatZeroRows()` followed by `KSPSetOperators()`. 1127*9b92b1d3SBarry Smith- For symmetric problems use `MatZeroRowsColumns()`. 1128*9b92b1d3SBarry Smith- If you have many Dirichlet locations you can use `MatZeroRows()` (**not** 1129*9b92b1d3SBarry Smith `MatZeroRowsColumns()`) and `-ksp_type preonly -pc_type redistribute` (see 1130*9b92b1d3SBarry Smith `PCREDISTRIBUTE`) and PETSc will repartition the parallel matrix for load 1131*9b92b1d3SBarry Smith balancing. In this case the new matrix solved remains symmetric even though 1132*9b92b1d3SBarry Smith `MatZeroRows()` is used. 1133*9b92b1d3SBarry Smith 1134*9b92b1d3SBarry SmithAn alternative approach is, when assembling the matrix (generating values and passing 1135*9b92b1d3SBarry Smiththem to the matrix), never to include locations for the Dirichlet grid points in the 1136*9b92b1d3SBarry Smithvector and matrix, instead taking them into account as you put the other values into the 1137*9b92b1d3SBarry Smithload. 1138*9b92b1d3SBarry Smith 1139*9b92b1d3SBarry Smith### How can I get PETSc vectors and matrices to MATLAB or vice versa? 1140*9b92b1d3SBarry Smith 1141*9b92b1d3SBarry SmithThere are numerous ways to work with PETSc and MATLAB. All but the first approach 1142*9b92b1d3SBarry Smithrequire PETSc to be configured with --with-matlab. 1143*9b92b1d3SBarry Smith 1144*9b92b1d3SBarry Smith1. To save PETSc `Mat` and `Vec` to files that can be read from MATLAB use 1145*9b92b1d3SBarry Smith `PetscViewerBinaryOpen()` viewer and `VecView()` or `MatView()` to save objects 1146*9b92b1d3SBarry Smith for MATLAB and `VecLoad()` and `MatLoad()` to get the objects that MATLAB has 1147*9b92b1d3SBarry Smith saved. See `share/petsc/matlab/PetscBinaryRead.m` and 1148*9b92b1d3SBarry Smith `share/petsc/matlab/PetscBinaryWrite.m` for loading and saving the objects in MATLAB. 1149*9b92b1d3SBarry Smith2. Using the [MATLAB Engine](https://www.mathworks.com/help/matlab/calling-matlab-engine-from-c-programs-1.html), 1150*9b92b1d3SBarry Smith allows PETSc to automatically call MATLAB to perform some specific computations. This 1151*9b92b1d3SBarry Smith does not allow MATLAB to be used interactively by the user. See the 1152*9b92b1d3SBarry Smith `PetscMatlabEngine`. 1153*9b92b1d3SBarry Smith3. You can open a socket connection between MATLAB and PETSc to allow sending objects back 1154*9b92b1d3SBarry Smith and forth between an interactive MATLAB session and a running PETSc program. See 1155*9b92b1d3SBarry Smith `PetscViewerSocketOpen()` for access from the PETSc side and 1156*9b92b1d3SBarry Smith `share/petsc/matlab/PetscReadBinary.m` for access from the MATLAB side. 1157*9b92b1d3SBarry Smith4. You can save PETSc `Vec` (**not** `Mat`) with the `PetscViewerMatlabOpen()` 1158*9b92b1d3SBarry Smith viewer that saves `.mat` files can then be loaded into MATLAB using the `load()` command 1159*9b92b1d3SBarry Smith 1160*9b92b1d3SBarry Smith### How do I get started with Cython so that I can extend petsc4py? 1161*9b92b1d3SBarry Smith 1162*9b92b1d3SBarry Smith1. Learn how to [build a Cython module](http://docs.cython.org/src/quickstart/build.html). 1163*9b92b1d3SBarry Smith2. Go through the simple [example](https://stackoverflow.com/questions/3046305/simple-wrapping-of-c-code-with-cython). Note 1164*9b92b1d3SBarry Smith also the next comment that shows how to create numpy arrays in the Cython and pass them 1165*9b92b1d3SBarry Smith back. 1166*9b92b1d3SBarry Smith3. Check out [this page](http://docs.cython.org/src/tutorial/numpy.html) which tells 1167*9b92b1d3SBarry Smith you how to get fast indexing. 1168*9b92b1d3SBarry Smith4. Have a look at the petsc4py [array source](http://code.google.com/p/petsc4py/source/browse/src/PETSc/arraynpy.pxi). 1169*9b92b1d3SBarry Smith 1170*9b92b1d3SBarry Smith### How do I compute a custom norm for KSP to use as a convergence test or for monitoring? 1171*9b92b1d3SBarry Smith 1172*9b92b1d3SBarry SmithYou need to call `KSPBuildResidual()` on your `KSP` object and then compute the 1173*9b92b1d3SBarry Smithappropriate norm on the resulting residual. Note that depending on the 1174*9b92b1d3SBarry Smith`KSPSetNormType()` of the method you may not return the same norm as provided by the 1175*9b92b1d3SBarry Smithmethod. See also `KSPSetPCSide()`. 1176*9b92b1d3SBarry Smith 1177*9b92b1d3SBarry Smith### If I have a sequential program can I use a PETSc parallel solver? 1178*9b92b1d3SBarry Smith 1179*9b92b1d3SBarry Smith:::{important} 1180*9b92b1d3SBarry SmithDo not expect to get great speedups! Much of the speedup gained by using parallel 1181*9b92b1d3SBarry Smithsolvers comes from building the underlying matrices and vectors in parallel to begin 1182*9b92b1d3SBarry Smithwith. You should see some reduction in the time for the linear solvers. 1183*9b92b1d3SBarry Smith::: 1184*9b92b1d3SBarry Smith 1185*9b92b1d3SBarry SmithYes, you must set up PETSc with MPI (even though you will not use MPI) with at least the 1186*9b92b1d3SBarry Smithfollowing options: 1187*9b92b1d3SBarry Smith 1188*9b92b1d3SBarry Smith```console 1189*9b92b1d3SBarry Smith$ ./configure --download-superlu_dist --download-parmetis --download-metis --with-openmp 1190*9b92b1d3SBarry Smith``` 1191*9b92b1d3SBarry Smith 1192*9b92b1d3SBarry SmithYour compiler must support OpenMP. To have the linear solver run in parallel run your 1193*9b92b1d3SBarry Smithprogram with 1194*9b92b1d3SBarry Smith 1195*9b92b1d3SBarry Smith```console 1196*9b92b1d3SBarry Smith$ OMP_NUM_THREADS=n ./myprog -pc_type lu -pc_factor_mat_solver superlu_dist 1197*9b92b1d3SBarry Smith``` 1198*9b92b1d3SBarry Smith 1199*9b92b1d3SBarry Smithwhere `n` is the number of threads and should be less than or equal to the number of cores 1200*9b92b1d3SBarry Smithavailable. 1201*9b92b1d3SBarry Smith 1202*9b92b1d3SBarry Smith:::{note} 1203*9b92b1d3SBarry SmithIf your code is MPI parallel you can also use these same options to have SuperLU_dist 1204*9b92b1d3SBarry Smithutilize multiple threads per MPI process for the direct solver. Make sure that the 1205*9b92b1d3SBarry Smith`$OMP_NUM_THREADS` you use per MPI process is less than or equal to the number of 1206*9b92b1d3SBarry Smithcores available for each MPI process. For example if your compute nodes have 6 cores 1207*9b92b1d3SBarry Smithand you use 2 MPI processes per node then set `$OMP_NUM_THREADS` to 2 or 3. 1208*9b92b1d3SBarry Smith::: 1209*9b92b1d3SBarry Smith 1210*9b92b1d3SBarry SmithAnother approach that allows using a PETSc parallel solver is to use `PCMPI`. 1211*9b92b1d3SBarry Smith 1212*9b92b1d3SBarry Smith### TS or SNES produces infeasible (out of domain) solutions or states. How can I prevent this? 1213*9b92b1d3SBarry Smith 1214*9b92b1d3SBarry SmithFor `TS` call `TSSetFunctionDomainError()`. For both `TS` and `SNES` call 1215*9b92b1d3SBarry Smith`SNESSetFunctionDomainError()` when the solver passes an infeasible (out of domain) 1216*9b92b1d3SBarry Smithsolution or state to your routines. 1217*9b92b1d3SBarry Smith 1218*9b92b1d3SBarry SmithIf it occurs for DAEs, it is important to insure the algebraic constraints are well 1219*9b92b1d3SBarry Smithsatisfied, which can prevent "breakdown" later. Thus, one can try using a tight tolerance 1220*9b92b1d3SBarry Smithfor `SNES`, using a direct linear solver (`PCType` of `PCLU`) when possible, and reducing the timestep (or 1221*9b92b1d3SBarry Smithtightening `TS` tolerances for adaptive time stepping). 1222*9b92b1d3SBarry Smith 1223*9b92b1d3SBarry Smith### Can PETSc work with Hermitian matrices? 1224*9b92b1d3SBarry Smith 1225*9b92b1d3SBarry SmithPETSc's support of Hermitian matrices is limited. Many operations and solvers work 1226*9b92b1d3SBarry Smithfor symmetric (`MATSBAIJ`) matrices and operations on transpose matrices but there is 1227*9b92b1d3SBarry Smithlittle direct support for Hermitian matrices and Hermitian transpose (complex conjugate 1228*9b92b1d3SBarry Smithtranspose) operations. There is `KSPSolveTranspose()` for solving the transpose of a 1229*9b92b1d3SBarry Smithlinear system but no `KSPSolveHermitian()`. 1230*9b92b1d3SBarry Smith 1231*9b92b1d3SBarry SmithFor creating known Hermitian matrices: 1232*9b92b1d3SBarry Smith 1233*9b92b1d3SBarry Smith- `MatCreateNormalHermitian()` 1234*9b92b1d3SBarry Smith- `MatCreateHermitianTranspose()` 1235*9b92b1d3SBarry Smith 1236*9b92b1d3SBarry SmithFor determining or setting Hermitian status on existing matrices: 1237*9b92b1d3SBarry Smith 1238*9b92b1d3SBarry Smith- `MatIsHermitian()` 1239*9b92b1d3SBarry Smith- `MatIsHermitianKnown()` 1240*9b92b1d3SBarry Smith- `MatIsStructurallySymmetric()` 1241*9b92b1d3SBarry Smith- `MatIsSymmetricKnown()` 1242*9b92b1d3SBarry Smith- `MatIsSymmetric()` 1243*9b92b1d3SBarry Smith- `MatSetOption()` (use with `MAT_SYMMETRIC` or `MAT_HERMITIAN` to assert to PETSc 1244*9b92b1d3SBarry Smith that either is the case). 1245*9b92b1d3SBarry Smith 1246*9b92b1d3SBarry SmithFor performing matrix operations on known Hermitian matrices (note that regular `Mat` 1247*9b92b1d3SBarry Smithfunctions such as `MatMult()` will of course also work on Hermitian matrices): 1248*9b92b1d3SBarry Smith 1249*9b92b1d3SBarry Smith- `MatMultHermitianTranspose()` 1250*9b92b1d3SBarry Smith- `MatMultHermitianTransposeAdd()` (very limited support) 1251*9b92b1d3SBarry Smith 1252*9b92b1d3SBarry Smith### How can I assemble a bunch of similar matrices? 1253*9b92b1d3SBarry Smith 1254*9b92b1d3SBarry SmithYou can first add the values common to all the matrices, then use `MatStoreValues()` to 1255*9b92b1d3SBarry Smithstash the common values. Each iteration you call `MatRetrieveValues()`, then set the 1256*9b92b1d3SBarry Smithunique values in a matrix and assemble. 1257*9b92b1d3SBarry Smith 1258*9b92b1d3SBarry Smith### Can one resize or change the size of PETSc matrices or vectors? 1259*9b92b1d3SBarry Smith 1260*9b92b1d3SBarry SmithNo, once the vector or matrices sizes have been set and the matrices or vectors are fully 1261*9b92b1d3SBarry Smithusable one cannot change the size of the matrices or vectors or number of processors they 1262*9b92b1d3SBarry Smithlive on. One may create new vectors and copy, for example using `VecScatterCreate()`, 1263*9b92b1d3SBarry Smiththe old values from the previous vector. 1264*9b92b1d3SBarry Smith 1265*9b92b1d3SBarry Smith### How can one compute the nullspace of a sparse matrix with MUMPS? 1266*9b92b1d3SBarry Smith 1267*9b92b1d3SBarry SmithAssuming you have an existing matrix $A$ whose nullspace $V$ you want to find: 1268*9b92b1d3SBarry Smith 1269*9b92b1d3SBarry Smith``` 1270*9b92b1d3SBarry SmithMat F, work, V; 1271*9b92b1d3SBarry SmithPetscInt N, rows; 1272*9b92b1d3SBarry Smith 1273*9b92b1d3SBarry Smith/* Determine factorability */ 1274*9b92b1d3SBarry SmithPetscCall(MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F)); 1275*9b92b1d3SBarry SmithPetscCall(MatGetLocalSize(A, &rows, NULL)); 1276*9b92b1d3SBarry Smith 1277*9b92b1d3SBarry Smith/* Set MUMPS options, see MUMPS documentation for more information */ 1278*9b92b1d3SBarry SmithPetscCall(MatMumpsSetIcntl(F, 24, 1)); 1279*9b92b1d3SBarry SmithPetscCall(MatMumpsSetIcntl(F, 25, 1)); 1280*9b92b1d3SBarry Smith 1281*9b92b1d3SBarry Smith/* Perform factorization */ 1282*9b92b1d3SBarry SmithPetscCall(MatLUFactorSymbolic(F, A, NULL, NULL, NULL)); 1283*9b92b1d3SBarry SmithPetscCall(MatLUFactorNumeric(F, A, NULL)); 1284*9b92b1d3SBarry Smith 1285*9b92b1d3SBarry Smith/* This is the dimension of the null space */ 1286*9b92b1d3SBarry SmithPetscCall(MatMumpsGetInfog(F, 28, &N)); 1287*9b92b1d3SBarry Smith 1288*9b92b1d3SBarry Smith/* This will contain the null space in the columns */ 1289*9b92b1d3SBarry SmithPetscCall(MatCreateDense(comm, rows, N, PETSC_DETERMINE, PETSC_DETERMINE, NULL, &V)); 1290*9b92b1d3SBarry Smith 1291*9b92b1d3SBarry SmithPetscCall(MatDuplicate(V, MAT_DO_NOT_COPY_VALUES, &work)); 1292*9b92b1d3SBarry SmithPetscCall(MatMatSolve(F, work, V)); 1293*9b92b1d3SBarry Smith``` 1294*9b92b1d3SBarry Smith 1295*9b92b1d3SBarry Smith______________________________________________________________________ 1296*9b92b1d3SBarry Smith 1297*9b92b1d3SBarry Smith## Execution 1298*9b92b1d3SBarry Smith 1299*9b92b1d3SBarry Smith### PETSc executables are SO big and take SO long to link 1300*9b92b1d3SBarry Smith 1301*9b92b1d3SBarry Smith:::{note} 1302*9b92b1d3SBarry SmithSee {ref}`shared libraries section <doc_faq_sharedlibs>` for more information. 1303*9b92b1d3SBarry Smith::: 1304*9b92b1d3SBarry Smith 1305*9b92b1d3SBarry SmithWe find this annoying as well. On most machines PETSc can use shared libraries, so 1306*9b92b1d3SBarry Smithexecutables should be much smaller, run `configure` with the additional option 1307*9b92b1d3SBarry Smith`--with-shared-libraries` (this is the default). Also, if you have room, compiling and 1308*9b92b1d3SBarry Smithlinking PETSc on your machine's `/tmp` disk or similar local disk, rather than over the 1309*9b92b1d3SBarry Smithnetwork will be much faster. 1310*9b92b1d3SBarry Smith 1311*9b92b1d3SBarry Smith### How does PETSc's `-help` option work? Why is it different for different programs? 1312*9b92b1d3SBarry Smith 1313*9b92b1d3SBarry SmithThere are 2 ways in which one interacts with the options database: 1314*9b92b1d3SBarry Smith 1315*9b92b1d3SBarry Smith- `PetscOptionsGetXXX()` where `XXX` is some type or data structure (for example 1316*9b92b1d3SBarry Smith `PetscOptionsGetBool()` or `PetscOptionsGetScalarArray()`). This is a classic 1317*9b92b1d3SBarry Smith "getter" function, which queries the command line options for a matching option name, 1318*9b92b1d3SBarry Smith and returns the specified value. 1319*9b92b1d3SBarry Smith- `PetscOptionsXXX()` where `XXX` is some type or data structure (for example 1320*9b92b1d3SBarry Smith `PetscOptionsBool()` or `PetscOptionsScalarArray()`). This is a so-called "provider" 1321*9b92b1d3SBarry Smith function. It first records the option name in an internal list of previously encountered 1322*9b92b1d3SBarry Smith options before calling `PetscOptionsGetXXX()` to query the status of said option. 1323*9b92b1d3SBarry Smith 1324*9b92b1d3SBarry SmithWhile users generally use the first option, developers will *always* use the second 1325*9b92b1d3SBarry Smith(provider) variant of functions. Thus, as the program runs, it will build up a list of 1326*9b92b1d3SBarry Smithencountered option names which are then printed **in the order of their appearance on the 1327*9b92b1d3SBarry Smithroot rank**. Different programs may take different paths through PETSc source code, so 1328*9b92b1d3SBarry Smiththey will encounter different providers, and therefore have different `-help` output. 1329*9b92b1d3SBarry Smith 1330*9b92b1d3SBarry Smith### PETSc has so many options for my program that it is hard to keep them straight 1331*9b92b1d3SBarry Smith 1332*9b92b1d3SBarry SmithRunning the PETSc program with the option `-help` will print out many of the options. To 1333*9b92b1d3SBarry Smithprint the options that have been specified within a program, employ `-options_left` to 1334*9b92b1d3SBarry Smithprint any options that the user specified but were not actually used by the program and 1335*9b92b1d3SBarry Smithall options used; this is helpful for detecting typo errors. The PETSc website has a search option, 1336*9b92b1d3SBarry Smithin the upper right hand corner, that quickly finds answers to most PETSc questions. 1337*9b92b1d3SBarry Smith 1338*9b92b1d3SBarry Smith### PETSc automatically handles many of the details in parallel PDE solvers. How can I understand what is really happening within my program? 1339*9b92b1d3SBarry Smith 1340*9b92b1d3SBarry SmithYou can use the option `-info` to get more details about the solution process. The 1341*9b92b1d3SBarry Smithoption `-log_view` provides details about the distribution of time spent in the various 1342*9b92b1d3SBarry Smithphases of the solution process. You can run with `-ts_view` or `-snes_view` or 1343*9b92b1d3SBarry Smith`-ksp_view` to see what solver options are being used. Run with `-ts_monitor`, 1344*9b92b1d3SBarry Smith`-snes_monitor`, or `-ksp_monitor` to watch convergence of the 1345*9b92b1d3SBarry Smithmethods. `-snes_converged_reason` and `-ksp_converged_reason` will indicate why and if 1346*9b92b1d3SBarry Smiththe solvers have converged. 1347*9b92b1d3SBarry Smith 1348*9b92b1d3SBarry Smith### Assembling large sparse matrices takes a long time. What can I do to make this process faster? Or MatSetValues() is so slow; what can I do to speed it up? 1349*9b92b1d3SBarry Smith 1350*9b92b1d3SBarry SmithYou probably need to do preallocation, as explained in {any}`sec_matsparse`. 1351*9b92b1d3SBarry SmithSee also the {ref}`performance chapter <ch_performance>` of the users manual. 1352*9b92b1d3SBarry Smith 1353*9b92b1d3SBarry SmithFor GPUs (and even CPUs) you can use `MatSetPreallocationCOO()` and `MatSetValuesCOO()` for more rapid assembly. 1354*9b92b1d3SBarry Smith 1355*9b92b1d3SBarry Smith### How can I generate performance summaries with PETSc? 1356*9b92b1d3SBarry Smith 1357*9b92b1d3SBarry SmithUse this option at runtime: 1358*9b92b1d3SBarry Smith 1359*9b92b1d3SBarry Smith-l 1360*9b92b1d3SBarry Smith 1361*9b92b1d3SBarry Smithog_view 1362*9b92b1d3SBarry Smith 1363*9b92b1d3SBarry SmithOutputs a comprehensive timing, memory consumption, and communications digest 1364*9b92b1d3SBarry Smithfor your program. See the {ref}`profiling chapter <ch_profiling>` of the users 1365*9b92b1d3SBarry Smithmanual for information on interpreting the summary data. 1366*9b92b1d3SBarry Smith 1367*9b92b1d3SBarry Smith### How do I know the amount of time spent on each level of the multigrid solver/preconditioner? 1368*9b92b1d3SBarry Smith 1369*9b92b1d3SBarry SmithRun with `-log_view` and `-pc_mg_log` 1370*9b92b1d3SBarry Smith 1371*9b92b1d3SBarry Smith### Where do I get the input matrices for the examples? 1372*9b92b1d3SBarry Smith 1373*9b92b1d3SBarry SmithSome examples use `$DATAFILESPATH/matrices/medium` and other files. These test matrices 1374*9b92b1d3SBarry Smithin PETSc binary format can be found in the [datafiles repository](https://gitlab.com/petsc/datafiles). 1375*9b92b1d3SBarry Smith 1376*9b92b1d3SBarry Smith### When I dump some matrices and vectors to binary, I seem to be generating some empty files with `.info` extensions. What's the deal with these? 1377*9b92b1d3SBarry Smith 1378*9b92b1d3SBarry SmithPETSc binary viewers put some additional information into `.info` files like matrix 1379*9b92b1d3SBarry Smithblock size. It is harmless but if you *really* don't like it you can use 1380*9b92b1d3SBarry Smith`-viewer_binary_skip_info` or `PetscViewerBinarySkipInfo()`. 1381*9b92b1d3SBarry Smith 1382*9b92b1d3SBarry Smith:::{note} 1383*9b92b1d3SBarry SmithYou need to call `PetscViewerBinarySkipInfo()` before 1384*9b92b1d3SBarry Smith`PetscViewerFileSetName()`. In other words you **cannot** use 1385*9b92b1d3SBarry Smith`PetscViewerBinaryOpen()` directly. 1386*9b92b1d3SBarry Smith::: 1387*9b92b1d3SBarry Smith 1388*9b92b1d3SBarry Smith### Why is my parallel solver slower than my sequential solver, or I have poor speed-up? 1389*9b92b1d3SBarry Smith 1390*9b92b1d3SBarry SmithThis can happen for many reasons: 1391*9b92b1d3SBarry Smith 1392*9b92b1d3SBarry Smith1. Make sure it is truly the time in `KSPSolve()` that is slower (by running the code 1393*9b92b1d3SBarry Smith with `-log_view`). Often the slower time is in generating the matrix or some other 1394*9b92b1d3SBarry Smith operation. 1395*9b92b1d3SBarry Smith2. There must be enough work for each process to outweigh the communication time. We 1396*9b92b1d3SBarry Smith recommend an absolute minimum of about 10,000 unknowns per process, better is 20,000 or 1397*9b92b1d3SBarry Smith more. This is even more true when using multiple GPUs, where you need to have millions 1398*9b92b1d3SBarry Smith of unknowns per GPU. 1399*9b92b1d3SBarry Smith3. Make sure the {ref}`communication speed of the parallel computer 1400*9b92b1d3SBarry Smith <doc_faq_general_parallel>` is good enough for parallel solvers. 1401*9b92b1d3SBarry Smith4. Check the number of solver iterates with the parallel solver against the sequential 1402*9b92b1d3SBarry Smith solver. Most preconditioners require more iterations when used on more processes, this 1403*9b92b1d3SBarry Smith is particularly true for block Jacobi (the default parallel preconditioner). You can 1404*9b92b1d3SBarry Smith try `-pc_type asm` (`PCASM`) its iterations scale a bit better for more 1405*9b92b1d3SBarry Smith processes. You may also consider multigrid preconditioners like `PCMG` or BoomerAMG 1406*9b92b1d3SBarry Smith in `PCHYPRE`. 1407*9b92b1d3SBarry Smith 1408*9b92b1d3SBarry Smith(doc_faq_pipelined)= 1409*9b92b1d3SBarry Smith 1410*9b92b1d3SBarry Smith### What steps are necessary to make the pipelined solvers execute efficiently? 1411*9b92b1d3SBarry Smith 1412*9b92b1d3SBarry SmithPipelined solvers like `KSPPGMRES`, `KSPPIPECG`, `KSPPIPECR`, and `KSPGROPPCG` may 1413*9b92b1d3SBarry Smithrequire special MPI configuration to effectively overlap reductions with computation. In 1414*9b92b1d3SBarry Smithgeneral, this requires an MPI-3 implementation, an implementation that supports multiple 1415*9b92b1d3SBarry Smiththreads, and use of a "progress thread". Consult the documentation from your vendor or 1416*9b92b1d3SBarry Smithcomputing facility for more details. 1417*9b92b1d3SBarry Smith 1418*9b92b1d3SBarry Smith- Cray MPI MPT-5.6 MPI-3, by setting `$MPICH_MAX_THREAD_SAFETY` to "multiple" 1419*9b92b1d3SBarry Smith for threads, plus either `$MPICH_ASYNC_PROGRESS` or 1420*9b92b1d3SBarry Smith `$MPICH_NEMESIS_ASYNC_PROGRESS`. E.g. 1421*9b92b1d3SBarry Smith ```console 1422*9b92b1d3SBarry Smith $ export MPICH_MAX_THREAD_SAFETY=multiple 1423*9b92b1d3SBarry Smith $ export MPICH_ASYNC_PROGRESS=1 1424*9b92b1d3SBarry Smith $ export MPICH_NEMESIS_ASYNC_PROGRESS=1 1425*9b92b1d3SBarry Smith ``` 1426*9b92b1d3SBarry Smith 1427*9b92b1d3SBarry Smith- MPICH version 3.0 and later implements the MPI-3 standard and the default 1428*9b92b1d3SBarry Smith configuration supports use of threads. Use of a progress thread is configured by 1429*9b92b1d3SBarry Smith setting the environment variable `$MPICH_ASYNC_PROGRESS`. E.g. 1430*9b92b1d3SBarry Smith ```console 1431*9b92b1d3SBarry Smith $ export MPICH_ASYNC_PROGRESS=1 1432*9b92b1d3SBarry Smith ``` 1433*9b92b1d3SBarry Smith 1434*9b92b1d3SBarry Smith### When using PETSc in single precision mode (`--with-precision=single` when running `configure`) are the operations done in single or double precision? 1435*9b92b1d3SBarry Smith 1436*9b92b1d3SBarry SmithPETSc does **NOT** do any explicit conversion of single precision to double before 1437*9b92b1d3SBarry Smithperforming computations; it depends on the hardware and compiler for what happens. For 1438*9b92b1d3SBarry Smithexample, the compiler could choose to put the single precision numbers into the usual 1439*9b92b1d3SBarry Smithdouble precision registers and then use the usual double precision floating point unit. Or 1440*9b92b1d3SBarry Smithit could use SSE2 instructions that work directly on the single precision numbers. It is a 1441*9b92b1d3SBarry Smithbit of a mystery what decisions get made sometimes. There may be compiler flags in some 1442*9b92b1d3SBarry Smithcircumstances that can affect this. 1443*9b92b1d3SBarry Smith 1444*9b92b1d3SBarry Smith### Why is Newton's method (SNES) not converging, or converges slowly? 1445*9b92b1d3SBarry Smith 1446*9b92b1d3SBarry SmithNewton's method may not converge for many reasons, here are some of the most common: 1447*9b92b1d3SBarry Smith 1448*9b92b1d3SBarry Smith- The Jacobian is wrong (or correct in sequential but not in parallel). 1449*9b92b1d3SBarry Smith- The linear system is {ref}`not solved <doc_faq_execution_kspconv>` or is not solved 1450*9b92b1d3SBarry Smith accurately enough. 1451*9b92b1d3SBarry Smith- The Jacobian system has a singularity that the linear solver is not handling. 1452*9b92b1d3SBarry Smith- There is a bug in the function evaluation routine. 1453*9b92b1d3SBarry Smith- The function is not continuous or does not have continuous first derivatives (e.g. phase 1454*9b92b1d3SBarry Smith change or TVD limiters). 1455*9b92b1d3SBarry Smith- The equations may not have a solution (e.g. limit cycle instead of a steady state) or 1456*9b92b1d3SBarry Smith there may be a "hill" between the initial guess and the steady state (e.g. reactants 1457*9b92b1d3SBarry Smith must ignite and burn before reaching a steady state, but the steady-state residual will 1458*9b92b1d3SBarry Smith be larger during combustion). 1459*9b92b1d3SBarry Smith 1460*9b92b1d3SBarry SmithHere are some of the ways to help debug lack of convergence of Newton: 1461*9b92b1d3SBarry Smith 1462*9b92b1d3SBarry Smith- Run on one processor to see if the problem is only in parallel. 1463*9b92b1d3SBarry Smith 1464*9b92b1d3SBarry Smith- Run with `-info` to get more detailed information on the solution process. 1465*9b92b1d3SBarry Smith 1466*9b92b1d3SBarry Smith- Run with the options 1467*9b92b1d3SBarry Smith 1468*9b92b1d3SBarry Smith ```text 1469*9b92b1d3SBarry Smith -snes_monitor -ksp_monitor_true_residual -snes_converged_reason -ksp_converged_reason 1470*9b92b1d3SBarry Smith ``` 1471*9b92b1d3SBarry Smith 1472*9b92b1d3SBarry Smith - If the linear solve does not converge, check if the Jacobian is correct, then see 1473*9b92b1d3SBarry Smith {ref}`this question <doc_faq_execution_kspconv>`. 1474*9b92b1d3SBarry Smith - If the preconditioned residual converges, but the true residual does not, the 1475*9b92b1d3SBarry Smith preconditioner may be singular. 1476*9b92b1d3SBarry Smith - If the linear solve converges well, but the line search fails, the Jacobian may be 1477*9b92b1d3SBarry Smith incorrect. 1478*9b92b1d3SBarry Smith 1479*9b92b1d3SBarry Smith- Run with `-pc_type lu` or `-pc_type svd` to see if the problem is a poor linear 1480*9b92b1d3SBarry Smith solver. 1481*9b92b1d3SBarry Smith 1482*9b92b1d3SBarry Smith- Run with `-mat_view` or `-mat_view draw` to see if the Jacobian looks reasonable. 1483*9b92b1d3SBarry Smith 1484*9b92b1d3SBarry Smith- Run with `-snes_test_jacobian -snes_test_jacobian_view` to see if the Jacobian you are 1485*9b92b1d3SBarry Smith using is wrong. Compare the output when you add `-mat_fd_type ds` to see if the result 1486*9b92b1d3SBarry Smith is sensitive to the choice of differencing parameter. 1487*9b92b1d3SBarry Smith 1488*9b92b1d3SBarry Smith- Run with `-snes_mf_operator -pc_type lu` to see if the Jacobian you are using is 1489*9b92b1d3SBarry Smith wrong. If the problem is too large for a direct solve, try 1490*9b92b1d3SBarry Smith 1491*9b92b1d3SBarry Smith ```text 1492*9b92b1d3SBarry Smith -snes_mf_operator -pc_type ksp -ksp_ksp_rtol 1e-12. 1493*9b92b1d3SBarry Smith ``` 1494*9b92b1d3SBarry Smith 1495*9b92b1d3SBarry Smith Compare the output when you add `-mat_mffd_type ds` to see if the result is sensitive 1496*9b92b1d3SBarry Smith to choice of differencing parameter. 1497*9b92b1d3SBarry Smith 1498*9b92b1d3SBarry Smith- Run with `-snes_linesearch_monitor` to see if the line search is failing (this is 1499*9b92b1d3SBarry Smith usually a sign of a bad Jacobian). Use `-info` in PETSc 3.1 and older versions, 1500*9b92b1d3SBarry Smith `-snes_ls_monitor` in PETSc 3.2 and `-snes_linesearch_monitor` in PETSc 3.3 and 1501*9b92b1d3SBarry Smith later. 1502*9b92b1d3SBarry Smith 1503*9b92b1d3SBarry SmithHere are some ways to help the Newton process if everything above checks out: 1504*9b92b1d3SBarry Smith 1505*9b92b1d3SBarry Smith- Run with grid sequencing (`-snes_grid_sequence` if working with a `DM` is all you 1506*9b92b1d3SBarry Smith need) to generate better initial guess on your finer mesh. 1507*9b92b1d3SBarry Smith 1508*9b92b1d3SBarry Smith- Run with quad precision, i.e. 1509*9b92b1d3SBarry Smith 1510*9b92b1d3SBarry Smith ```console 1511*9b92b1d3SBarry Smith $ ./configure --with-precision=__float128 --download-f2cblaslapack 1512*9b92b1d3SBarry Smith ``` 1513*9b92b1d3SBarry Smith 1514*9b92b1d3SBarry Smith :::{note} 1515*9b92b1d3SBarry Smith quad precision requires PETSc 3.2 and later and recent versions of the GNU compilers. 1516*9b92b1d3SBarry Smith ::: 1517*9b92b1d3SBarry Smith 1518*9b92b1d3SBarry Smith- Change the units (nondimensionalization), boundary condition scaling, or formulation so 1519*9b92b1d3SBarry Smith that the Jacobian is better conditioned. See [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and 1520*9b92b1d3SBarry Smith Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127). 1521*9b92b1d3SBarry Smith 1522*9b92b1d3SBarry Smith- Mollify features in the function that do not have continuous first derivatives (often 1523*9b92b1d3SBarry Smith occurs when there are "if" statements in the residual evaluation, e.g. phase change or 1524*9b92b1d3SBarry Smith TVD limiters). Use a variational inequality solver (`SNESVINEWTONRSLS`) if the 1525*9b92b1d3SBarry Smith discontinuities are of fundamental importance. 1526*9b92b1d3SBarry Smith 1527*9b92b1d3SBarry Smith- Try a trust region method (`-ts_type tr`, may have to adjust parameters). 1528*9b92b1d3SBarry Smith 1529*9b92b1d3SBarry Smith- Run with some continuation parameter from a point where you know the solution, see 1530*9b92b1d3SBarry Smith `TSPSEUDO` for steady-states. 1531*9b92b1d3SBarry Smith 1532*9b92b1d3SBarry Smith- There are homotopy solver packages like PHCpack that can get you all possible solutions 1533*9b92b1d3SBarry Smith (and tell you that it has found them all) but those are not scalable and cannot solve 1534*9b92b1d3SBarry Smith anything but small problems. 1535*9b92b1d3SBarry Smith 1536*9b92b1d3SBarry Smith(doc_faq_execution_kspconv)= 1537*9b92b1d3SBarry Smith 1538*9b92b1d3SBarry Smith### Why is the linear solver (KSP) not converging, or converges slowly? 1539*9b92b1d3SBarry Smith 1540*9b92b1d3SBarry Smith:::{tip} 1541*9b92b1d3SBarry SmithAlways run with `-ksp_converged_reason -ksp_monitor_true_residual` when trying to 1542*9b92b1d3SBarry Smithlearn why a method is not converging! 1543*9b92b1d3SBarry Smith::: 1544*9b92b1d3SBarry Smith 1545*9b92b1d3SBarry SmithCommon reasons for KSP not converging are: 1546*9b92b1d3SBarry Smith 1547*9b92b1d3SBarry Smith- A symmetric method is being used for a non-symmetric problem. 1548*9b92b1d3SBarry Smith 1549*9b92b1d3SBarry Smith- The equations are singular by accident (e.g. forgot to impose boundary 1550*9b92b1d3SBarry Smith conditions). Check this for a small problem using `-pc_type svd -pc_svd_monitor`. 1551*9b92b1d3SBarry Smith 1552*9b92b1d3SBarry Smith- The equations are intentionally singular (e.g. constant null space), but the Krylov 1553*9b92b1d3SBarry Smith method was not informed, see `MatSetNullSpace()`. Always inform your local Krylov 1554*9b92b1d3SBarry Smith subspace solver of any change of singularity. Failure to do so will result in the 1555*9b92b1d3SBarry Smith immediate revocation of your computing and keyboard operator licenses, as well as 1556*9b92b1d3SBarry Smith a stern talking-to by the nearest Krylov Subspace Method representative. 1557*9b92b1d3SBarry Smith 1558*9b92b1d3SBarry Smith- The equations are intentionally singular and `MatSetNullSpace()` was used, but the 1559*9b92b1d3SBarry Smith right-hand side is not consistent. You may have to call `MatNullSpaceRemove()` on the 1560*9b92b1d3SBarry Smith right-hand side before calling `KSPSolve()`. See `MatSetTransposeNullSpace()`. 1561*9b92b1d3SBarry Smith 1562*9b92b1d3SBarry Smith- The equations are indefinite so that standard preconditioners don't work. Usually you 1563*9b92b1d3SBarry Smith will know this from the physics, but you can check with 1564*9b92b1d3SBarry Smith 1565*9b92b1d3SBarry Smith ```text 1566*9b92b1d3SBarry Smith -ksp_compute_eigenvalues -ksp_gmres_restart 1000 -pc_type none 1567*9b92b1d3SBarry Smith ``` 1568*9b92b1d3SBarry Smith 1569*9b92b1d3SBarry Smith For simple saddle point problems, try 1570*9b92b1d3SBarry Smith 1571*9b92b1d3SBarry Smith ```text 1572*9b92b1d3SBarry Smith -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_detect_saddle_point 1573*9b92b1d3SBarry Smith ``` 1574*9b92b1d3SBarry Smith 1575*9b92b1d3SBarry Smith For more difficult problems, read the literature to find robust methods and ask 1576*9b92b1d3SBarry Smith <mailto:petsc-users@mcs.anl.gov> or <mailto:petsc-maint@mcs.anl.gov> if you want advice about how to 1577*9b92b1d3SBarry Smith implement them. 1578*9b92b1d3SBarry Smith 1579*9b92b1d3SBarry Smith- If the method converges in preconditioned residual, but not in true residual, the 1580*9b92b1d3SBarry Smith preconditioner is likely singular or nearly so. This is common for saddle point problems 1581*9b92b1d3SBarry Smith (e.g. incompressible flow) or strongly nonsymmetric operators (e.g. low-Mach hyperbolic 1582*9b92b1d3SBarry Smith problems with large time steps). 1583*9b92b1d3SBarry Smith 1584*9b92b1d3SBarry Smith- The preconditioner is too weak or is unstable. See if `-pc_type asm -sub_pc_type lu` 1585*9b92b1d3SBarry Smith improves the convergence rate. If GMRES is losing too much progress in the restart, see 1586*9b92b1d3SBarry Smith if longer restarts help `-ksp_gmres_restart 300`. If a transpose is available, try 1587*9b92b1d3SBarry Smith `-ksp_type bcgs` or other methods that do not require a restart. 1588*9b92b1d3SBarry Smith 1589*9b92b1d3SBarry Smith :::{note} 1590*9b92b1d3SBarry Smith Unfortunately convergence with these methods is frequently erratic. 1591*9b92b1d3SBarry Smith ::: 1592*9b92b1d3SBarry Smith 1593*9b92b1d3SBarry Smith- The preconditioner is nonlinear (e.g. a nested iterative solve), try `-ksp_type 1594*9b92b1d3SBarry Smith fgmres` or `-ksp_type gcr`. 1595*9b92b1d3SBarry Smith 1596*9b92b1d3SBarry Smith- You are using geometric multigrid, but some equations (often boundary conditions) are 1597*9b92b1d3SBarry Smith not scaled compatibly between levels. Try `-pc_mg_galerkin` both to algebraically 1598*9b92b1d3SBarry Smith construct a correctly scaled coarse operator or make sure that all the equations are 1599*9b92b1d3SBarry Smith scaled in the same way if you want to use rediscretized coarse levels. 1600*9b92b1d3SBarry Smith 1601*9b92b1d3SBarry Smith- The matrix is very ill-conditioned. Check the {ref}`condition number <doc_faq_usage_condnum>`. 1602*9b92b1d3SBarry Smith 1603*9b92b1d3SBarry Smith - Try to improve it by choosing the relative scaling of components/boundary conditions. 1604*9b92b1d3SBarry Smith - Try `-ksp_diagonal_scale -ksp_diagonal_scale_fix`. 1605*9b92b1d3SBarry Smith - Perhaps change the formulation of the problem to produce more friendly algebraic 1606*9b92b1d3SBarry Smith equations. 1607*9b92b1d3SBarry Smith 1608*9b92b1d3SBarry Smith- Change the units (nondimensionalization), boundary condition scaling, or formulation so 1609*9b92b1d3SBarry Smith that the Jacobian is better conditioned. See [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and 1610*9b92b1d3SBarry Smith Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127). 1611*9b92b1d3SBarry Smith 1612*9b92b1d3SBarry Smith- Classical Gram-Schmidt is becoming unstable, try `-ksp_gmres_modifiedgramschmidt` or 1613*9b92b1d3SBarry Smith use a method that orthogonalizes differently, e.g. `-ksp_type gcr`. 1614*9b92b1d3SBarry Smith 1615*9b92b1d3SBarry Smith### I get the error message: Actual argument at (1) to assumed-type dummy is of derived type with type-bound or FINAL procedures 1616*9b92b1d3SBarry Smith 1617*9b92b1d3SBarry SmithUse the following code-snippet: 1618*9b92b1d3SBarry Smith 1619*9b92b1d3SBarry Smith```fortran 1620*9b92b1d3SBarry Smithmodule context_module 1621*9b92b1d3SBarry Smith#include petsc/finclude/petsc.h 1622*9b92b1d3SBarry Smithuse petsc 1623*9b92b1d3SBarry Smithimplicit none 1624*9b92b1d3SBarry Smithprivate 1625*9b92b1d3SBarry Smithtype, public :: context_type 1626*9b92b1d3SBarry Smith private 1627*9b92b1d3SBarry Smith PetscInt :: foo 1628*9b92b1d3SBarry Smithcontains 1629*9b92b1d3SBarry Smith procedure, public :: init => context_init 1630*9b92b1d3SBarry Smithend type context_type 1631*9b92b1d3SBarry Smithcontains 1632*9b92b1d3SBarry Smithsubroutine context_init(self, foo) 1633*9b92b1d3SBarry Smith class(context_type), intent(in out) :: self 1634*9b92b1d3SBarry Smith PetscInt, intent(in) :: foo 1635*9b92b1d3SBarry Smith self%foo = foo 1636*9b92b1d3SBarry Smithend subroutine context_init 1637*9b92b1d3SBarry Smithend module context_module 1638*9b92b1d3SBarry Smith 1639*9b92b1d3SBarry Smith!------------------------------------------------------------------------ 1640*9b92b1d3SBarry Smith 1641*9b92b1d3SBarry Smithprogram test_snes 1642*9b92b1d3SBarry Smithuse,intrinsic :: iso_c_binding 1643*9b92b1d3SBarry Smithuse petsc 1644*9b92b1d3SBarry Smithuse context_module 1645*9b92b1d3SBarry Smithimplicit none 1646*9b92b1d3SBarry Smith 1647*9b92b1d3SBarry SmithSNES :: snes 1648*9b92b1d3SBarry Smithtype(context_type),target :: context 1649*9b92b1d3SBarry Smithtype(c_ptr) :: contextOut 1650*9b92b1d3SBarry SmithPetscErrorCode :: ierr 1651*9b92b1d3SBarry Smith 1652*9b92b1d3SBarry Smithcall PetscInitialize(PETSC_NULL_CHARACTER, ierr) 1653*9b92b1d3SBarry Smithcall SNESCreate(PETSC_COMM_WORLD, snes, ierr) 1654*9b92b1d3SBarry Smithcall context%init(1) 1655*9b92b1d3SBarry Smith 1656*9b92b1d3SBarry SmithcontextOut = c_loc(context) ! contextOut is a C pointer on context 1657*9b92b1d3SBarry Smith 1658*9b92b1d3SBarry Smithcall SNESSetConvergenceTest(snes, convergence, contextOut, PETSC_NULL_FUNCTION, ierr) 1659*9b92b1d3SBarry Smith 1660*9b92b1d3SBarry Smithcall SNESDestroy(snes, ierr) 1661*9b92b1d3SBarry Smithcall PetscFinalize(ierr) 1662*9b92b1d3SBarry Smith 1663*9b92b1d3SBarry Smithcontains 1664*9b92b1d3SBarry Smith 1665*9b92b1d3SBarry Smithsubroutine convergence(snes, num_iterations, xnorm, pnorm,fnorm, reason, contextIn, ierr) 1666*9b92b1d3SBarry SmithSNES, intent(in) :: snes 1667*9b92b1d3SBarry Smith 1668*9b92b1d3SBarry SmithPetscInt, intent(in) :: num_iterations 1669*9b92b1d3SBarry SmithPetscReal, intent(in) :: xnorm, pnorm, fnorm 1670*9b92b1d3SBarry SmithSNESConvergedReason, intent(out) :: reason 1671*9b92b1d3SBarry Smithtype(c_ptr), intent(in out) :: contextIn 1672*9b92b1d3SBarry Smithtype(context_type), pointer :: context 1673*9b92b1d3SBarry SmithPetscErrorCode, intent(out) :: ierr 1674*9b92b1d3SBarry Smith 1675*9b92b1d3SBarry Smithcall c_f_pointer(contextIn,context) ! convert the C pointer to a Fortran pointer to use context as in the main program 1676*9b92b1d3SBarry Smithreason = 0 1677*9b92b1d3SBarry Smithierr = 0 1678*9b92b1d3SBarry Smithend subroutine convergence 1679*9b92b1d3SBarry Smithend program test_snes 1680*9b92b1d3SBarry Smith``` 1681*9b92b1d3SBarry Smith 1682*9b92b1d3SBarry Smith### In C++ I get a crash on `VecDestroy()` (or some other PETSc object) at the end of the program 1683*9b92b1d3SBarry Smith 1684*9b92b1d3SBarry SmithThis can happen when the destructor for a C++ class is automatically called at the end of 1685*9b92b1d3SBarry Smiththe program after `PetscFinalize()`. Use the following code-snippet: 1686*9b92b1d3SBarry Smith 1687*9b92b1d3SBarry Smith``` 1688*9b92b1d3SBarry Smithmain() 1689*9b92b1d3SBarry Smith{ 1690*9b92b1d3SBarry Smith PetscCall(PetscInitialize()); 1691*9b92b1d3SBarry Smith { 1692*9b92b1d3SBarry Smith your variables 1693*9b92b1d3SBarry Smith your code 1694*9b92b1d3SBarry Smith 1695*9b92b1d3SBarry Smith ... /* all your destructors are called here automatically by C++ so they work correctly */ 1696*9b92b1d3SBarry Smith } 1697*9b92b1d3SBarry Smith PetscCall(PetscFinalize()); 1698*9b92b1d3SBarry Smith return 0 1699*9b92b1d3SBarry Smith} 1700*9b92b1d3SBarry Smith``` 1701*9b92b1d3SBarry Smith 1702*9b92b1d3SBarry Smith______________________________________________________________________ 1703*9b92b1d3SBarry Smith 1704*9b92b1d3SBarry Smith## Debugging 1705*9b92b1d3SBarry Smith 1706*9b92b1d3SBarry Smith### What does the message hwloc/linux: Ignoring PCI device with non-16bit domain mean? 1707*9b92b1d3SBarry Smith 1708*9b92b1d3SBarry SmithThis is printed by the hwloc library that is used by some MPI implementations. It can be ignored. 1709*9b92b1d3SBarry SmithTo prevent the message from always being printed set the environmental variable `HWLOC_HIDE_ERRORS` to 2. 1710*9b92b1d3SBarry SmithFor example 1711*9b92b1d3SBarry Smith 1712*9b92b1d3SBarry Smith``` 1713*9b92b1d3SBarry Smithexport HWLOC_HIDE_ERRORS=2 1714*9b92b1d3SBarry Smith``` 1715*9b92b1d3SBarry Smith 1716*9b92b1d3SBarry Smithwhich can be added to your login profile file such as `~/.bashrc`. 1717*9b92b1d3SBarry Smith 1718*9b92b1d3SBarry Smith### How do I turn off PETSc signal handling so I can use the `-C` Option On `xlf`? 1719*9b92b1d3SBarry Smith 1720*9b92b1d3SBarry SmithImmediately after calling `PetscInitialize()` call `PetscPopSignalHandler()`. 1721*9b92b1d3SBarry Smith 1722*9b92b1d3SBarry SmithSome Fortran compilers including the IBM xlf, xlF etc compilers have a compile option 1723*9b92b1d3SBarry Smith(`-C` for IBM's) that causes all array access in Fortran to be checked that they are 1724*9b92b1d3SBarry Smithin-bounds. This is a great feature but does require that the array dimensions be set 1725*9b92b1d3SBarry Smithexplicitly, not with a \*. 1726*9b92b1d3SBarry Smith 1727*9b92b1d3SBarry Smith### How do I debug if `-start_in_debugger` does not work on my machine? 1728*9b92b1d3SBarry Smith 1729*9b92b1d3SBarry SmithThe script <https://github.com/Azrael3000/tmpi> can be used to run multiple MPI 1730*9b92b1d3SBarry Smithranks in the debugger using tmux. 1731*9b92b1d3SBarry Smith 1732*9b92b1d3SBarry SmithOn newer macOS machines - one has to be in admin group to be able to use debugger. 1733*9b92b1d3SBarry Smith 1734*9b92b1d3SBarry SmithOn newer Ubuntu linux machines - one has to disable `ptrace_scope` with 1735*9b92b1d3SBarry Smith 1736*9b92b1d3SBarry Smith```console 1737*9b92b1d3SBarry Smith$ sudo echo 0 > /proc/sys/kernel/yama/ptrace_scope 1738*9b92b1d3SBarry Smith``` 1739*9b92b1d3SBarry Smith 1740*9b92b1d3SBarry Smithto get start in debugger working. 1741*9b92b1d3SBarry Smith 1742*9b92b1d3SBarry SmithIf `-start_in_debugger` does not work on your OS, for a uniprocessor job, just 1743*9b92b1d3SBarry Smithtry the debugger directly, for example: `gdb ex1`. You can also use [TotalView](https://totalview.io/products/totalview) which is a good graphical parallel debugger. 1744*9b92b1d3SBarry Smith 1745*9b92b1d3SBarry Smith### How do I see where my code is hanging? 1746*9b92b1d3SBarry Smith 1747*9b92b1d3SBarry SmithYou can use the `-start_in_debugger` option to start all processes in the debugger (each 1748*9b92b1d3SBarry Smithwill come up in its own xterm) or run in [TotalView](https://totalview.io/products/totalview). Then use `cont` (for continue) in each 1749*9b92b1d3SBarry Smithxterm. Once you are sure that the program is hanging, hit control-c in each xterm and then 1750*9b92b1d3SBarry Smithuse 'where' to print a stack trace for each process. 1751*9b92b1d3SBarry Smith 1752*9b92b1d3SBarry Smith### How can I inspect PETSc vector and matrix values when in the debugger? 1753*9b92b1d3SBarry Smith 1754*9b92b1d3SBarry SmithI will illustrate this with `gdb`, but it should be similar on other debuggers. You can 1755*9b92b1d3SBarry Smithlook at local `Vec` values directly by obtaining the array. For a `Vec` v, we can 1756*9b92b1d3SBarry Smithprint all local values using: 1757*9b92b1d3SBarry Smith 1758*9b92b1d3SBarry Smith```console 1759*9b92b1d3SBarry Smith(gdb) p ((Vec_Seq*) v->data)->array[0]@v->map.n 1760*9b92b1d3SBarry Smith``` 1761*9b92b1d3SBarry Smith 1762*9b92b1d3SBarry SmithHowever, this becomes much more complicated for a matrix. Therefore, it is advisable to use the default viewer to look at the object. For a `Vec` v and a `Mat` m, this would be: 1763*9b92b1d3SBarry Smith 1764*9b92b1d3SBarry Smith```console 1765*9b92b1d3SBarry Smith(gdb) call VecView(v, 0) 1766*9b92b1d3SBarry Smith(gdb) call MatView(m, 0) 1767*9b92b1d3SBarry Smith``` 1768*9b92b1d3SBarry Smith 1769*9b92b1d3SBarry Smithor with a communicator other than `MPI_COMM_WORLD`: 1770*9b92b1d3SBarry Smith 1771*9b92b1d3SBarry Smith```console 1772*9b92b1d3SBarry Smith(gdb) call MatView(m, PETSC_VIEWER_STDOUT_(m->comm)) 1773*9b92b1d3SBarry Smith``` 1774*9b92b1d3SBarry Smith 1775*9b92b1d3SBarry SmithTotalview 8.8.0+ has a new feature that allows libraries to provide their own code to 1776*9b92b1d3SBarry Smithdisplay objects in the debugger. Thus in theory each PETSc object, `Vec`, `Mat` etc 1777*9b92b1d3SBarry Smithcould have custom code to print values in the object. We have only done this for the most 1778*9b92b1d3SBarry Smithelementary display of `Vec` and `Mat`. See the routine `TV_display_type()` in 1779*9b92b1d3SBarry Smith`src/vec/vec/interface/vector.c` for an example of how these may be written. Contact us 1780*9b92b1d3SBarry Smithif you would like to add more. 1781*9b92b1d3SBarry Smith 1782*9b92b1d3SBarry Smith### How can I find the cause of floating point exceptions like not-a-number (NaN) or infinity? 1783*9b92b1d3SBarry Smith 1784*9b92b1d3SBarry SmithThe best way to locate floating point exceptions is to use a debugger. On supported 1785*9b92b1d3SBarry Smitharchitectures (including Linux and glibc-based systems), just run in a debugger and pass 1786*9b92b1d3SBarry Smith`-fp_trap` to the PETSc application. This will activate signaling exceptions and the 1787*9b92b1d3SBarry Smithdebugger will break on the line that first divides by zero or otherwise generates an 1788*9b92b1d3SBarry Smithexceptions. 1789*9b92b1d3SBarry Smith 1790*9b92b1d3SBarry SmithWithout a debugger, running with `-fp_trap` in debug mode will only identify the 1791*9b92b1d3SBarry Smithfunction in which the error occurred, but not the line or the type of exception. If 1792*9b92b1d3SBarry Smith`-fp_trap` is not supported on your architecture, consult the documentation for your 1793*9b92b1d3SBarry Smithdebugger since there is likely a way to have it catch exceptions. 1794*9b92b1d3SBarry Smith 1795*9b92b1d3SBarry Smith(error_libimf)= 1796*9b92b1d3SBarry Smith 1797*9b92b1d3SBarry Smith### Error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory 1798*9b92b1d3SBarry Smith 1799*9b92b1d3SBarry SmithThe Intel compilers use shared libraries (like libimf) that cannot be found, by default, at run 1800*9b92b1d3SBarry Smithtime. When using the Intel compilers (and running the resulting code) you must make sure 1801*9b92b1d3SBarry Smiththat the proper Intel initialization scripts are run. This is usually done by adding some 1802*9b92b1d3SBarry Smithcode into your `.cshrc`, `.bashrc`, `.profile` etc file. Sometimes on batch file 1803*9b92b1d3SBarry Smithsystems that do now access your initialization files (like .cshrc) you must include the 1804*9b92b1d3SBarry Smithinitialization calls in your batch file submission. 1805*9b92b1d3SBarry Smith 1806*9b92b1d3SBarry SmithFor example, on my Mac using `csh` I have the following in my `.cshrc` file: 1807*9b92b1d3SBarry Smith 1808*9b92b1d3SBarry Smith```csh 1809*9b92b1d3SBarry Smithsource /opt/intel/cc/10.1.012/bin/iccvars.csh 1810*9b92b1d3SBarry Smithsource /opt/intel/fc/10.1.012/bin/ifortvars.csh 1811*9b92b1d3SBarry Smithsource /opt/intel/idb/10.1.012/bin/idbvars.csh 1812*9b92b1d3SBarry Smith``` 1813*9b92b1d3SBarry Smith 1814*9b92b1d3SBarry SmithAnd in my `.profile` I have 1815*9b92b1d3SBarry Smith 1816*9b92b1d3SBarry Smith```csh 1817*9b92b1d3SBarry Smithsource /opt/intel/cc/10.1.012/bin/iccvars.sh 1818*9b92b1d3SBarry Smithsource /opt/intel/fc/10.1.012/bin/ifortvars.sh 1819*9b92b1d3SBarry Smithsource /opt/intel/idb/10.1.012/bin/idbvars.sh 1820*9b92b1d3SBarry Smith``` 1821*9b92b1d3SBarry Smith 1822*9b92b1d3SBarry Smith(object_type_not_set)= 1823*9b92b1d3SBarry Smith 1824*9b92b1d3SBarry Smith### What does "Object Type Not Set: Argument # N" Mean? 1825*9b92b1d3SBarry Smith 1826*9b92b1d3SBarry SmithMany operations on PETSc objects require that the specific type of the object be set before the operations is performed. You must call `XXXSetType()` or `XXXSetFromOptions()` before you make the offending call. For example 1827*9b92b1d3SBarry Smith 1828*9b92b1d3SBarry Smith``` 1829*9b92b1d3SBarry SmithMat A; 1830*9b92b1d3SBarry Smith 1831*9b92b1d3SBarry SmithPetscCall(MatCreate(PETSC_COMM_WORLD, &A)); 1832*9b92b1d3SBarry SmithPetscCall(MatSetValues(A,....)); 1833*9b92b1d3SBarry Smith``` 1834*9b92b1d3SBarry Smith 1835*9b92b1d3SBarry Smithwill not work. You must add `MatSetType()` or `MatSetFromOptions()` before the call to `MatSetValues()`. I.e. 1836*9b92b1d3SBarry Smith 1837*9b92b1d3SBarry Smith``` 1838*9b92b1d3SBarry SmithMat A; 1839*9b92b1d3SBarry Smith 1840*9b92b1d3SBarry SmithPetscCall(MatCreate(PETSC_COMM_WORLD, &A)); 1841*9b92b1d3SBarry Smith 1842*9b92b1d3SBarry SmithPetscCall(MatSetType(A, MATAIJ)); 1843*9b92b1d3SBarry Smith/* Will override MatSetType() */ 1844*9b92b1d3SBarry SmithPetscCall(MatSetFromOptions()); 1845*9b92b1d3SBarry Smith 1846*9b92b1d3SBarry SmithPetscCall(MatSetValues(A,....)); 1847*9b92b1d3SBarry Smith``` 1848*9b92b1d3SBarry Smith 1849*9b92b1d3SBarry Smith(split_ownership)= 1850*9b92b1d3SBarry Smith 1851*9b92b1d3SBarry Smith### What does error detected in `PetscSplitOwnership()` about "sum of local lengths ...": mean? 1852*9b92b1d3SBarry Smith 1853*9b92b1d3SBarry SmithIn a previous call to `VecSetSizes()`, `MatSetSizes()`, `VecCreateXXX()` or 1854*9b92b1d3SBarry Smith`MatCreateXXX()` you passed in local and global sizes that do not make sense for the 1855*9b92b1d3SBarry Smithcorrect number of processors. For example if you pass in a local size of 2 and a global 1856*9b92b1d3SBarry Smithsize of 100 and run on two processors, this cannot work since the sum of the local sizes 1857*9b92b1d3SBarry Smithis 4, not 100. 1858*9b92b1d3SBarry Smith 1859*9b92b1d3SBarry Smith(valgrind)= 1860*9b92b1d3SBarry Smith 1861*9b92b1d3SBarry Smith### What does "corrupt argument" or "caught signal" Or "SEGV" Or "segmentation violation" Or "bus error" mean? Can I use Valgrind or CUDA-Memcheck to debug memory corruption issues? 1862*9b92b1d3SBarry Smith 1863*9b92b1d3SBarry SmithSometimes it can mean an argument to a function is invalid. In Fortran this may be caused 1864*9b92b1d3SBarry Smithby forgetting to list an argument in the call, especially the final `PetscErrorCode`. 1865*9b92b1d3SBarry Smith 1866*9b92b1d3SBarry SmithOtherwise it is usually caused by memory corruption; that is somewhere the code is writing 1867*9b92b1d3SBarry Smithout of array bounds. To track this down rerun the debug version of the code with the 1868*9b92b1d3SBarry Smithoption `-malloc_debug`. Occasionally the code may crash only with the optimized version, 1869*9b92b1d3SBarry Smithin that case run the optimized version with `-malloc_debug`. If you determine the 1870*9b92b1d3SBarry Smithproblem is from memory corruption you can put the macro CHKMEMQ in the code near the crash 1871*9b92b1d3SBarry Smithto determine exactly what line is causing the problem. 1872*9b92b1d3SBarry Smith 1873*9b92b1d3SBarry SmithIf `-malloc_debug` does not help: on NVIDIA CUDA systems you can use <https://docs.nvidia.com/cuda/cuda-memcheck/index.html> 1874*9b92b1d3SBarry Smith 1875*9b92b1d3SBarry SmithIf `-malloc_debug` does not help: on GNU/Linux (not macOS machines) - you can 1876*9b92b1d3SBarry Smithuse [valgrind](http://valgrind.org). Follow the below instructions: 1877*9b92b1d3SBarry Smith 1878*9b92b1d3SBarry Smith1. `configure` PETSc with `--download-mpich --with-debugging` (you can use other MPI implementations but most produce spurious Valgrind messages) 1879*9b92b1d3SBarry Smith 1880*9b92b1d3SBarry Smith2. Compile your application code with this build of PETSc. 1881*9b92b1d3SBarry Smith 1882*9b92b1d3SBarry Smith3. Run with Valgrind. 1883*9b92b1d3SBarry Smith 1884*9b92b1d3SBarry Smith ```console 1885*9b92b1d3SBarry Smith $ $PETSC_DIR/lib/petsc/bin/petscmpiexec -valgrind -n NPROC PETSCPROGRAMNAME PROGRAMOPTIONS 1886*9b92b1d3SBarry Smith ``` 1887*9b92b1d3SBarry Smith 1888*9b92b1d3SBarry Smith or 1889*9b92b1d3SBarry Smith 1890*9b92b1d3SBarry Smith ```console 1891*9b92b1d3SBarry Smith $ mpiexec -n NPROC valgrind --tool=memcheck -q --num-callers=20 \ 1892*9b92b1d3SBarry Smith --suppressions=$PETSC_DIR/share/petsc/suppressions/valgrind \ 1893*9b92b1d3SBarry Smith --log-file=valgrind.log.%p PETSCPROGRAMNAME -malloc off PROGRAMOPTIONS 1894*9b92b1d3SBarry Smith ``` 1895*9b92b1d3SBarry Smith 1896*9b92b1d3SBarry Smith:::{note} 1897*9b92b1d3SBarry Smith- option `--with-debugging` enables valgrind to give stack trace with additional 1898*9b92b1d3SBarry Smith source-file:line-number info. 1899*9b92b1d3SBarry Smith- option `--download-mpich` is valgrind clean, other MPI builds are not valgrind clean. 1900*9b92b1d3SBarry Smith- when `--download-mpich` is used - mpiexec will be in `$PETSC_ARCH/bin` 1901*9b92b1d3SBarry Smith- `--log-file=valgrind.log.%p` option tells valgrind to store the output from each 1902*9b92b1d3SBarry Smith process in a different file \[as %p i.e PID, is different for each MPI process. 1903*9b92b1d3SBarry Smith- `memcheck` will not find certain array access that violate static array 1904*9b92b1d3SBarry Smith declarations so if memcheck runs clean you can try the `--tool=exp-ptrcheck` 1905*9b92b1d3SBarry Smith instead. 1906*9b92b1d3SBarry Smith::: 1907*9b92b1d3SBarry Smith 1908*9b92b1d3SBarry SmithYou might also consider using <http://drmemory.org> which has support for GNU/Linux, Apple 1909*9b92b1d3SBarry SmithMac OS and Microsoft Windows machines. (Note we haven't tried this ourselves). 1910*9b92b1d3SBarry Smith 1911*9b92b1d3SBarry Smith(zeropivot)= 1912*9b92b1d3SBarry Smith 1913*9b92b1d3SBarry Smith### What does "detected zero pivot in LU factorization" mean? 1914*9b92b1d3SBarry Smith 1915*9b92b1d3SBarry SmithA zero pivot in LU, ILU, Cholesky, or ICC sparse factorization does not always mean that 1916*9b92b1d3SBarry Smiththe matrix is singular. You can use 1917*9b92b1d3SBarry Smith 1918*9b92b1d3SBarry Smith```text 1919*9b92b1d3SBarry Smith-pc_factor_shift_type nonzero -pc_factor_shift_amount [amount] 1920*9b92b1d3SBarry Smith``` 1921*9b92b1d3SBarry Smith 1922*9b92b1d3SBarry Smithor 1923*9b92b1d3SBarry Smith 1924*9b92b1d3SBarry Smith```text 1925*9b92b1d3SBarry Smith-pc_factor_shift_type positive_definite -[level]_pc_factor_shift_type nonzero 1926*9b92b1d3SBarry Smith -pc_factor_shift_amount [amount] 1927*9b92b1d3SBarry Smith``` 1928*9b92b1d3SBarry Smith 1929*9b92b1d3SBarry Smithor 1930*9b92b1d3SBarry Smith 1931*9b92b1d3SBarry Smith```text 1932*9b92b1d3SBarry Smith-[level]_pc_factor_shift_type positive_definite 1933*9b92b1d3SBarry Smith``` 1934*9b92b1d3SBarry Smith 1935*9b92b1d3SBarry Smithto prevent the zero pivot. [level] is "sub" when lu, ilu, cholesky, or icc are employed in 1936*9b92b1d3SBarry Smitheach individual block of the bjacobi or ASM preconditioner. [level] is "mg_levels" or 1937*9b92b1d3SBarry Smith"mg_coarse" when lu, ilu, cholesky, or icc are used inside multigrid smoothers or to the 1938*9b92b1d3SBarry Smithcoarse grid solver. See `PCFactorSetShiftType()`, `PCFactorSetShiftAmount()`. 1939*9b92b1d3SBarry Smith 1940*9b92b1d3SBarry SmithThis error can also happen if your matrix is singular, see `MatSetNullSpace()` for how 1941*9b92b1d3SBarry Smithto handle this. If this error occurs in the zeroth row of the matrix, it is likely you 1942*9b92b1d3SBarry Smithhave an error in the code that generates the matrix. 1943*9b92b1d3SBarry Smith 1944*9b92b1d3SBarry Smith### You create draw windows or `PETSCVIEWERDRAW` windows or use options `-ksp_monitor draw::draw_lg` or `-snes_monitor draw::draw_lg` and the program seems to run OK but windows never open 1945*9b92b1d3SBarry Smith 1946*9b92b1d3SBarry SmithThe libraries were compiled without support for X windows. Make sure that `configure` 1947*9b92b1d3SBarry Smithwas run with the option `--with-x`. 1948*9b92b1d3SBarry Smith 1949*9b92b1d3SBarry Smith### The program seems to use more and more memory as it runs, even though you don't think you are allocating more memory 1950*9b92b1d3SBarry Smith 1951*9b92b1d3SBarry SmithSome of the following may be occurring: 1952*9b92b1d3SBarry Smith 1953*9b92b1d3SBarry Smith- You are creating new PETSc objects but never freeing them. 1954*9b92b1d3SBarry Smith- There is a memory leak in PETSc or your code. 1955*9b92b1d3SBarry Smith- Something much more subtle: (if you are using Fortran). When you declare a large array 1956*9b92b1d3SBarry Smith in Fortran, the operating system does not allocate all the memory pages for that array 1957*9b92b1d3SBarry Smith until you start using the different locations in the array. Thus, in a code, if at each 1958*9b92b1d3SBarry Smith step you start using later values in the array your virtual memory usage will "continue" 1959*9b92b1d3SBarry Smith to increase as measured by `ps` or `top`. 1960*9b92b1d3SBarry Smith- You are running with the `-log`, `-log_mpe`, or `-log_all` option. With these 1961*9b92b1d3SBarry Smith options, a great deal of logging information is stored in memory until the conclusion of 1962*9b92b1d3SBarry Smith the run. 1963*9b92b1d3SBarry Smith- You are linking with the MPI profiling libraries; these cause logging of all MPI 1964*9b92b1d3SBarry Smith activities. Another symptom is at the conclusion of the run it may print some message 1965*9b92b1d3SBarry Smith about writing log files. 1966*9b92b1d3SBarry Smith 1967*9b92b1d3SBarry SmithThe following may help: 1968*9b92b1d3SBarry Smith 1969*9b92b1d3SBarry Smith- Run with the `-malloc_debug` option and `-malloc_view`. Or use `PetscMallocDump()` 1970*9b92b1d3SBarry Smith and `PetscMallocView()` sprinkled about your code to track memory that is allocated 1971*9b92b1d3SBarry Smith and not later freed. Use the commands `PetscMallocGetCurrentUsage()` and 1972*9b92b1d3SBarry Smith `PetscMemoryGetCurrentUsage()` to monitor memory allocated and 1973*9b92b1d3SBarry Smith `PetscMallocGetMaximumUsage()` and `PetscMemoryGetMaximumUsage()` for total memory 1974*9b92b1d3SBarry Smith used as the code progresses. 1975*9b92b1d3SBarry Smith- This is just the way Unix works and is harmless. 1976*9b92b1d3SBarry Smith- Do not use the `-log`, `-log_mpe`, or `-log_all` option, or use 1977*9b92b1d3SBarry Smith `PetscLogEventDeactivate()` or `PetscLogEventDeactivateClass()` to turn off logging of 1978*9b92b1d3SBarry Smith specific events. 1979*9b92b1d3SBarry Smith- Make sure you do not link with the MPI profiling libraries. 1980*9b92b1d3SBarry Smith 1981*9b92b1d3SBarry Smith### When calling `MatPartitioningApply()` you get a message "Error! Key 16615 Not Found" 1982*9b92b1d3SBarry Smith 1983*9b92b1d3SBarry SmithThe graph of the matrix you are using is not symmetric. You must use symmetric matrices 1984*9b92b1d3SBarry Smithfor partitioning. 1985*9b92b1d3SBarry Smith 1986*9b92b1d3SBarry Smith### With GMRES at restart the second residual norm printed does not match the first 1987*9b92b1d3SBarry Smith 1988*9b92b1d3SBarry SmithI.e. 1989*9b92b1d3SBarry Smith 1990*9b92b1d3SBarry Smith```text 1991*9b92b1d3SBarry Smith26 KSP Residual norm 3.421544615851e-04 1992*9b92b1d3SBarry Smith27 KSP Residual norm 2.973675659493e-04 1993*9b92b1d3SBarry Smith28 KSP Residual norm 2.588642948270e-04 1994*9b92b1d3SBarry Smith29 KSP Residual norm 2.268190747349e-04 1995*9b92b1d3SBarry Smith30 KSP Residual norm 1.977245964368e-04 1996*9b92b1d3SBarry Smith30 KSP Residual norm 1.994426291979e-04 <----- At restart the residual norm is printed a second time 1997*9b92b1d3SBarry Smith``` 1998*9b92b1d3SBarry Smith 1999*9b92b1d3SBarry SmithThis is actually not surprising! GMRES computes the norm of the residual at each iteration 2000*9b92b1d3SBarry Smithvia a recurrence relation between the norms of the residuals at the previous iterations 2001*9b92b1d3SBarry Smithand quantities computed at the current iteration. It does not compute it via directly 2002*9b92b1d3SBarry Smith$|| b - A x^{n} ||$. 2003*9b92b1d3SBarry Smith 2004*9b92b1d3SBarry SmithSometimes, especially with an ill-conditioned matrix, or computation of the matrix-vector 2005*9b92b1d3SBarry Smithproduct via differencing, the residual norms computed by GMRES start to "drift" from the 2006*9b92b1d3SBarry Smithcorrect values. At the restart, we compute the residual norm directly, hence the "strange 2007*9b92b1d3SBarry Smithstuff," the difference printed. The drifting, if it remains small, is harmless (doesn't 2008*9b92b1d3SBarry Smithaffect the accuracy of the solution that GMRES computes). 2009*9b92b1d3SBarry Smith 2010*9b92b1d3SBarry SmithIf you use a more powerful preconditioner the drift will often be smaller and less 2011*9b92b1d3SBarry Smithnoticeable. Of if you are running matrix-free you may need to tune the matrix-free 2012*9b92b1d3SBarry Smithparameters. 2013*9b92b1d3SBarry Smith 2014*9b92b1d3SBarry Smith### Why do some Krylov methods seem to print two residual norms per iteration? 2015*9b92b1d3SBarry Smith 2016*9b92b1d3SBarry SmithI.e. 2017*9b92b1d3SBarry Smith 2018*9b92b1d3SBarry Smith```text 2019*9b92b1d3SBarry Smith1198 KSP Residual norm 1.366052062216e-04 2020*9b92b1d3SBarry Smith1198 KSP Residual norm 1.931875025549e-04 2021*9b92b1d3SBarry Smith1199 KSP Residual norm 1.366026406067e-04 2022*9b92b1d3SBarry Smith1199 KSP Residual norm 1.931819426344e-04 2023*9b92b1d3SBarry Smith``` 2024*9b92b1d3SBarry Smith 2025*9b92b1d3SBarry SmithSome Krylov methods, for example `KSPTFQMR`, actually have a "sub-iteration" of size 2 2026*9b92b1d3SBarry Smithinside the loop. Each of the two substeps has its own matrix vector product and 2027*9b92b1d3SBarry Smithapplication of the preconditioner and updates the residual approximations. This is why you 2028*9b92b1d3SBarry Smithget this "funny" output where it looks like there are two residual norms per 2029*9b92b1d3SBarry Smithiteration. You can also think of it as twice as many iterations. 2030*9b92b1d3SBarry Smith 2031*9b92b1d3SBarry Smith### Unable to locate PETSc dynamic library `libpetsc` 2032*9b92b1d3SBarry Smith 2033*9b92b1d3SBarry SmithWhen using DYNAMIC libraries - the libraries cannot be moved after they are 2034*9b92b1d3SBarry Smithinstalled. This could also happen on clusters - where the paths are different on the (run) 2035*9b92b1d3SBarry Smithnodes - than on the (compile) front-end. **Do not use dynamic libraries & shared 2036*9b92b1d3SBarry Smithlibraries**. Run `configure` with 2037*9b92b1d3SBarry Smith`--with-shared-libraries=0 --with-dynamic-loading=0`. 2038*9b92b1d3SBarry Smith 2039*9b92b1d3SBarry Smith:::{important} 2040*9b92b1d3SBarry SmithThis option has been removed in petsc v3.5 2041*9b92b1d3SBarry Smith::: 2042*9b92b1d3SBarry Smith 2043*9b92b1d3SBarry Smith### How do I determine what update to PETSc broke my code? 2044*9b92b1d3SBarry Smith 2045*9b92b1d3SBarry SmithIf at some point (in PETSc code history) you had a working code - but the latest PETSc 2046*9b92b1d3SBarry Smithcode broke it, its possible to determine the PETSc code change that might have caused this 2047*9b92b1d3SBarry Smithbehavior. This is achieved by: 2048*9b92b1d3SBarry Smith 2049*9b92b1d3SBarry Smith- Using Git to access PETSc sources 2050*9b92b1d3SBarry Smith- Knowing the Git commit for the known working version of PETSc 2051*9b92b1d3SBarry Smith- Knowing the Git commit for the known broken version of PETSc 2052*9b92b1d3SBarry Smith- Using the [bisect](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html) 2053*9b92b1d3SBarry Smith functionality of Git 2054*9b92b1d3SBarry Smith 2055*9b92b1d3SBarry SmithGit bisect can be done as follows: 2056*9b92b1d3SBarry Smith 2057*9b92b1d3SBarry Smith1. Get PETSc development (main branch in git) sources 2058*9b92b1d3SBarry Smith 2059*9b92b1d3SBarry Smith ```console 2060*9b92b1d3SBarry Smith $ git clone https://gitlab.com/petsc/petsc.git 2061*9b92b1d3SBarry Smith ``` 2062*9b92b1d3SBarry Smith 2063*9b92b1d3SBarry Smith2. Find the good and bad markers to start the bisection process. This can be done either 2064*9b92b1d3SBarry Smith by checking `git log` or `gitk` or <https://gitlab.com/petsc/petsc> or the web 2065*9b92b1d3SBarry Smith history of petsc-release clones. Lets say the known bad commit is 21af4baa815c and 2066*9b92b1d3SBarry Smith known good commit is 5ae5ab319844. 2067*9b92b1d3SBarry Smith 2068*9b92b1d3SBarry Smith3. Start the bisection process with these known revisions. Build PETSc, and test your code 2069*9b92b1d3SBarry Smith to confirm known good/bad behavior: 2070*9b92b1d3SBarry Smith 2071*9b92b1d3SBarry Smith ```console 2072*9b92b1d3SBarry Smith $ git bisect start 21af4baa815c 5ae5ab319844 2073*9b92b1d3SBarry Smith ``` 2074*9b92b1d3SBarry Smith 2075*9b92b1d3SBarry Smith build/test, perhaps discover that this new state is bad 2076*9b92b1d3SBarry Smith 2077*9b92b1d3SBarry Smith ```console 2078*9b92b1d3SBarry Smith $ git bisect bad 2079*9b92b1d3SBarry Smith ``` 2080*9b92b1d3SBarry Smith 2081*9b92b1d3SBarry Smith build/test, perhaps discover that this state is good 2082*9b92b1d3SBarry Smith 2083*9b92b1d3SBarry Smith ```console 2084*9b92b1d3SBarry Smith $ git bisect good 2085*9b92b1d3SBarry Smith ``` 2086*9b92b1d3SBarry Smith 2087*9b92b1d3SBarry Smith Now until done - keep bisecting, building PETSc, and testing your code with it and 2088*9b92b1d3SBarry Smith determine if the code is working or not. After something like 5-15 iterations, `git 2089*9b92b1d3SBarry Smith bisect` will pin-point the exact code change that resulted in the difference in 2090*9b92b1d3SBarry Smith application behavior. 2091*9b92b1d3SBarry Smith 2092*9b92b1d3SBarry Smith:::{tip} 2093*9b92b1d3SBarry SmithSee [git-bisect(1)](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html) and the 2094*9b92b1d3SBarry Smith[debugging section of the Git Book](https://git-scm.com/book/en/Git-Tools-Debugging-with-Git) for more debugging tips. 2095*9b92b1d3SBarry Smith::: 2096*9b92b1d3SBarry Smith 2097*9b92b1d3SBarry Smith### How to fix the error "PMIX Error: error in file gds_ds12_lock_pthread.c"? 2098*9b92b1d3SBarry Smith 2099*9b92b1d3SBarry SmithThis seems to be an error when using Open MPI and OpenBLAS with threads (or perhaps other 2100*9b92b1d3SBarry Smithpackages that use threads). 2101*9b92b1d3SBarry Smith 2102*9b92b1d3SBarry Smith```console 2103*9b92b1d3SBarry Smith$ export PMIX_MCA_gds=hash 2104*9b92b1d3SBarry Smith``` 2105*9b92b1d3SBarry Smith 2106*9b92b1d3SBarry SmithShould resolve the problem. 2107*9b92b1d3SBarry Smith 2108*9b92b1d3SBarry Smith______________________________________________________________________ 2109*9b92b1d3SBarry Smith 2110*9b92b1d3SBarry Smith(doc_faq_sharedlibs)= 2111*9b92b1d3SBarry Smith 2112*9b92b1d3SBarry Smith## Shared Libraries 2113*9b92b1d3SBarry Smith 2114*9b92b1d3SBarry Smith### Can I install PETSc libraries as shared libraries? 2115*9b92b1d3SBarry Smith 2116*9b92b1d3SBarry SmithYes. Use 2117*9b92b1d3SBarry Smith 2118*9b92b1d3SBarry Smith```console 2119*9b92b1d3SBarry Smith$ ./configure --with-shared-libraries 2120*9b92b1d3SBarry Smith``` 2121*9b92b1d3SBarry Smith 2122*9b92b1d3SBarry Smith### Why should I use shared libraries? 2123*9b92b1d3SBarry Smith 2124*9b92b1d3SBarry SmithWhen you link to shared libraries, the function symbols from the shared libraries are not 2125*9b92b1d3SBarry Smithcopied in the executable. This way the size of the executable is considerably smaller than 2126*9b92b1d3SBarry Smithwhen using regular libraries. This helps in a couple of ways: 2127*9b92b1d3SBarry Smith 2128*9b92b1d3SBarry Smith- Saves disk space when more than one executable is created 2129*9b92b1d3SBarry Smith- Improves the compile time immensely, because the compiler has to write a much smaller 2130*9b92b1d3SBarry Smith file (executable) to the disk. 2131*9b92b1d3SBarry Smith 2132*9b92b1d3SBarry Smith### How do I link to the PETSc shared libraries? 2133*9b92b1d3SBarry Smith 2134*9b92b1d3SBarry SmithBy default, the compiler should pick up the shared libraries instead of the regular 2135*9b92b1d3SBarry Smithones. Nothing special should be done for this. 2136*9b92b1d3SBarry Smith 2137*9b92b1d3SBarry Smith### What if I want to link to the regular `.a` library files? 2138*9b92b1d3SBarry Smith 2139*9b92b1d3SBarry SmithYou must run `configure` with the option `--with-shared-libraries=0` (you can use a 2140*9b92b1d3SBarry Smithdifferent `$PETSC_ARCH` for this build so you can easily switch between the two). 2141*9b92b1d3SBarry Smith 2142*9b92b1d3SBarry Smith### What do I do if I want to move my executable to a different machine? 2143*9b92b1d3SBarry Smith 2144*9b92b1d3SBarry SmithYou would also need to have access to the shared libraries on this new machine. The other 2145*9b92b1d3SBarry Smithalternative is to build the executable without shared libraries by first deleting the 2146*9b92b1d3SBarry Smithshared libraries, and then creating the executable. 2147*9b92b1d3SBarry Smith 2148*9b92b1d3SBarry Smith```{bibliography} /petsc.bib 2149*9b92b1d3SBarry Smith:filter: docname in docnames 2150*9b92b1d3SBarry Smith``` 2151