| #
324c91e4
|
| 17-Dec-2013 |
Peter Brune <brune@mcs.anl.gov> |
Merge branch 'madams/gamg-destroy' into prbrune/pcgamg-classicalinterpolationstrategies
|
| #
578f55a3
|
| 17-Dec-2013 |
Peter Brune <brune@mcs.anl.gov> |
Merge branch 'master' into madams/gamg-destroy
Conflicts: src/ksp/pc/impls/gamg/gamg.c
|
| #
3d7bc6b7
|
| 15-Dec-2013 |
Matthew G. Knepley <knepley@gmail.com> |
Merge branch 'master' into knepley/feature-plex-refine-hex
* master: (68 commits) DMPlex: More fixup from bad rebase, moved to DMGet/SetCoordinateSection() DM: Added DMGet/SetCoordinateSection() Fuckup: Fix for merge that is still untangling bad rebase from Oct. 17th 2013 - Fixed const in declarations - Fixed merge from PetscFE - Fixed 2 mallocs DMDA: Missing header - Damn rebase DMPlex ex3: Added 2D Q_1 and 3D P_2 tests DMDA: Added DMDAProjectFunction() and DMDAComputeL2Diff() DMDA: Fixed 2D geometry - Should reuse DMPlex routines DMDA: Fix bug in 2D closure operation for cells DMDA: Make closure operations also return the size DMDA: Changed DMDACreateSection() to emulate DMPlexCreateSection() DMDA: Added functions which emulate DMPlex functionality DMDA: Now DMDAGetNumCells() returns the cells in each direction as well PetscFE: Added support for tensor product cells in PetscDualSpace_Lagrange PetscFE: Added tensor product polynomial spaces example fix /sandbox/petsc/petsc.clone-3/include/petscmath.h(260): error: identifier "PETSC_CXX_STATIC_INLINE" is undefined PETSC_STATIC_INLINE PetscReal PetscAbsScalar(PetscScalar a) {return a < 0.0 ? -a : a;} ^ remove warning in fun3d example "user.F", line 1256: warning: In-place macro substitution leaves line truncated "user.F", line 1259: warning: In-place macro substitution leaves line truncated "user.F", line 1262: warning: In-place macro substitution leaves line truncated /Users/petsc/petsc.clone-2/src/ksp/ksp/examples/tutorials/ex56.c:59: warning: comparison between signed and unsigned integer expressions fix examples for portability fixed to example outputs fix output for a few examples changed due to changes to -mat_view and not using -snes_monitor_short ...
Conflicts: src/dm/impls/plex/plexrefine.c
|
| #
9bfdf3ca
|
| 11-Dec-2013 |
Barry Smith <bsmith@mcs.anl.gov> |
Merge branch 'barry/update-xxxviewfromoptions'
|
| #
edbbd480
|
| 10-Dec-2013 |
Barry Smith <bsmith@mcs.anl.gov> |
Merge branch 'master' into barry/xcode
|
| #
bda0d2a0
|
| 04-Dec-2013 |
Jed Brown <jedbrown@mcs.anl.gov> |
VecAssembly BTS: VEC_SUBSET_OFF_PROC_ENTRIES elides global synchronization
SUBSET is currently a misnomer because the entries must currently match exactly. That will be fixed by checking an MPI_Status.
|
| #
ce1779c8
|
| 26-Nov-2013 |
Barry Smith <bsmith@mcs.anl.gov> |
introduce XXXViewFromOptions() and use consistently when possible
|
| #
c4fbd833
|
| 08-Nov-2013 |
Matthew G. Knepley <knepley@gmail.com> |
Merge branch 'master' into knepley/feature-fem-dgspace
* master: (593 commits) Bib: Added Top500 and fixed entry fun3d: update PetscMallocValidate() usage bib: rename ref with duplicate key Webpage: Corrected '-dm_mat_type cusp' to 'dm_mat_type aijcusp'. Allow calling MatGetBlockSize[s]() before matrix preallocation SNESLINESEARCHBT: Set the norms when exiting early due to negligible step. SNESQN: only monitor real part of dot product (fails with C++ complex) Corrected the #include statement in the man pages for the PetscObjectComposedDataSet/Getwhatever routines. DMPlex: Added doc for DMPlexGetHybridBounds() winzip: detect if winzip is used to extract petsc.tar.gz and error out. configure: remove self.archIndependent as its not being used. [it is used to create externalpackages/package/arch dir which is unused] configure: fix packages using package-dir/PETSC_ARCH as out-of-source-build location Perhaps this should be changed to package-dir/build? configure: remove dead code previously used to download BuildSystem configure: check if compilerDefines [and compilerFixes] exist in framework before using them configure: save/restore reconfigure.py when --with-clean is used configure: add --with-clean option to delete buildfiles/externalpackages in PETSC_DIR/PETSC_ARCH and potential externalpackages in --with-external-package-dir/PETSC_ARCH configure: With --with-externalpackages-dir=dir store/build packages in dir/PETSC_ARCH Add MatSeqSBAIJSetPreallocationCSR() configure: switch to using PETSC_DIR/PETSC_ARCH/externalpackages by updating package.py to use externalpackages.py as the externalPackagesDirProvider configure: move dead code configureExternalPackagesDir() from petscdir.py to externalpackagesdir.py Also set PETSC_DIR/PETSC_ARCH/externalpackages as the externalpackagesdir ...
|
| #
5c448079
|
| 02-Nov-2013 |
Peter Brune <brune@mcs.anl.gov> |
Merge branch 'prbrune/sf-sfbasicops' into prbrune/mat-matcolor
|
| #
7737a228
|
| 31-Oct-2013 |
Barry Smith <bsmith@mcs.anl.gov> |
Merge branch 'master' into barry/saws
Conflicts: src/ksp/pc/impls/gamg/gamg.c src/sys/classes/viewer/impls/ams/ams.c src/sys/objects/pinit.c
|
| #
170be9ae
|
| 16-Oct-2013 |
Matthew G. Knepley <knepley@gmail.com> |
Merge branch 'master' into knepley/feature-dmda-section
* master: (397 commits) PetscSynchronizedFGets: fix deadlock at EOF Compiler: Fix warnings from MPI impls which do not initialize outputs Increase patchlevel to 3.4.3 SNES: Now work vectors come from the DM SNES ex62: Remove code generation SNES: Move setup involving snes->vec_sol from SNESSetUp() to SNESolve() Compiler: Fix warnings from MPI impls which do not initialize outputs DMPlex ex7: Added missing test output SNES ex52: Removed old Jacobian stuff, and fixed call to DMPlexProjectFunction() - Fixed calls for new element handling SNES ex12: Added a performance profiling mode DMPlex ex8: Fixed leak PetscSection: Must reset the section when changing the number of fields DMPlex ex7: Fixed test output - Corrected orientations in interpolation DMPlex ex1: Fixed test output - Uniform refinement for quads changed DMPlex: Fix overagressive checks PC: Removed support graph PC configure: both downloadonWindows and worksonWindows refer to MS compilers. Also fix isWindows() -> isWindows(CC). MatXAIJSetPreallocation: use array[] notation to help out Fortran __float128: when blas/lapack is not found instruct using f2cblaslapack Configure: Package.downloadonWindows is supposed to mean that Windows compilers work. not Cygwin - Also changed doc for Package.worksonWindows, which does refer to Cygwin ...
Conflicts: src/dm/impls/da/dalocal.c
|
| #
0a58a46c
|
| 10-Oct-2013 |
Matthew G. Knepley <knepley@gmail.com> |
Merge branch 'knepley/fix-plex-examples' into knepley/feature-plex-refine-3d
* knepley/fix-plex-examples: (36 commits) DMPlex ex7: Added missing test output SNES ex52: Removed old Jacobian stuff, and fixed call to DMPlexProjectFunction() - Fixed calls for new element handling SNES ex12: Added a performance profiling mode DMPlex ex8: Fixed leak PetscSection: Must reset the section when changing the number of fields DMPlex ex7: Fixed test output - Corrected orientations in interpolation DMPlex ex1: Fixed test output - Uniform refinement for quads changed DMPlex: Fix overagressive checks configure: both downloadonWindows and worksonWindows refer to MS compilers. Also fix isWindows() -> isWindows(CC). MatXAIJSetPreallocation: use array[] notation to help out Fortran __float128: when blas/lapack is not found instruct using f2cblaslapack Configure: Package.downloadonWindows is supposed to mean that Windows compilers work. not Cygwin - Also changed doc for Package.worksonWindows, which does refer to Cygwin SNES example: fix bad merge config: only define DATAFILESPATH for non-null value config: define lang-specific macros in petscconf.h, choose in petscsys.h Mat: add MATOP_ values for missing functions DMPlex: use VECSTANDARD for coordinates PetscSF: Fixed PetscSFCreateEmbeddedSF() - We were using sf->nleaves for the leaf buffer, which is completely wrong. We have to use the largest thing in sf->mine[] replaced ISCreateGeneral in CompositeDM with ISCreateStride. MatNest calls ISStrideGetInfo on these vectors, which was causing errors DMPlex: Fix completely broken code in PetscSFCreateRemoteOffsets() - Non-broken example was in PetscSFDistributeSection() ...
Conflicts: src/dm/impls/plex/plex.c
|
| #
603759cf
|
| 09-Oct-2013 |
Karl Rupp <rupp@iue.tuwien.ac.at> |
Merge branch 'paulmullowney/cusp-vector-scatter-with-fix'
* paulmullowney/cusp-vector-scatter-with-fix: CUDA/CUSP: Implementation of Sequential to Sequential Vector Scatters
|
| #
25ec7418
|
| 27-Sep-2013 |
Karl Rupp <rupp@iue.tuwien.ac.at> |
Original commit by Paul Mullowney (augmented by PETSC_HAVE_CUSP include guards by Karl Rupp): Implementation of Sequential to Sequential VecScatters on the GPU.
In this commit, I've built a working prototype for sequential to sequential vector scatters for CUSP vectors. I've also reorganized the parallel to parallel vector scatters in a new infrastructure.
The design of the code is as follows. Currently, I distinguish between PtoP (parallel to parallel) VecScatters and StoS (sequential to sequential) VecScatters. In cuspvecimpl.h, a high level struct called PetscCUSPIndices stores a void * pointer and enumerated type describing whether the scatter is PtoP or StoS (and later PtoS and perhaps StoP). The actual type of scatter information built and stored in the void * pointer depends on the calling code.
For instance, in vscat.c, there are the methods VecScatterBegin_SGToSG, VecScatterBegin_SGToSS, ... These methods build the indices for the sequential scatter on demand via the function VecScatterCUSPIndicesCreate_StoS. Those indices are stored in a struct of type _p_VecScatterCUSPIndices_StoS. The appropriate data in that struct is filled in depending on whether the input and output are general or strided. Some additional metadata is also stored.
A similar routine is called from vpscat.c for building the PtoP vector scatter. The structure of this hasn't changed although the constructing API method is now VecScatterCUSPIndicesCreate_PtoP (it used to be PetscCUSPIndicesCreate).
A single destructor function is given for the PtoP and StoS scatters: VecScatterCUSPIndicesDestroy. It used to be PetscCUSPIndicesDestroy.
I've made all of these methods PETSC_INTERN. I can't see any reason to expose these methods to the user.
All of the source code for doing the sequential scatter computation on the GPU is moved into a file vecscattercusp.cu (src/vec/vec/impls/seq/seqcusp). I haven't moved the VecCUSPCopyToGPUSome and similar methods into this file, although perhaps they belong there.
All scatter types are implemented, including FORWARD and REVERSE as well as INSERT, ADD, and MAX.
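The layout described in this commit message (a high-level struct holding a `void *` plus an enumerated scatter type, with one destructor for both PtoP and StoS) can be sketched in plain C as follows. This is only an illustration: every name below (`ScatterKind`, `IndicesStoS`, `IndicesPtoP`, `ScatterIndices`, and the create/destroy functions) is hypothetical, and the fields are simplified assumptions, not the actual contents of `cuspvecimpl.h`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the types the commit message describes. */
typedef enum { SCATTER_STOS, SCATTER_PTOP } ScatterKind;

/* Assumed payload for a strided sequential-to-sequential scatter. */
typedef struct {
  int n;
  int fromFirst, fromStep;
  int toFirst, toStep;
} IndicesStoS;

/* Assumed payload for a parallel-to-parallel scatter. */
typedef struct {
  int  n;
  int *sendIndices; /* owned copy of the caller's indices */
} IndicesPtoP;

/* High-level wrapper: an opaque pointer plus a tag saying what it points to. */
typedef struct {
  void       *scatter;
  ScatterKind kind;
} ScatterIndices;

ScatterIndices *scatter_indices_create_stos(int n, int fromFirst, int fromStep,
                                            int toFirst, int toStep)
{
  assert(n >= 0);
  ScatterIndices *ci = malloc(sizeof(*ci));
  IndicesStoS    *s  = malloc(sizeof(*s));
  if (!ci || !s) { free(ci); free(s); return NULL; }
  s->n = n;
  s->fromFirst = fromFirst; s->fromStep = fromStep;
  s->toFirst   = toFirst;   s->toStep   = toStep;
  ci->scatter = s;
  ci->kind    = SCATTER_STOS;
  return ci;
}

ScatterIndices *scatter_indices_create_ptop(int n, const int *sendIndices)
{
  assert(n >= 0);
  ScatterIndices *ci   = malloc(sizeof(*ci));
  IndicesPtoP    *p    = malloc(sizeof(*p));
  int            *copy = malloc((n > 0 ? (size_t)n : 1) * sizeof(*copy));
  if (!ci || !p || !copy) { free(ci); free(p); free(copy); return NULL; }
  if (n > 0) memcpy(copy, sendIndices, (size_t)n * sizeof(*copy));
  p->n = n;
  p->sendIndices = copy;
  ci->scatter = p;
  ci->kind    = SCATTER_PTOP;
  return ci;
}

/* Single destructor for both kinds: the tag selects the payload cleanup. */
void scatter_indices_destroy(ScatterIndices **ci)
{
  if (!ci || !*ci) return;
  if ((*ci)->kind == SCATTER_PTOP) {
    free(((IndicesPtoP *)(*ci)->scatter)->sendIndices);
  }
  free((*ci)->scatter);
  free(*ci);
  *ci = NULL;
}
```

Keeping the tag next to the opaque pointer is what lets the calling code decide which payload to build while a single destroy routine serves both kinds.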
|
| #
256ff83f
|
| 11-Sep-2013 |
Barry Smith <bsmith@mcs.anl.gov> |
Merge branch 'master' into barry/wirth-fusion-materials
Conflicts: src/ts/examples/tutorials/advection-diffusion-reaction/ex10.c
|
| #
cc85fe4d
|
| 04-Sep-2013 |
Barry Smith <bsmith@mcs.anl.gov> |
Merge branch 'barry/dmvecmattypes' into barry/saws
Needed to work with version of PETSc that did not have constant calls to VecSetFromOptions() etc
Conflicts: src/ksp/ksp/interface/ams/kspams.c src/snes/impls/composite/snescomposite.c src/snes/impls/gs/snesgs.c src/snes/impls/nasm/nasm.c src/snes/impls/ngmres/snesngmres.c
|
| #
8117f98b
|
| 28-Aug-2013 |
Matthew G. Knepley <knepley@gmail.com> |
Merge branch 'master' into knepley/feature-dt-fem
* master: (211 commits) Mat ex170: Comments VTK: Small fix to error message (.vts to .vtu) VTK: Small fix to error message Fixed bib entries Bib: Updates AO: fix erroneous processing of -ao_view and factor into AOViewFromOptions doc: fix named argument in {Vec,Mat,DM}ViewFromOptions Sys: add PetscDataTypeFromString() and test code Mat: Should say that it has a nullspace in MatView() parms: update tarball with fix for namespace conflict with metis fix citation 'Golub_Varga_1961' parmetis: update tarball to parmetis-4.0.2-p5 which fixes an install issue with cygwin Sys Logging: revert parent traversal fixed hdf5.py so that if self.libraries.compression is None the code still runs correctly DMDA: fix bad cast of DM_DA to PetscObject MatClique: follow DistMultiVec API changes MatClique: remove unused variables config cmakeboot: add C++ flags any time compiler is available config OpenMP: check for C++ flag any time the compiler is available replaced all left-over uses of a single PetscMalloc() to allocated multiple arrays: replaced with PetscMallocN() The only ones left are when the second array is set into the first array and one ugly usage in the MUMPS interface that cannot be easily fixed ...
Conflicts: config/builder.py src/dm/impls/plex/plexgeometry.c
|
| #
459e96c1
|
| 28-Aug-2013 |
Matthew G. Knepley <knepley@gmail.com> |
Merge branch 'master' into knepley/feature-plex-refine-3d
* master: (273 commits) Mat ex170: Comments VTK: Small fix to error message (.vts to .vtu) VTK: Small fix to error message Fixed bib entries Bib: Updates AO: fix erroneous processing of -ao_view and factor into AOViewFromOptions doc: fix named argument in {Vec,Mat,DM}ViewFromOptions Sys: add PetscDataTypeFromString() and test code Mat: Should say that it has a nullspace in MatView() parms: update tarball with fix for namespace conflict with metis fix citation 'Golub_Varga_1961' parmetis: update tarball to parmetis-4.0.2-p5 which fixes an install issue with cygwin Sys Logging: revert parent traversal fixed hdf5.py so that if self.libraries.compression is None the code still runs correctly DMDA: fix bad cast of DM_DA to PetscObject MatClique: follow DistMultiVec API changes MatClique: remove unused variables config cmakeboot: add C++ flags any time compiler is available config OpenMP: check for C++ flag any time the compiler is available replaced all left-over uses of a single PetscMalloc() to allocated multiple arrays: replaced with PetscMallocN() The only ones left are when the second array is set into the first array and one ugly usage in the MUMPS interface that cannot be easily fixed ...
Conflicts: include/petscdmplex.h
|
| #
c0c93d0e
|
| 28-Aug-2013 |
Matthew G. Knepley <knepley@gmail.com> |
Merge branch 'master' into knepley/feature-dmda-section
* master: (287 commits) Mat ex170: Comments VTK: Small fix to error message (.vts to .vtu) VTK: Small fix to error message Fixed bib entries Bib: Updates AO: fix erroneous processing of -ao_view and factor into AOViewFromOptions doc: fix named argument in {Vec,Mat,DM}ViewFromOptions Sys: add PetscDataTypeFromString() and test code Mat: Should say that it has a nullspace in MatView() parms: update tarball with fix for namespace conflict with metis fix citation 'Golub_Varga_1961' parmetis: update tarball to parmetis-4.0.2-p5 which fixes an install issue with cygwin Sys Logging: revert parent traversal fixed hdf5.py so that if self.libraries.compression is None the code still runs correctly DMDA: fix bad cast of DM_DA to PetscObject MatClique: follow DistMultiVec API changes MatClique: remove unused variables config cmakeboot: add C++ flags any time compiler is available config OpenMP: check for C++ flag any time the compiler is available replaced all left-over uses of a single PetscMalloc() to allocated multiple arrays: replaced with PetscMallocN() The only ones left are when the second array is set into the first array and one ugly usage in the MUMPS interface that cannot be easily fixed ...
|
| #
ea6bb0ab
|
| 28-Aug-2013 |
Matthew G. Knepley <knepley@gmail.com> |
Merge branch 'knepley/reordering'
* knepley/reordering: Mat ex170: Comments Mat ex170: Test for MatMult() using max instead of plus - Finds the number of connected components in parallel - Can still optimize better in parallel Mat: Added stuff to let me do (max, mult) algebra things for reordering - Added MatMultMax_SeqAIJ() and MatMultAddMax_SeqAIJ() - Added PetscSparseDenseMaxDot() Vec: Added VecUniqueEntries() - This is purely diagnostic, but I think it's useful for tests Sys: Added PetscSortRemoveDupsReal()
Conflicts: config/builder.py
|
| #
b0418fcf
|
| 25-Jul-2013 |
Stefano Zampini <stefano.zampini@gmail.com> |
Merge remote-tracking branch 'origin/master' into stefano_zampini/pcbddc-improvelocalsolvers
|
| #
8533652c
|
| 25-Jul-2013 |
Stefano Zampini <stefano.zampini@gmail.com> |
Merge remote-tracking branch 'origin/master' into stefano_zampini/pcbddc-mirrorsfix
|
| #
6daa6ed0
|
| 25-Jul-2013 |
Stefano Zampini <stefano.zampini@gmail.com> |
Merge remote-tracking branch 'origin/master' into stefano_zampini/pcbddc-constraintssetupimproved
|
| #
72cfe0ad
|
| 23-Jul-2013 |
Karl Rupp <rupp@iue.tuwien.ac.at> |
Merge branch 'paulmullowney/txpetscgpu-package-removal'
|
| #
b06137fd
|
| 27-Jun-2013 |
Paul Mullowney <paulm@txcorp.com> |
Removing TXPETSCGPU from veccusp and mpiaijcusparse
In this next step of removing TXPETSCGPU, the host-device and device-host messaging code has been significantly simplified. In particular, all methods VecCUSPCopyToGPU/FromGPU now use a cudaMemcpyAsync with a stream (and a stream synchronize()). This never hurts you. Moreover, it can help you in the case of the multi-GPU SpMV as this data transfer will overlap with the MatMult kernel. The more significant change comes in VecCUSPCopyToGPUSome and VecCUSPCopyFromGPUSome. In this code, the data transfer now moves the smallest contiguous set of vector data containing ALL the indices in a single asynchronous data transfer. Then, the stream containing the data transfer is synchronized (not the entire device). While this can be wasteful in terms of messaging too much data, it has shown the best scalability performance across a wide range of matrices. Lastly, the simplicity of the code is a significant advantage over the old way of doing the data transfer. Some old code in these methods is "if 0"-ed out for reference and will be cleaned up later. One final optimization in the vector code involves registering the host buffer as page-locked, which is done in VecCUSPAllocateCheck. Then, the buffer must be unregistered at VecDestroy_SeqCUSP. This shows a nice speedup in the data transfer for a parallel MatMult.
Also in this commit, I am removing the TXPETSCGPU dependence from the mpiaijcusparse class; it now depends only on CUDA. In order for the same stream to be used in the MatMult and MatMultAdd (necessary for an optimal Multi-GPU SpMV), the stream is built in the mpiaijcusparse and then passed in the seqaijcusparse data structure via a new method (MatCUSPARSESetStream). A similar method is added for the CUSPARSE library handle (context) as I think the stream needs to be attached to a particular context to work properly. When running in parallel on multiple GPUs, the references to the handle in the seqaijcusparse are cleared from the mpiaijcusparse classes with the method MatCUSPARSEClearHandle. Then, the mpiaijcusparse class deletes the handle.
One other non-trivial change was made to the seqaijcusparse. The alpha and beta parameters to the SpMV are now device data which is owned by the Mat_SEQAIJCUSPARSEMultStruct structure. This enables slightly better multi-GPU performance as this data does not need to be copied to the GPU at each kernel launch.
Multi-GPU SpMV now works without TXPETSCGPU and the performance is recovered as tested on up to 4 GPUs. Code is valgrind clean and cuda-memcheck clean.
Results of tests have been modified to have 1 less digit of precision. This yields consistent results across different GPUs. Lastly, the parallel test is set to run on a different matrix (shallow_water1) so that the iteration actually converges.
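The transfer strategy this commit message describes for VecCUSPCopyToGPUSome (move the smallest contiguous range that contains all requested indices in a single copy) can be illustrated with a host-only C sketch. The names `Span`, `covering_span`, and `copy_span_to_device` are hypothetical, and `memcpy` merely stands in for the `cudaMemcpyAsync` plus stream synchronization mentioned above; this is not the PETSc implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type: a contiguous range [first, first + count). */
typedef struct {
  size_t first;
  size_t count;
} Span;

/* Smallest contiguous range that covers every entry of idx[0..n). */
Span covering_span(const size_t *idx, size_t n)
{
  Span s = {0, 0};
  if (n == 0) return s;
  size_t lo = idx[0], hi = idx[0];
  for (size_t i = 1; i < n; i++) {
    if (idx[i] < lo) lo = idx[i];
    if (idx[i] > hi) hi = idx[i];
  }
  s.first = lo;
  s.count = hi - lo + 1;
  return s;
}

/* Copy the covering range in one transfer. Entries inside the range that
   were not requested are copied too, which is the "wasteful" part the
   commit message accepts in exchange for a single transfer.
   memcpy is an assumed stand-in for an asynchronous device copy. */
Span copy_span_to_device(double *device, const double *host,
                         const size_t *idx, size_t n)
{
  Span s = covering_span(idx, n);
  if (s.count > 0) {
    memcpy(device + s.first, host + s.first, s.count * sizeof(double));
  }
  return s;
}
```

For indices {5, 2, 3}, the covering range is entries 2 through 5, so entry 4 is transferred even though it was not requested, while entries outside the range are left untouched.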
|