mpiaijcusparse.cu - OpenGrok history log for /petsc/src/mat/impls/aij/mpi/mpicusparse/mpiaijcusparse.cu

Revision	Date	Author	Comments
# 4d55d066	24-Feb-2020	Junchao Zhang <jczhang@mcs.anl.gov>	Delete out-of-date comments and do better overlap
# f6516afe	03-Feb-2020	Satish Balay <balay@mcs.anl.gov>	Merge branch 'rmills/bindtocpu-not-pintocpu' into 'master' Changed XXXPinToCPU() to XXXBindToCPU() to prevent confusion. See merge request petsc/petsc!2477
# b470e4b4	03-Feb-2020	Richard Tran Mills <rmills@rmills.org>	Changed XXXPinToCPU() to XXXBindToCPU() to prevent confusion. The reason for this change is that we already use the terminology "pinned" to refer to memory that is non-pageable, in the context of PetscSF as well as allocating host memory when GPUs are being employed.
# f3e33b7c	30-Oct-2019	Satish Balay <balay@mcs.anl.gov>	Merge branch 'jczhang/feature-cuda-error-string' into 'master' Map a cuda error code to its name and description See merge request petsc/petsc!2228
# 57d48284	30-Oct-2019	Junchao Zhang <jczhang@mcs.anl.gov>	Map a cuda error code to its name and description
# 040e670d	25-Sep-2019	Satish Balay <balay@mcs.anl.gov>	Merge branch 'karlrupp/fix-cuda-streams' into 'master' GPU: Fixed incorrect use of CUDA streams, SNES ex19 and ex56 now working with CUDA See merge request petsc/petsc!2091
# 17403302	24-Sep-2019	Karl Rupp <me@karlrupp.net>	CUDA: Fixed incorrect use of separate streams. This solves synchronization problems that have arisen due to the incorrect use of multiple CUDA streams for vector and matrix operations (without using proper synchronization mechanisms). In particular, SNES ex19 and ex56 now run reliably (no failure after 20+ reruns). Instead, the default stream (NULL pointer) is now used for all CUDA operations. I don't have performance comparisons at hand for the performance implications in this commit, but expect any changes to be small. Correctness first :-)
# 8da4f93b	23-Sep-2019	Satish Balay <balay@mcs.anl.gov>	Merge branch 'stefanozampini/gpu-bddc' into 'master' Improvements towards BDDC on GPUs See merge request petsc/petsc!2067
# 99acd6aa	22-Sep-2019	Stefano Zampini <stefano.zampini@gmail.com>	Fix compilation error for nvcc in optimized code with AVX-512 (march=native on my GPU workstation) for some reason, the host compiler fails with this error message /home/zampins/Devel/petsc/include/../src/mat/impls/aij/seq/aij.h(535): error: identifier "_mm512_reduce_add_pd" is undefined This optimized C kernel is not used in the GPU classes, so it is safe to skip its declaration
# 29ad97fd	07-Aug-2019	Karl Rupp <me@karlrupp.net>	Merge branch 'dalcinl/feature-math' [PR #1904] * dalcinl/feature-math: Math & PetscComplex: Various enhancements - Define PetscXXXScalar to PetscXXXReal for real scalar type - Add PetscCbrtReal(), PetscHypotReal(), and PetscAtan2Real() - Add PetscArgComplex() and PetscArgScalar() - Add PetscAtan{Real|Complex|Scalar}() - Add PetscA{sin|cos|tan}h{Real|Complex|Scalar}() - Docs: Petsc{Real|Imaginary}Part() return PetscReal - Define __fp16 constants to use "F" suffix (ie. single precision) - Fix PETSC_[SQRT_]MACHINE_EPSILON values for __fp16 PetscComplex: Remove PETSC_USE_CXX_COMPLEX_FLOAT_WORKAROUND - Move the C++ complex fixes to its own header file - Define PETSC_SKIP_CXX_COMPLEX_FIX to skip the C++ complex fixes
# 7afe75c1	06-Aug-2019	Karl Rupp <me@karlrupp.net>	Merge branch 'karlrupp/fix-cuda-MatSeqAIJCUSPARSEGenerateTransposeForMult' [PR #1948] * karlrupp/fix-cuda-MatSeqAIJCUSPARSEGenerateTransposeForMult: CUDA: Fixed issues in MatSeqAIJCUSPARSEGenerateTransposeForMult and MatMultTransposeAdd_SeqAIJCUSPARSE
# a3fdcf43	05-Aug-2019	Karl Rupp <me@karlrupp.net>	CUDA: Fixed issues in MatSeqAIJCUSPARSEGenerateTransposeForMult and MatMultTransposeAdd_SeqAIJCUSPARSE This is a cherry-pick of commits dde4751, 435e334, 1d884b8, 4e32a5a Thanks-to: Mark Adams <ma2325@columbia.edu>
# 53800007	05-Aug-2019	Karl Rupp <me@karlrupp.net>	CUDA: Skipping CXX complex fix. Should fix warnings obtained with newer math functions. This fix should be obsolete once the wrapper for GPU functionality is in place.
# b6a92dca	26-Jun-2019	BarryFSmith <bsmith@mcs.anl.gov>	Merged in barry/cuda-multigrid-test (pull request #1763) Various improvements for GPUs (mostly for performance and CUDA)
# fdc842d1	31-May-2019	Barry Smith <bsmith@mcs.anl.gov>	Various improvements for GPUs (mostly for performance and CUDA) 1) Add VecPinToCPU() for CUDA vector and matrices 2) Move initialization of cuBLAS to PetscInitialize() since it takes 1/2 second and distorts timing with -log_view 3) Add logging for DMCreateMatrix (for large meshes this is very large) 4) Add VecGet/RestoreArrayWrite() to prevent unneeded copies from GPU (only implemented so far for CUDA); added a small number of usages in the source so that snes tutorials ex19 does not do unneeded communication from the GPU 5) Automatically convert MAIJ matrices to AIJ for CUDA since they are not yet supported natively in PETSc's CUDA matrix implementation 6) Pinned objects should still use the CUDA/ViennaCL versions of Destroy to clean up the GPU stuff Commit-type: feature
# 613bfe33	02-Jun-2019	BarryFSmith <bsmith@mcs.anl.gov>	Merged in barry/update-collective-on (pull request #1744) Update the use of Collective on in the manual pages to reflect the new style
# d083f849	01-Jun-2019	Barry Smith <bsmith@mcs.anl.gov>	Update the use of Collective on in the manual pages to reflect the new style Commit-type: style-fix, documentation Thanks-to: Patrick Sanan <patrick.sanan@gmail.com>
# a041468a	06-Mar-2019	Lawrence Mitchell <lawrence@wence.uk>	Merge branch 'master' into wence/feature-patch-all-at-once
# 8b2e997c	24-Feb-2019	Karl Rupp <me@karlrupp.net>	Merge branch 'jczhang/fix-vecscatter-cuda/maint' into maint [PR #1388] * jczhang/fix-vecscatter-cuda/maint: CUDA vecscatter needs to take care of the ScatterMode argument
# 29302ad0	24-Feb-2019	Karl Rupp <me@karlrupp.net>	Merge branch 'jczhang/fix-vecscatter-cuda/maint' [PR #1388] * jczhang/fix-vecscatter-cuda/maint: CUDA vecscatter needs to take care of the ScatterMode argument
# a5873c6d	24-Feb-2019	Karl Rupp <me@karlrupp.net>	Merge branch 'jczhang/restore-error-check' [PR #1392] * jczhang/restore-error-check: Restore an error checking line in MatMultTranspose_MPIAIJCUSPARSE
# ccf5f80b	21-Feb-2019	Junchao Zhang <jczhang@mcs.anl.gov>	Restore the error checking code
# 959dcdf5	19-Feb-2019	Junchao Zhang <jczhang@mcs.anl.gov>	Add a ScatterMode arg in cuda vecscat to select to/from context The old code VecScatterInitializeForGPU() initializes the pointer (PetscCUDAIndices)&inctx->spptr) based on an input ScatterMode before VecScatterBegin() is called. If a vecscatter context is firstly used for a SCATTER_FORWARD, and secondly used for a SCATTER_REVERSE, there will be an error. Since in the second VecScatter, it uses out-of-date (PetscCUDAIndices)&inctx->spptr) The solution is "do not prematurely consider ScatterMode when building (PetscCUDAIndices)&inctx->spptr). Instead, select correct to/from until VecScatterBegin() is called"
# a5a49157	25-Oct-2018	Joseph Pusztay <josephpusztay@Josephs-MacBook-Pro.local>	Merge branch 'master' into jpusztay/feature-swarm-symplectic-example
# e901d7f7	25-Oct-2018	Joseph Pusztay <josephpusztay@Josephs-MacBook-Pro.local>	Merge branch 'master' into jpustay/feature-swarm-example