History log of /petsc/src/mat/impls/aij/mpi/mpicusparse/mpiaijcusparse.cu (Results 176 – 200 of 371)
Revision Date Author Comments
# 4d55d066 24-Feb-2020 Junchao Zhang <jczhang@mcs.anl.gov>

Delete out-of-date comments and do better overlap


# f6516afe 03-Feb-2020 Satish Balay <balay@mcs.anl.gov>

Merge branch 'rmills/bindtocpu-not-pintocpu' into 'master'

Changed XXXPinToCPU() to XXXBindToCPU() to prevent confusion.

See merge request petsc/petsc!2477


# b470e4b4 03-Feb-2020 Richard Tran Mills <rmills@rmills.org>

Changed XXXPinToCPU() to XXXBindToCPU() to prevent confusion.

The reason for this change is that we already use the terminology
"pinned" to refer to memory that is non-pageable, in the context of
PetscSF as well as allocating host memory when GPUs are being employed.


# f3e33b7c 30-Oct-2019 Satish Balay <balay@mcs.anl.gov>

Merge branch 'jczhang/feature-cuda-error-string' into 'master'

Map a cuda error code to its name and description

See merge request petsc/petsc!2228


# 57d48284 30-Oct-2019 Junchao Zhang <jczhang@mcs.anl.gov>

Map a cuda error code to its name and description


# 040e670d 25-Sep-2019 Satish Balay <balay@mcs.anl.gov>

Merge branch 'karlrupp/fix-cuda-streams' into 'master'

GPU: Fixed incorrect use of CUDA streams, SNES ex19 and ex56 now working with CUDA

See merge request petsc/petsc!2091


# 17403302 24-Sep-2019 Karl Rupp <me@karlrupp.net>

CUDA: Fixed incorrect use of separate streams.

This solves synchronization problems that arose from the incorrect use of multiple CUDA streams for vector and matrix operations (without proper synchronization mechanisms).
The default stream (NULL pointer) is now used for all CUDA operations instead.
In particular, SNES ex19 and ex56 now run reliably (no failure after 20+ reruns).
I don't have performance comparisons at hand for this commit, but expect any changes to be small.
Correctness first :-)


# 8da4f93b 23-Sep-2019 Satish Balay <balay@mcs.anl.gov>

Merge branch 'stefanozampini/gpu-bddc' into 'master'

Improvements towards BDDC on GPUs

See merge request petsc/petsc!2067


# 99acd6aa 22-Sep-2019 Stefano Zampini <stefano.zampini@gmail.com>

Fix compilation error for nvcc in optimized code with AVX-512 (march=native on my GPU workstation)

for some reason, the host compiler fails with this error message
/home/zampins/Devel/petsc/include/../src/mat/impls/aij/seq/aij.h(535): error: identifier "_mm512_reduce_add_pd" is undefined

This optimized C kernel is not used in the GPU classes, so it is safe to skip its declaration


# 29ad97fd 07-Aug-2019 Karl Rupp <me@karlrupp.net>

Merge branch 'dalcinl/feature-math' [PR #1904]

* dalcinl/feature-math:
Math & PetscComplex: Various enhancements
- Define PetscXXXScalar to PetscXXXReal for real scalar type
- Add PetscCbrtReal(), PetscHypotReal(), and PetscAtan2Real()
- Add PetscArgComplex() and PetscArgScalar()
- Add PetscAtan{Real|Complex|Scalar}()
- Add PetscA{sin|cos|tan}h{Real|Complex|Scalar}()
- Docs: Petsc{Real|Imaginary}Part() return PetscReal
- Define __fp16 constants to use "F" suffix (ie. single precision)
- Fix PETSC_[SQRT_]MACHINE_EPSILON values for __fp16

PetscComplex: Remove PETSC_USE_CXX_COMPLEX_FLOAT_WORKAROUND

- Move the C++ complex fixes to its own header file
- Define PETSC_SKIP_CXX_COMPLEX_FIX to skip the C++ complex fixes


# 7afe75c1 06-Aug-2019 Karl Rupp <me@karlrupp.net>

Merge branch 'karlrupp/fix-cuda-MatSeqAIJCUSPARSEGenerateTransposeForMult' [PR #1948]

* karlrupp/fix-cuda-MatSeqAIJCUSPARSEGenerateTransposeForMult:
CUDA: Fixed issues in MatSeqAIJCUSPARSEGenerateTransposeForMult and MatMultTransposeAdd_SeqAIJCUSPARSE


# a3fdcf43 05-Aug-2019 Karl Rupp <me@karlrupp.net>

CUDA: Fixed issues in MatSeqAIJCUSPARSEGenerateTransposeForMult and MatMultTransposeAdd_SeqAIJCUSPARSE

This is a cherry-pick of commits dde4751, 435e334, 1d884b8, 4e32a5a
Thanks-to: Mark Adams <ma2325@columbia.edu>


# 53800007 05-Aug-2019 Karl Rupp <me@karlrupp.net>

CUDA: Skipping CXX complex fix.

Should fix warnings obtained with newer math functions.

This fix should be obsolete once the wrapper for GPU functionality is in place.


# b6a92dca 26-Jun-2019 BarryFSmith <bsmith@mcs.anl.gov>

Merged in barry/cuda-multigrid-test (pull request #1763)

Various improvements for GPUs (mostly for performance and CUDA)


# fdc842d1 31-May-2019 Barry Smith <bsmith@mcs.anl.gov>

Various improvements for GPUs (mostly for performance and CUDA)

1) Add VecPinToCPU() for CUDA vector and matrices
2) Move initialization of cuBLAS to PetscInitialize() since it takes 1/2 second and distorts timing with -log_view
3) Add logging for DMCreateMatrix (for large meshes this is very large)
4) Add VecGet/RestoreArrayWrite() to prevent unneeded copies from GPU (only implemented so far for CUDA);
added a small number of usages in the source so that snes tutorials ex19 does not do unneeded communication from the GPU
5) Automatically convert MAIJ matrices to AIJ for CUDA since they are not yet supported natively in PETSc's CUDA matrix implementation
6) Pinned objects should still use the CUDA/ViennaCL versions of Destroy to clean up the GPU stuff

Commit-type: feature


# 613bfe33 02-Jun-2019 BarryFSmith <bsmith@mcs.anl.gov>

Merged in barry/update-collective-on (pull request #1744)

Update the use of Collective on in the manual pages to reflect the new style


# d083f849 01-Jun-2019 Barry Smith <bsmith@mcs.anl.gov>

Update the use of Collective on in the manual pages to reflect the new style

Commit-type: style-fix, documentation
Thanks-to: Patrick Sanan <patrick.sanan@gmail.com>


# a041468a 06-Mar-2019 Lawrence Mitchell <lawrence@wence.uk>

Merge branch 'master' into wence/feature-patch-all-at-once


# 8b2e997c 24-Feb-2019 Karl Rupp <me@karlrupp.net>

Merge branch 'jczhang/fix-vecscatter-cuda/maint' into maint [PR #1388]

* jczhang/fix-vecscatter-cuda/maint:
CUDA vecscatter needs to take care of the ScatterMode argument


# 29302ad0 24-Feb-2019 Karl Rupp <me@karlrupp.net>

Merge branch 'jczhang/fix-vecscatter-cuda/maint' [PR #1388]

* jczhang/fix-vecscatter-cuda/maint:
CUDA vecscatter needs to take care of the ScatterMode argument


# a5873c6d 24-Feb-2019 Karl Rupp <me@karlrupp.net>

Merge branch 'jczhang/restore-error-check' [PR #1392]

* jczhang/restore-error-check:
Restore an error checking line in MatMultTranspose_MPIAIJCUSPARSE


# ccf5f80b 21-Feb-2019 Junchao Zhang <jczhang@mcs.anl.gov>

Restore the error checking code


# 959dcdf5 19-Feb-2019 Junchao Zhang <jczhang@mcs.anl.gov>

Add a ScatterMode arg in cuda vecscat to select to/from context

The old code in VecScatterInitializeForGPU() initializes the pointer (PetscCUDAIndices*)&inctx->spptr based
on an input ScatterMode before VecScatterBegin() is called.

If a vecscatter context is first used for a SCATTER_FORWARD and then
for a SCATTER_REVERSE, there will be an error, because the second VecScatter
uses an out-of-date (PetscCUDAIndices*)&inctx->spptr.

The solution is "do not prematurely consider ScatterMode when building (PetscCUDAIndices*)&inctx->spptr. Instead,
defer selecting the correct to/from until VecScatterBegin() is called".
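
The failure mode described above can be illustrated with a minimal sketch in plain C. It is not PETSc code: the names DemoScatterMode, DemoScatterCtx, demo_init_eager() and demo_select_lazy() are hypothetical stand-ins, and the "initialize only once" guard is an assumption about how a cached pointer could become out of date.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT PETSc types or functions. */
typedef enum { DEMO_FORWARD, DEMO_REVERSE } DemoScatterMode;

typedef struct {
  const int *to_indices;   /* index set intended for a forward scatter */
  const int *from_indices; /* index set intended for a reverse scatter */
  const int *cached;       /* eager scheme: chosen once, then reused   */
} DemoScatterCtx;

/* Eager scheme (assumed shape of the bug): the mode is consulted only
 * the first time, so a later call with a different mode keeps the
 * pointer chosen for the earlier mode. */
static void demo_init_eager(DemoScatterCtx *ctx, DemoScatterMode mode)
{
  assert(ctx != NULL);
  if (ctx->cached == NULL) {
    ctx->cached = (mode == DEMO_FORWARD) ? ctx->to_indices : ctx->from_indices;
  }
}

/* Lazy scheme: both index sets are kept, and the choice is made only
 * when the scatter actually starts, using the mode passed at that time. */
static const int *demo_select_lazy(const DemoScatterCtx *ctx, DemoScatterMode mode)
{
  assert(ctx != NULL);
  return (mode == DEMO_FORWARD) ? ctx->to_indices : ctx->from_indices;
}
```

Calling demo_init_eager() with DEMO_FORWARD and then with DEMO_REVERSE leaves ctx->cached pointing at the forward index set, whereas demo_select_lazy() returns the set matching the mode given on each call.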


# a5a49157 25-Oct-2018 Joseph Pusztay <josephpusztay@Josephs-MacBook-Pro.local>

Merge branch 'master' into jpusztay/feature-swarm-symplectic-example


# e901d7f7 25-Oct-2018 Joseph Pusztay <josephpusztay@Josephs-MacBook-Pro.local>

Merge branch 'master' into jpustay/feature-swarm-example

