History log of /petsc/src/mat/impls/aij/seq/seqviennacl/aijviennacl.cxx (Results 76 – 100 of 242)
Revision Date Author Comments
# 6881a170 06-Oct-2019 Satish Balay <balay@mcs.anl.gov>

Merge branch 'jczhang/fix-valid-gpu-array' into maint

Rename: v->valid_GPU_array/matrix ==> v->offloadmask and PetscOffloadFlag ==> PetscOffloadMask

See merge request petsc/petsc!2141


# c70f7ee4 02-Oct-2019 Junchao Zhang <jczhang@mcs.anl.gov>

Rename valid_GPU_array/matrix to offloadmask


# 8da4f93b 23-Sep-2019 Satish Balay <balay@mcs.anl.gov>

Merge branch 'stefanozampini/gpu-bddc' into 'master'

Improvements towards BDDC on GPUs

See merge request petsc/petsc!2067


# 99acd6aa 22-Sep-2019 Stefano Zampini <stefano.zampini@gmail.com>

Fix compilation error for nvcc in optimized code with AVX-512 (march=native on my GPU workstation)

for some reason, the host compiler fails with this error message
/home/zampins/Devel/petsc/include/../src/mat/impls/aij/seq/aij.h(535): error: identifier "_mm512_reduce_add_pd" is undefined

This optimized C kernel is not used in the GPU classes, so it is safe to skip its declaration


# f38c1e66 15-Sep-2019 Stefano Zampini <stefano.zampini@gmail.com>

MATSEQAIJVIENNACL: implement MatSeqAIJGetArray


# 6b804ed2 30-Jul-2019 Karl Rupp <me@karlrupp.net>

Merge branch 'stefano_zampini/GPU-matdensecuda' [PR #1911]

* stefano_zampini/GPU-matdensecuda:
GPU: Initial implementation for SeqDense class on GPUs.


# 1541652f 24-Jul-2019 Stefano Zampini <stefano.zampini@gmail.com>

MATSEQAIJVIENNACL: minor changes


# 489de41d 22-Jul-2019 Stefano Zampini <stefano.zampini@gmail.com>

MatSEQAIJ{CUSPARSE|VIENNACL}: do not copy to the GPU if not at the final stage of assembly


# 7a71495b 04-Jul-2019 Karl Rupp <me@karlrupp.net>

Merge branch 'hannah/gpu-computation-logging' [PR #1843]

* hannah/gpu-computation-logging:
Adding GPU flop rate and GPU time.


# 7a052e47 03-Jul-2019 hannah_mairs <hannah.mairs@gmail.com>

PetscLogGpuTimeStart -> Begin


# 958c4211 01-Jul-2019 hannah_mairs <hannah.mairs@gmail.com>

Adding Gpu flop rate and GPU time


# 3b49ee3e 28-Jun-2019 Hannah Morgan <hannah.mairs@gmail.com>

Merged in hannah/gpu-communication-logging (pull request #1814)

Hannah/gpu communication logging

Approved-by: BarryFSmith <bsmith@mcs.anl.gov>
Approved-by: Richard Mills <rtm@eecs.utk.edu>


# 4863603a 28-Jun-2019 Satish Balay <balay@mcs.anl.gov>

Adding vector logging, started matrix logging


# b6a92dca 26-Jun-2019 BarryFSmith <bsmith@mcs.anl.gov>

Merged in barry/cuda-multigrid-test (pull request #1763)

Various improvements for GPUs (mostly for performance and CUDA)


# c56e2027 26-Jun-2019 BarryFSmith <bsmith@mcs.anl.gov>

Merged in barry/optimize-aij-da (pull request #1762)

Non-numeric optimizations focused on AIJ, MatFDColoring, and DMCreateMatrix_DA_*AIJ, looking to improve performance in GPU environments


# 071fcb05 05-Jun-2019 Barry Smith <bsmith@mcs.anl.gov>

Non-numeric optimizations focused on AIJ, MatFDColoring, and DMCreateMatrix_DA_*AIJ, looking to improve performance in GPU environments

1) PetscCalloc*() now uses system calloc()
2) Merged some PetscMalloc*()
3) Eliminated unneeded PetscCalloc*()
4) Removed some memory allocations and copies in MatFDColoringSetUp(), added local variables for better compiler optimization
5) Added MatSetValues_SeqAIJ_SortedFull(), added MatSetOption(MAT_SORTED_FULL)
6) Optimized DMCreateMatrix_DA_*AIJ for nonperiodic case to automatically have sorted columns (faster MatSetValues() times)
7) Eliminated call to PetscMemzero() in PetscFree()

Commit-type: style-fix, feature


# fdc842d1 31-May-2019 Barry Smith <bsmith@mcs.anl.gov>

Various improvements for GPUs (mostly for performance and CUDA)

1) Add VecPinToCPU() for CUDA vector and matrices
2) Move initialization of cuBLAS to PetscInitialize() since it takes 1/2 second and distorts timing with -log_view
3) Add logging for DMCreateMatrix (for large meshes this is very large)
4) Add VecGet/RestoreArrayWrite() to prevent unneeded copies from GPU (only implemented so far for CUDA);
added a small number of usages in the source so that snes tutorials ex19 does not do unneeded communication from the GPU
5) Automatically convert MAIJ matrices to AIJ for CUDA since they are not yet supported natively in PETSc's CUDA matrix implementation
6) Pinned objects should still use the CUDA/ViennaCL versions of Destroy to clean up the GPU stuff

Commit-type: feature


# 613bfe33 02-Jun-2019 BarryFSmith <bsmith@mcs.anl.gov>

Merged in barry/update-collective-on (pull request #1744)

Update the use of Collective on in the manual pages to reflect the new style


# d083f849 01-Jun-2019 Barry Smith <bsmith@mcs.anl.gov>

Update the use of Collective on in the manual pages to reflect the new style

Commit-type: style-fix, documentation
Thanks-to: Patrick Sanan <patrick.sanan@gmail.com>


# 5065da2f 13-May-2019 Barry Smith <bsmith@mcs.anl.gov>

Merge branch 'master' of bitbucket.org:petsc/petsc


# 4edbe3a6 12-May-2019 Karl Rupp <me@karlrupp.net>

Merge branch 'barry/feature-pintocpu' [PR #1641]

* barry/feature-pintocpu:
Adding a MatPinToCPU() and VecPinToGPU() capability
For matrices this will prevent copies to the GPU when they will never be used there.
For vectors this will prevent vectors from bouncing back and forth between the CPU and GPU.


# e7e92044 07-May-2019 Barry Smith <bsmith@mcs.anl.gov>

Based on discussion with Oana I am adding a MatPinToCPU() and VecPinToGPU() capability. For matrices this
will prevent copies to the GPU when they will never be used there. For vectors this will
prevent vectors from bouncing back and forth between the CPU and GPU when most of the work is in the CPU. An
example of the place that needs to avoid bouncing is in MatFDColoringApply_XXXX()

Commit-type: feature, documentation, example
Thanks-to: Oana Marin <oanam@mcs.anl.gov>


# a5a49157 25-Oct-2018 Joseph Pusztay <josephpusztay@Josephs-MacBook-Pro.local>

Merge branch 'master' into jpusztay/feature-swarm-symplectic-example


# e901d7f7 25-Oct-2018 Joseph Pusztay <josephpusztay@Josephs-MacBook-Pro.local>

Merge branch 'master' into jpustay/feature-swarm-example


# baeaa64e 25-Oct-2018 Joseph Pusztay <josephpu@buffalo.edu>

Merged petsc/petsc into master

