aijviennacl.cxx - OpenGrok history log for /petsc/src/mat/impls/aij/seq/seqviennacl/aijviennacl.cxx

Revision	Date	Author	Comments
# 6881a170	06-Oct-2019	Satish Balay <balay@mcs.anl.gov>	Merge branch 'jczhang/fix-valid-gpu-array' into maint Rename: v->valid_GPU_array/matrix==> v->offloadmask and PetscOffloadFlag==>PetscOffloadMask See merge request petsc/petsc!2141
# c70f7ee4	02-Oct-2019	Junchao Zhang <jczhang@mcs.anl.gov>	Rename valid_GPU_array/matrix to offloadmask
# 8da4f93b	23-Sep-2019	Satish Balay <balay@mcs.anl.gov>	Merge branch 'stefanozampini/gpu-bddc' into 'master' Improvements towards BDDC on GPUs See merge request petsc/petsc!2067
# 99acd6aa	22-Sep-2019	Stefano Zampini <stefano.zampini@gmail.com>	Fix compilation error for nvcc in optimized code with AVX-512 (march=native on my GPU workstation) for some reason, the host compiler fails with this error message /home/zampins/Devel/petsc/include/../src/mat/impls/aij/seq/aij.h(535): error: identifier "_mm512_reduce_add_pd" is undefined This optimized C kernel is not used in the GPU classes, so it is safe to skip its declaration
# f38c1e66	15-Sep-2019	Stefano Zampini <stefano.zampini@gmail.com>	MATSEQAIJVIENNACL: implement MatSeqAIJGetArray
# 6b804ed2	30-Jul-2019	Karl Rupp <me@karlrupp.net>	Merge branch 'stefano_zampini/GPU-matdensecuda' [PR #1911] * stefano_zampini/GPU-matdensecuda: GPU: Initial implementation for SeqDense class on GPUs.
# 1541652f	24-Jul-2019	Stefano Zampini <stefano.zampini@gmail.com>	MATSEQAIJVIENNACL: minor changes
# 489de41d	22-Jul-2019	Stefano Zampini <stefano.zampini@gmail.com>	MatSEQAIJ{CUSPARSE\|VIENNACL}: do not copy to the GPU if not at the final stage of assembly
# 7a71495b	04-Jul-2019	Karl Rupp <me@karlrupp.net>	Merge branch 'hannah/gpu-computation-logging' [PR #1843] * hannah/gpu-computation-logging: Adding GPU flop rate and GPU time.
# 7a052e47	03-Jul-2019	hannah_mairs <hannah.mairs@gmail.com>	PetscLogGpuTimeStart -> Begin
# 958c4211	01-Jul-2019	hannah_mairs <hannah.mairs@gmail.com>	Adding Gpu flop rate and GPU time
# 3b49ee3e	28-Jun-2019	Hannah Morgan <hannah.mairs@gmail.com>	Merged in hannah/gpu-communication-logging (pull request #1814) Hannah/gpu communication logging Approved-by: BarryFSmith <bsmith@mcs.anl.gov> Approved-by: Richard Mills <rtm@eecs.utk.edu>
# 4863603a	28-Jun-2019	Satish Balay <balay@mcs.anl.gov>	Adding vector logging, started matrix logging
# b6a92dca	26-Jun-2019	BarryFSmith <bsmith@mcs.anl.gov>	Merged in barry/cuda-multigrid-test (pull request #1763) Various improvements for GPUs (mostly for performance and CUDA)
# c56e2027	26-Jun-2019	BarryFSmith <bsmith@mcs.anl.gov>	Merged in barry/optimize-aij-da (pull request #1762) Non-numeric optimizations focused on AIJ, MatFDColoring, and DMCreateMatrix_DA_*AIJ, looking to improve performance in GPU environments
# 071fcb05	05-Jun-2019	Barry Smith <bsmith@mcs.anl.gov>	Non-numeric optimizations focused on AIJ, MatFDColoring, and DMCreateMatrix_DA_AIJ, looking to improve performance in GPU environments 1) PetscCalloc() now uses system calloc() 2) Merged some PetscMalloc() 3) Eliminated unneeded PetscCalloc() 4) Removed some memory allocations and copies in MatFDColoringSetUp(), added local variables for better compiler optimization 5) Added MatSetValues_SeqAIJ_SortedFull(), added MatSetOption(MAT_SORTED_FULL) 6) Optimized DMCreateMatrix_DA_*AIJ for nonperiodic case to automatically have sorted columns (faster MatSetValues() times) 7) Eliminated call to PetscMemzero() in PetscFree() Commit-type: style-fix, feature
# fdc842d1	31-May-2019	Barry Smith <bsmith@mcs.anl.gov>	Various improvements for GPUs (mostly for performance and CUDA) 1) Add VecPinToCPU() for CUDA vector and matrices 2) Move initialization of cuBLAS to PetscInitialize() since it takes 1/2 second and distorts timing with -log_view 3) Add logging for DMCreateMatrix (for large meshes this is very large) 4) Add VecGet/RestoreArrayWrite() to prevent unneeded copies from GPU (only implemented so far for CUDA); added a small number of usages in the source so that snes tutorials ex19 does not do unneeded communication from the GPU 5) Automatically convert MAIJ matrices to AIJ for CUDA since they are not yet supported natively in PETSc's CUDA matrix implementation 6) Pinned objects should still use the CUDA/ViennaCL versions of Destroy to clean up the GPU stuff Commit-type: feature
# 613bfe33	02-Jun-2019	BarryFSmith <bsmith@mcs.anl.gov>	Merged in barry/update-collective-on (pull request #1744) Update the use of Collective on in the manual pages to reflect the new style
# d083f849	01-Jun-2019	Barry Smith <bsmith@mcs.anl.gov>	Update the use of Collective on in the manual pages to reflect the new style Commit-type: style-fix, documentation Thanks-to: Patrick Sanan <patrick.sanan@gmail.com>
# 5065da2f	13-May-2019	Barry Smith <bsmith@mcs.anl.gov>	Merge branch 'master' of bitbucket.org:petsc/petsc
# 4edbe3a6	12-May-2019	Karl Rupp <me@karlrupp.net>	Merge branch 'barry/feature-pintocpu' [PR #1641] * barry/feature-pintocpu: Adding a MatPinToCPU() and VecPinToGPU() capability For matrices this will prevent copies to the GPU when they will never be used there. For vectors this will prevent vectors from boucing back and forth between the CPU.
# e7e92044	07-May-2019	Barry Smith <bsmith@mcs.anl.gov>	Based on discussion with Oana I am adding a MatPinToCPU() and VecPinToGPU() capability. For matrices this will prevent copies to the GPU when they will never be used there. For vectors this will prevent vectors from boucing back and forth between the CPU and GPU when most of the work is in the CPU. An example of the place that needs to avoid bouncing is in MatFDColoringApply_XXXX() Commit-type: feature, documentation, example Thanks-to: Oana Marin <oanam@mcs.anl.gov>
# a5a49157	25-Oct-2018	Joseph Pusztay <josephpusztay@Josephs-MacBook-Pro.local>	Merge branch 'master' into jpusztay/feature-swarm-symplectic-example
# e901d7f7	25-Oct-2018	Joseph Pusztay <josephpusztay@Josephs-MacBook-Pro.local>	Merge branch 'master' into jpustay/feature-swarm-example
# baeaa64e	25-Oct-2018	Joseph Pusztay <josephpu@buffalo.edu>	Merged petsc/petsc into master