| #
6881a170
|
| 06-Oct-2019 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jczhang/fix-valid-gpu-array' into maint
Rename: v->valid_GPU_array/matrix ==> v->offloadmask and PetscOffloadFlag ==> PetscOffloadMask
See merge request petsc/petsc!2141
|
| #
c70f7ee4
|
| 02-Oct-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Rename valid_GPU_array/matrix to offloadmask
|
| #
8da4f93b
|
| 23-Sep-2019 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'stefanozampini/gpu-bddc' into 'master'
Improvements towards BDDC on GPUs
See merge request petsc/petsc!2067
|
| #
99acd6aa
|
| 22-Sep-2019 |
Stefano Zampini <stefano.zampini@gmail.com> |
Fix compilation error for nvcc in optimized code with AVX-512 (march=native on my GPU workstation)
For some reason, the host compiler fails with this error message: /home/zampins/Devel/petsc/include/../src/mat/impls/aij/seq/aij.h(535): error: identifier "_mm512_reduce_add_pd" is undefined
This optimized C kernel is not used in the GPU classes, so it is safe to skip its declaration.
|
| #
f38c1e66
|
| 15-Sep-2019 |
Stefano Zampini <stefano.zampini@gmail.com> |
MATSEQAIJVIENNACL: implement MatSeqAIJGetArray
|
| #
6b804ed2
|
| 30-Jul-2019 |
Karl Rupp <me@karlrupp.net> |
Merge branch 'stefano_zampini/GPU-matdensecuda' [PR #1911]
* stefano_zampini/GPU-matdensecuda: GPU: Initial implementation for SeqDense class on GPUs.
|
| #
1541652f
|
| 24-Jul-2019 |
Stefano Zampini <stefano.zampini@gmail.com> |
MATSEQAIJVIENNACL: minor changes
|
| #
489de41d
|
| 22-Jul-2019 |
Stefano Zampini <stefano.zampini@gmail.com> |
MatSEQAIJ{CUSPARSE|VIENNACL}: do not copy to the GPU if not at the final stage of assembly
|
| #
7a71495b
|
| 04-Jul-2019 |
Karl Rupp <me@karlrupp.net> |
Merge branch 'hannah/gpu-computation-logging' [PR #1843]
* hannah/gpu-computation-logging: Adding GPU flop rate and GPU time.
|
| #
7a052e47
|
| 03-Jul-2019 |
hannah_mairs <hannah.mairs@gmail.com> |
PetscLogGpuTimeStart -> Begin
|
| #
958c4211
|
| 01-Jul-2019 |
hannah_mairs <hannah.mairs@gmail.com> |
Adding GPU flop rate and GPU time
|
| #
3b49ee3e
|
| 28-Jun-2019 |
Hannah Morgan <hannah.mairs@gmail.com> |
Merged in hannah/gpu-communication-logging (pull request #1814)
Hannah/gpu communication logging
Approved-by: BarryFSmith <bsmith@mcs.anl.gov> Approved-by: Richard Mills <rtm@eecs.utk.edu>
|
| #
4863603a
|
| 28-Jun-2019 |
Satish Balay <balay@mcs.anl.gov> |
Adding vector logging, started matrix logging
|
| #
b6a92dca
|
| 26-Jun-2019 |
BarryFSmith <bsmith@mcs.anl.gov> |
Merged in barry/cuda-multigrid-test (pull request #1763)
Various improvements for GPUs (mostly for performance and CUDA)
|
| #
c56e2027
|
| 26-Jun-2019 |
BarryFSmith <bsmith@mcs.anl.gov> |
Merged in barry/optimize-aij-da (pull request #1762)
Non-numeric optimizations focused on AIJ, MatFDColoring, and DMCreateMatrix_DA_*AIJ, looking to improve performance in GPU environments
|
| #
071fcb05
|
| 05-Jun-2019 |
Barry Smith <bsmith@mcs.anl.gov> |
Non-numeric optimizations focused on AIJ, MatFDColoring, and DMCreateMatrix_DA_*AIJ, looking to improve performance in GPU environments
1) PetscCalloc*() now uses system calloc()
2) Merged some PetscMalloc*()
3) Eliminated unneeded PetscCalloc*()
4) Removed some memory allocations and copies in MatFDColoringSetUp(), added local variables for better compiler optimization
5) Added MatSetValues_SeqAIJ_SortedFull(), added MatSetOption(MAT_SORTED_FULL)
6) Optimized DMCreateMatrix_DA_*AIJ for nonperiodic case to automatically have sorted columns (faster MatSetValues() times)
7) Eliminated call to PetscMemzero() in PetscFree()
Commit-type: style-fix, feature
|
| #
fdc842d1
|
| 31-May-2019 |
Barry Smith <bsmith@mcs.anl.gov> |
Various improvements for GPUs (mostly for performance and CUDA)
1) Add VecPinToCPU() for CUDA vector and matrices
2) Move initialization of cuBLAS to PetscInitialize() since it takes 1/2 second and distorts timing with -log_view
3) Add logging for DMCreateMatrix (for large meshes this is very large)
4) Add VecGet/RestoreArrayWrite() to prevent unneeded copies from GPU (only implemented so far for CUDA); added a small number of usages in the source so that snes tutorials ex19 does not do unneeded communication from the GPU
5) Automatically convert MAIJ matrices to AIJ for CUDA since they are not yet supported natively in PETSc's CUDA matrix implementation
6) Pinned objects should still use the CUDA/ViennaCL versions of Destroy to clean up the GPU stuff
Commit-type: feature
|
| #
613bfe33
|
| 02-Jun-2019 |
BarryFSmith <bsmith@mcs.anl.gov> |
Merged in barry/update-collective-on (pull request #1744)
Update the use of Collective on in the manual pages to reflect the new style
|
| #
d083f849
|
| 01-Jun-2019 |
Barry Smith <bsmith@mcs.anl.gov> |
Update the use of Collective on in the manual pages to reflect the new style
Commit-type: style-fix, documentation Thanks-to: Patrick Sanan <patrick.sanan@gmail.com>
|
| #
5065da2f
|
| 13-May-2019 |
Barry Smith <bsmith@mcs.anl.gov> |
Merge branch 'master' of bitbucket.org:petsc/petsc
|
| #
4edbe3a6
|
| 12-May-2019 |
Karl Rupp <me@karlrupp.net> |
Merge branch 'barry/feature-pintocpu' [PR #1641]
* barry/feature-pintocpu: Adding a MatPinToCPU() and VecPinToGPU() capability For matrices this will prevent copies to the GPU when they will never be used there. For vectors this will prevent vectors from bouncing back and forth between the CPU.
|
| #
e7e92044
|
| 07-May-2019 |
Barry Smith <bsmith@mcs.anl.gov> |
Based on discussion with Oana I am adding a MatPinToCPU() and VecPinToGPU() capability. For matrices this will prevent copies to the GPU when they will never be used there. For vectors this will prevent vectors from bouncing back and forth between the CPU and GPU when most of the work is on the CPU. One place that needs to avoid this bouncing is MatFDColoringApply_XXXX().
Commit-type: feature, documentation, example Thanks-to: Oana Marin <oanam@mcs.anl.gov>
|
| #
a5a49157
|
| 25-Oct-2018 |
Joseph Pusztay <josephpusztay@Josephs-MacBook-Pro.local> |
Merge branch 'master' into jpusztay/feature-swarm-symplectic-example
|
| #
e901d7f7
|
| 25-Oct-2018 |
Joseph Pusztay <josephpusztay@Josephs-MacBook-Pro.local> |
Merge branch 'master' into jpustay/feature-swarm-example
|
| #
baeaa64e
|
| 25-Oct-2018 |
Joseph Pusztay <josephpu@buffalo.edu> |
Merged petsc/petsc into master
|