| #
00816365
|
| 08-Jan-2020 |
Junchao Zhang <jczhang@mcs.anl.gov> |
No need to get mtypes in PetscSFXxxEnd(). The cuda call is not cheap.
|
| #
cd620004
|
| 05-Dec-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Refactor SF packing
1) Separate out local communication from remote communication 2) Directly pass root/leafdata to MPI when suitable
|
| #
203a8786
|
| 29-Nov-2019 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jczhang/feature-sf-do-pack-on-gpu' into 'master'
Add support to do pack/unpack on GPU and do MPI on CPU
See merge request petsc/petsc!2205
|
| #
51ccb202
|
| 05-Nov-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Add an option -sf_use_pinned_buffer to use non-pageable host memory for send/recv buffer when passing GPU data
|
| #
0e0e0189
|
| 23-Oct-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Log transfer between CPU and GPU
|
| #
855db38d
|
| 16-Oct-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Support data on device but no gpu-aware MPI for sf(all)gather(v)
|
| #
b7c0d12a
|
| 10-Oct-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Renamed PetscSFPackWaitall_Basic to PetscSFPackWaitall
|
| #
120a1823
|
| 10-Oct-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Add option that lets users do packing on GPU and do MPI on CPU
|
| #
893c5908
|
| 30-Oct-2019 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'maint'
|
| #
7033dd2d
|
| 30-Oct-2019 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jed/fix-destroy-spelling' into 'maint'
Spelling: Destory -> Destroy
See merge request petsc/petsc!2231
|
| #
64f49bab
|
| 28-Oct-2019 |
Jed Brown <jed@jedbrown.org> |
Spelling: Destory -> Destroy
Commit-type: style-fix
|
| #
c1acdb04
|
| 28-Sep-2019 |
Satish Balay <balay@mcs.anl.gov> |
Merge remote-tracking branch 'origin/jczhang/feature-sf-on-gpu'
Add GPU-aware VecScatter/PetscSF
See merge request petsc/petsc!1995
|
| #
eb02082b
|
| 25-Sep-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Added SF GPU support
|
| #
b23bfdef
|
| 13-Aug-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Update pack/unpack routines to do packing/unpacking for all neighbors in at most two routines
One is used to pack data in self-to-self communication; the second is used for remote communication. So that on GPU, we can use at most two kernels to do packing/unpacking for all neighbors instead of multiple kernels
|
| #
05393080
|
| 25-Jul-2019 |
Karl Rupp <me@karlrupp.net> |
Merge branch 'jczhang/sf-more-opts' [PR #1567]
* jczhang/sf-more-opts: Add more optimizations in SF and use it as the default for VecScatter.
|
| #
f659e5c7
|
| 25-Jun-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Optimize the creation of embedded SFs for SFBasic
|
| #
40e23c03
|
| 25-Jun-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Simplify packing routines with macros and optimize packing by using index patterns
|