| #
fcc7397d
|
| 21-Jan-2020 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Use a 3D submatrix to describe indices in packing
|
| #
f01131f0
|
| 08-Jan-2020 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Change PetscMemcpyWithMemType to PetscSFLinkMemcpy and make it asynchronous
With that, all CUDA calls in SF are asynchronous and work on link->stream. The reason is that we want to avoid sudden 'join points' caused by synchronous CUDA calls.
|
| #
cd620004
|
| 05-Dec-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Refactor SF packing
1) Separate out local communication from remote communication 2) Directly pass root/leafdata to MPI when suitable
|
| #
203a8786
|
| 29-Nov-2019 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jczhang/feature-sf-do-pack-on-gpu' into 'master'
Add support to do pack/unpack on GPU and do MPI on CPU
See merge request petsc/petsc!2205
|
| #
51ccb202
|
| 05-Nov-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Add an option -sf_use_pinned_buffer to use non-pageable host memory for send/recv buffers when passing GPU data
|
| #
e315309d
|
| 23-Oct-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Fix a performance bug
|
| #
e07844bf
|
| 16-Oct-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Clarify a comment
|
| #
637e6665
|
| 16-Oct-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Rename rkey, lkey to rootdata, leafdata
|
| #
b7c0d12a
|
| 10-Oct-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Renamed PetscSFPackWaitall_Basic to PetscSFPackWaitall
|
| #
893c5908
|
| 30-Oct-2019 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'maint'
|
| #
7033dd2d
|
| 30-Oct-2019 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jed/fix-destroy-spelling' into 'maint'
Spelling: Destory -> Destroy
See merge request petsc/petsc!2231
|
| #
64f49bab
|
| 28-Oct-2019 |
Jed Brown <jed@jedbrown.org> |
Spelling: Destory -> Destroy
Commit-type: style-fix
|
| #
71f2a993
|
| 16-Oct-2019 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'stefanozampini/fix-sfcompose' into 'master'
Allow sparse leaves in SFCompose operations
See merge request petsc/petsc!2164
|
| #
5ad15460
|
| 11-Oct-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
SF: treat PETSc builtin datatypes the same way as MPI builtin datatypes
So that we won't duplicate PETSc builtin datatypes like MPIU_2INT. Otherwise, we cannot simply use "if (type != MPIU_2INT) SETERRQ()" in code.
Add support to unwrap a type created by MPI_Type_contiguous(1,..). Let dumb types use their own type.
|
| #
c1acdb04
|
| 28-Sep-2019 |
Satish Balay <balay@mcs.anl.gov> |
Merge remote-tracking branch 'origin/jczhang/feature-sf-on-gpu'
Add GPU-aware VecScatter/PetscSF
See merge request petsc/petsc!1995
|
| #
eb02082b
|
| 25-Sep-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Added SF GPU support
|
| #
b23bfdef
|
| 13-Aug-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Update pack/unpack routines to do packing/unpacking for all neighbors in at most two routines
One is used to pack data in self-to-self communication; the second is used for remote communication, so that on GPU we can use at most two kernels to do packing/unpacking for all neighbors instead of multiple kernels.
|
| #
1b085a39
|
| 29-Jul-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Code style change
|
| #
05393080
|
| 25-Jul-2019 |
Karl Rupp <me@karlrupp.net> |
Merge branch 'jczhang/sf-more-opts' [PR #1567]
* jczhang/sf-more-opts: Add more optimizations in SF and use it as the default for VecScatter.
|
| #
9d1c8add
|
| 23-Jul-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
SF: Partially fix a bug when overlapped SF communications have same rootdata or leafdata on some ranks
Now we use two keys (rootdata, leafdata) to identify a pending communication. But that is still not enough for cases where communications have the same rootdata and leafdata on some ranks. Currently we error out in these cases. See src/vec/is/sf/examples/tutorials/ex2.c for the various cases we can and cannot handle.
|
| #
40e23c03
|
| 25-Jun-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Simplify packing routines with macros and optimize packing by using index patterns
|