| #
934c28dd
|
| 22-Jul-2025 |
Satish Balay <balay@mcs.anl.gov> |
Merge remote-tracking branch 'origin/release'
|
| #
09117800
|
| 22-Jul-2025 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'zach/fixes-gpu-mi300a' into 'release'
MATHYPRE and Kokkos Fixes
See merge request petsc/petsc!8510
|
| #
f3d3cd90
|
| 09-Jul-2025 |
Zach Atkins <Zach.Atkins@colorado.edu> |
Split KokkosDualViewSync into Host and Device versions
|
| #
fabba767
|
| 01-Jul-2025 |
Zach Atkins <Zach.Atkins@colorado.edu> |
Kokkos - Limit excess synchronization on MI300A
|
| #
09b68a49
|
| 04-Apr-2025 |
Satish Balay <balay@mcs.anl.gov> |
Merge remote-tracking branch 'origin/release'
|
| #
e80aff1c
|
| 03-Apr-2025 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jolivet/fix-petsc-case' into 'release'
Fix wrong case for PETSc
See merge request petsc/petsc!8266
|
| #
f0b74427
|
| 01-Apr-2025 |
Pierre Jolivet <pierre@joliv.et> |
Fix wrong case for PETSc
|
| #
afb41d4c
|
| 28-Mar-2025 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jczhang/2025-03-18/revise-aijkokkos-matsolve' into 'main'
Add options to do factorization and solve on host for matseqaijkokkos
See merge request petsc/petsc!8209
|
| #
5d8d5924
|
| 18-Mar-2025 |
Junchao Zhang <jczhang@anl.gov> |
Kokkos: remove PetscLogGpuTimeBegin/End as it easily causes consecutive calls to PetscLogGpuTimeBegin
|
| #
067403a5
|
| 18-Mar-2025 |
Junchao Zhang <jczhang@anl.gov> |
Kokkos: move KokkosDualViewSync() to a private header
|
| #
b7b2c57c
|
| 05-Feb-2025 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jczhang/2025-01-30/feature-support-AMD-MI300A' into 'main'
Add support of AMD MI300A
Closes #1703
See merge request petsc/petsc!8110
|
| #
45402d8a
|
| 30-Jan-2025 |
Junchao Zhang <jczhang@anl.gov> |
Kokkos: add support of AMD MI300A
* Use HostMirrorMemorySpace instead of HostSpace to fix compile errors on MI300A
* Replace Kokkos::HostSpace with HostMirrorMemorySpace to fix compile errors on MI300A, since the latter is what Kokkos::DualView uses for its host view
* Fix a subtle bug in KokkosDualViewSync() w.r.t. MI300A. Suppose we want to sync a PETSc VecKokkos v on host. On MI300A, the host copy v_h and the device copy v_d share the same memory. So in the old code, we used if (v_dual.need_sync_host()) to skip the device-to-host memory copy. But we should not skip the exec.fence(): as the device might still have kernels writing v_d, we still need to sync the device/stream to make v_d ready for use on the CPU (via v_h).
|
| #
314ab5fd
|
| 22-Dec-2023 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'barry/2023-06-07/optimize-multivecs-zhang' into 'main'
Optimize VecMDot_Seq as suggested by Junchao Zhang using BLAS 2 gemv
See merge request petsc/petsc!6580
|
| #
e907feaa
|
| 19-Dec-2023 |
Junchao Zhang <jczhang@anl.gov> |
Vec: add GEMV optimizations for VecMDot and friends for VecKokkos
|