add hook for CPU cusparse solves
adding support for multiple grids with multiple species per grid
added relativistic terms to 3D Landau
Optimizations for mass term in Landau
MatSetValuesDevice: Cleanup and simplify code, including example
User reported crash of example code. Kernel was passed an ierr that lived in CPU memory
MatSetValuesDevice: do not include private headers from public headers
Feature: MatSetValuesDevice determines automatically from the context (where it is included from) if it is being used from C, CUDA, or Kokkos, PETSC_DEVICE_FUNC_DEC no longer needs to be set before including petscaijdevice.h
Feature: MatSetValuesDevice() now ignores all values outside the global column range.
PetscSplitCSRDataStructure is now a pointer, not a struct, like most PETSc objects, please leave it that way.
Fix all uses of CTABLE that were related to the original MatSetValuesDevice()
Have atomicAdd use Kokkos atomic-add with CPU build when building with Kokkos.
Cuda should now work with --download-openmpi, this is done by updating updateCompilers() to rerun portions of packages/cuda.py after the compilers are reset to use MPI wrappers. This is needed because the resetting of the compilers removes all the compiler flags and packages/cuda.py sets certain values into these flags that was previously lost.
Add MPICXX_INCLUDES, MPICXX_LIBS to fix compile targets for Kokkos examples
'make check' now runs properly for Kokkos test of src/snes/ex3k, fixed bug in the makefile wrt MPI_IS_MPIUNI check
Testing makefile rules: add ex*cu binaries to clean rule
Reported-by: Sam Fagbemi <samkorede24@gmail.com>
Thanks-to: Stefano Zampini <stefano.zampini@gmail.com>
Thanks-to: Mark Adams <mfadams@lbl.gov>
/spend 16h
update tests
Adams/landau kokkos opt
Add cuSparse Band LU factorization
Adams/landau ex2re cleanup
- fixed flops counts and warning w/o logging
- fix mat_view for pure Jacobian and with mass
- modify RE model and add cuda test
Adams/landau cleanup
Added mass matrix construction to GPU kernel to avoid problem with sparser mass matrix than Jacobian, messing up MatAXPY, rarely but unpredictably.
Added Kokkos solver stubs to work with GPU offloaded matrices.
Fixed up ex2 for paper
Added runex2_[kokkos|cuda] targets to Landau makefile for paper
Adding Cuda and Kokkos assembly. Added Device assembly to Landau operator. Added Kokkos test mat/ex5k.
Add a Landau collision operator, based on DMPlex and PetscFE, that uses p4est. It uses the new Kokkos interface and has a separate Cuda implementation. This could be deployed as a third-party library, but this is easier to deploy to ECP, among other apps.