Lines Matching refs:gpu
29 - Added non-tensor basis support to code generation backends `/gpu/cuda/gen` and `/gpu/hip/gen`.
30 - Added support to code generation backends `/gpu/cuda/gen` and `/gpu/hip/gen` for operators with b…
57 - Added SYCL backends `/gpu/sycl/ref`, `/gpu/sycl/shared`, and `/gpu/sycl/gen`.
133 - Refactored `/gpu/cuda/shared` and `/gpu/cuda/gen` as well as `/gpu/hip/shared` and `/gpu/hip/gen`…
134 - Enabled support for `p > 8` for `/gpu/*/shared` backends.
148 …nsor basis kernels (and element restriction kernels, in non-deterministic `/gpu/*/magma` backends).
242 - New HIP MAGMA backends for hipMAGMA library users: `/gpu/hip/magma` and `/gpu/hip/magma/det`.
243 - New HIP backends for improved tensor basis performance: `/gpu/hip/shared` and `/gpu/hip/gen`.
273 - New HIP backend: `/gpu/hip/ref`.
287 - The `/gpu/cuda/reg` backend has been removed, with its core features moved into `/gpu/cuda/ref` a…
420 | `/gpu/occa` | CUDA OCCA kernels |
423 | `/gpu/cuda/ref` | Reference pure CUDA kernels |
424 | `/gpu/cuda/reg` | Pure CUDA kernels using one thread per element |
425 | `/gpu/cuda/shared` | Optimized pure CUDA kernels using shared memory |
426 | `/gpu/cuda/gen` | Optimized pure CUDA kernels using code generation |
427 | `/gpu/magma` | CUDA MAGMA kernels |
470 performance. The `/gpu/cuda/*` backends provide GPU performance strictly using CUDA.
471 The `/gpu/cuda/ref` backend is a reference CUDA backend, providing reasonable
472 performance for most problem configurations. The `/gpu/cuda/reg` backend uses a simple
475 backend unrolls loops and maps memory addresses to registers. The `/gpu/cuda/reg` backend
500 | `/gpu/occa` | CUDA OCCA kernels |
503 | `/gpu/cuda/ref` | Reference pure CUDA kernels |
504 | `/gpu/cuda/reg` | Pure CUDA kernels using one thread per element |
505 | `/gpu/magma` | CUDA MAGMA kernels |
558 | `/gpu/occa` | CUDA OCCA kernels |
561 | `/gpu/magma` | CUDA MAGMA kernels |
605 | `/gpu/occa` | CUDA OCCA kernels |
608 | `/gpu/magma` | CUDA MAGMA kernels |
648 | `/gpu/occa` | CUDA OCCA kernels |
683 `/gpu/occa`, and `/omp/occa`.
696 | `/gpu/occa` | CUDA OCCA kernels |