
Searched hist:"023b8a51676743c24bb03c40af89971dbec6e8fb" (Results 1 – 6 of 6) sorted by relevance

/libCEED/backends/cuda/
ceed-cuda-compile.cpp 023b8a51676743c24bb03c40af89971dbec6e8fb Wed Jan 25 03:54:41 UTC 2023 abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com> magma: non-tensor rtc (#1141)

* some refactoring in magma's jit src

* fix path

* fix loading src

* refactor magma nontensor backend

* refactor magma nontensor backend

* [WIP]: new nontensor basis kernels

* [WIP]: new nontensor basis kernels

* [WIP]: new nontensor basis kernels

* call the new nontensor kernels for low order problems

* multiple compilation for the same kernels but with different tuning parameters

* magma: allow different nb's for different non-tensor kernels

* tuning data for the non-tensor rtc kernels

* remove no-longer used functions, add new one for tuning the nontensor kernels

* constants for tuning

* tuning functions

* use the tuning functions in compiling/running the new kernels

* bug fix

* fixes

* fixes

* minor

* switch tuning data

* fix name

* fix name

* add function to run cuda kernels with opt-in shared memory feature

* minor fix

* minor fix

* fix calls to batch api

* allow more kernel instances

* temporary timing function

* temporary timing function

* tuning data based on hiprtc

* rollback tuning parameters

* fixes

* fixes

* fix inconsistency in the parameters passed to nvrtc/hiprtc

* minor

* a fix to the nb selector

* cleanup

* merge the opt-in feature in CeedRunKernelDimSharedOptinCuda into CeedRunKernelDimSharedCuda

* fix paths for hip-magma backends

* style

* fixes

* running make format

* undo changes from the last commit

* change HIP_DIR to ROCM_DIR and adjust the paths for magma accordingly

* replace HIP_DIR with ROCM_DIR
/libCEED/backends/magma/
ceed-magma.h 023b8a51676743c24bb03c40af89971dbec6e8fb Wed Jan 25 03:54:41 UTC 2023 abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com> magma: non-tensor rtc (#1141)

ceed-magma-basis.c 023b8a51676743c24bb03c40af89971dbec6e8fb Wed Jan 25 03:54:41 UTC 2023 abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com> magma: non-tensor rtc (#1141)

/libCEED/
README.md 023b8a51676743c24bb03c40af89971dbec6e8fb Wed Jan 25 03:54:41 UTC 2023 abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com> magma: non-tensor rtc (#1141)

.gitlab-ci.yml 023b8a51676743c24bb03c40af89971dbec6e8fb Wed Jan 25 03:54:41 UTC 2023 abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com> magma: non-tensor rtc (#1141)

Makefile 023b8a51676743c24bb03c40af89971dbec6e8fb Wed Jan 25 03:54:41 UTC 2023 abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com> magma: non-tensor rtc (#1141)
