History log of /libCEED/rust/libceed-sys/c-src/backends/cuda-gen/ceed-cuda-gen-operator.c (Results 126 – 134 of 134)
Revision Date Author Comments
# 5107b09f 18-Nov-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Delegate AssembleLinearQF to ref/serial when not impl (#406)

* Operator - delegate AssembleLinearQF to ref/serial when not impl by backend

* Occa - Fix restriction summing error

* Tests - fix error in t534-f qfunction for CPU

* make style

* Operator - clarify fallback mechanism, allow backends to provide fallback other than /cpu/self/ref/serial

* Operator - update fallback to avoid copying vectors, restrictions

* Operator - move fallback to ceed level

* Operator - explicitly check for falling back to oneself

* Update interface/ceed-operator.c

Co-Authored-By: Jed Brown <jed@jedbrown.org>


# 7af48cf9 17-Nov-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Merge pull request #417 from CEED/jeremy/none-args

None Args


# a7b7f929 16-Nov-2019 jeremylt <jeremy.thompson@colorado.edu>

Basis - Use CEED_VECTOR_NONE for EVAL_MODE_WEIGHT


# e6a04bf5 16-Oct-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Merge pull request #273 from CEED/p-multigrid

P Multigrid Example


# 7f823360 16-Oct-2019 jeremylt <jeremy.thompson@colorado.edu>

Make style


# 1d102b48 03-Oct-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Assemble Linear QFunction (#306)

* Operator - add interface for AssembleLinearQF

* Operator - Refactor Ref/Blocked/Opt basis apply

* Operator - Add AssembleLinearQF impl for Ref/Blocked/Opt, not impl message for OCCA/CUDA

* AssembleQF - Add grad test and clean up code

* CPU - Add operator eval mode error, remove opt inlining

* Operator - clarify QF assemble documentation, style updates

* Interface - style and consistency updates

* Tests - add more complex assembled qfunction test

* Tests - add fortran test for assemble linear qfunction

* Tests - Update t53* tests for new Fortran source macro

* Merge - small fixes

* Operator - convert to backend creating assembled qdata vector

* Operator - zero qvecs before using in assembly of qf

* Operator - expand assemble QF documentation

* CPU - minor fix in AssembleLinearQF to prevent uninitialized memory

* Tests - fix wording in t531, t532


# ac421f39 17-Sep-2019 Yohann <dudouit1@llnl.gov>

Improved performance of cuda-gen backend (#341)

Thanks-to: Tim Warburton
Some of these optimizations are the results of the knowledge and experience gathered by Tim Warburton and his team in libParanumal and then ported to libCEED.

* Add colocated gradient in 3D.

* Treat the qFunction by slice in 3d to avoid using too many registers.

* Minor fix

* Minor fix.

* Minor fix

* Compute the colocated gradient slice by slice.

* Add syncthreads after initialization of the matrices.

* Remove code print.

* Add a critical #pragma unroll

* Fix typo on "collocated".

* Remove dead code.

* Use ColloGrad3d functions.

* Fix cuda-gen backend when collocated gradient is not available.

* make style

* make style

* Add some comments.

* Replace int by CeedInt.


# 288c0443 13-Sep-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

QFunction Create by Name (#311)

This PR adds a QFunction gallery to libCEED with 1D, 2D, and 3D mass and Poisson operators.

Closes issue #37, issue #340

* Add QFunction gallery, rename focca

* Gallery - add initial QFunctions

* Add a test for using the QF gallery

* Modify ex1 to use gallery

* Add multiple test configs to tap

* Move output to test directory

* Update junit

* Add OCCA gallery exception

* Add ex2

* Update ex2 for dim->ncompx

* Gallery - modify to work for CUDA as is

* Update Documentation

* Gallery - typo fix

* Gallery - convention change, postappend qfunction family variant

* Gallery - update template with new name checking convention

* Gallery - condense diff3DBuild QFunction

* Gallery - rename diff -> poisson

* Gallery - clarify poisson3DBuild comment

* Gallery - use Pragma SIMD, store Qdata in Voigt convention

* Examples - Convert BP3-6 to Voigt convention

* Examples - add cl option to switch between header and gallery qfs in CEED examples

* Examples - clean up construction of QF name

* Gallery - Switch to PascalCase for gallery names

* Doc - fix function type page

* Interface - Make sure strncpy result is null terminated

* Gallery - Update Poisson 2/3D Apply to new QF body

* make style

* make style - fix worst style problems

* make style - add gallery to make style

* Doc - update documentation errors and inconsistencies

* Examples - test ex1 ex2 with and without gallery

* Examples - reduce testing of ex1/ex2 without gallery, clean up non-gallery qfunctions

* MFEM - revert another make style mistake

* Manual make style updates

* Doc - update function documentation page

* Style updates, document test numbering conventions

* doc: resolve ambiguous image location warning, allow more Dot nodes

* Tests - style and cast cleanup

* Tests - fix README indentation


# 241a4b83 25-Jul-2019 Yohann <yohann.dudouit@gmail.com>

Full jit compiled operator: cuda-gen backend (#275)

* First steps toward cuda-gen backend!

* Closer to real code generation.

* Generated code should be ready for nvrtc.

* The code generation skeleton is ready.

* Hack with the qfunction to make the operator kernel compile.

* Some tweaks in the makefile + Input fields structure change.

* Remove using cout.

* 1d interp and grad device functions.

* 1d readDofs, readQuads, writeDofs, writeQuads.

* Remove dead code.

* readDofs, readQuads, writeDofs, writeQuads for 2d and 3d

* 2d interp and grad

* 3d interp and grad

* - weight functions for 1d, 2d, 3d
  - link the indices to the kernel
  - link the fields to the kernel
  - link the basis to the kernel

* Add the qFunction reader + inlining

* Add qf files for the tests.

* Add qf file for ceed/ex1

* Add qf file for mfem/bp1

* All tests pass.

* Add qFunction for mfem/bp3, petsc/bp1, and petsc/bp3.

* mfem/bp1 passes + remove dead code

* Fix a bug in n_quads_out for writeQuads

* mfem/bp3 passes.

* All tests and all examples pass.

* Temporary tweaks for mfem benchmarking

* Add Context management.

* Modify .qf files to take into account the context.

* Enable optimizations.

* First set of optimization for 2D and 3D.

* Makefile tweaks and destructor code.

* make style.

* Add -MP flag.

* Fix linking issues with the tests.

* Update .qf files for the tests.

* Add .qf files for nek5000 examples.

* Use shared memory for B and G matrices.

* Fix bug introduced in previous commit.
