# libCEED: Efficient Extensible Discretization

[![GitHub Actions][github-badge]][github-link]
[![GitLab-CI][gitlab-badge]][gitlab-link]
[![Code coverage][codecov-badge]][codecov-link]
[![BSD-2-Clause][license-badge]][license-link]
[![Documentation][doc-badge]][doc-link]
[![JOSS paper][joss-badge]][joss-link]
[![Binder][binder-badge]][binder-link]

## Summary and Purpose

libCEED provides fast algebra for element-based discretizations, designed for performance portability, run-time flexibility, and clean embedding in higher-level libraries and applications.
It offers a C99 interface as well as bindings for Fortran, Python, Julia, and Rust.
While our focus is on high-order finite elements, the approach is mostly algebraic and thus applicable to other discretizations in factored form, as explained in the [user manual](https://libceed.org/en/latest/) and API implementation portion of the [documentation](https://libceed.org/en/latest/api/).

One of the challenges with high-order methods is that a global sparse matrix is no longer a good representation of a high-order linear operator, with respect to both the FLOPs needed for its evaluation and the memory transfer needed for a matvec.
Thus, high-order methods require a new "format" that still represents a linear (or more generally non-linear) operator, but not through a sparse matrix.

The goal of libCEED is to propose such a format, as well as supporting implementations and data structures, that enable efficient operator evaluation on a variety of computational device types (CPUs, GPUs, etc.).
This new operator description is based on an algebraically [factored form](https://libceed.org/en/latest/libCEEDapi/#finite-element-operator-decomposition), which is easy to incorporate in a wide variety of applications, without significant refactoring of their own discretization infrastructure.
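
As a rough illustration of what a factored form means, the following pure-Python sketch applies a 1D mass operator as `E^T B^T D B E` without assembling a sparse matrix.
It does not use the libCEED API; the function name `apply_mass_matrix_free`, the symbols `E` (element restriction), `B` (basis evaluation), and `D` (quadrature data), and the choice of linear elements with two Gauss points on a uniform mesh are assumptions made for this example.

```python
import math


def apply_mass_matrix_free(u, num_elem, length=1.0):
    """Apply a 1D mass operator as E^T B^T D B E without assembling a matrix."""
    h = length / num_elem  # uniform element size
    g = 1.0 / math.sqrt(3.0)
    qpts = [-g, g]  # two Gauss points on the reference element [-1, 1]
    # B: the two linear shape functions evaluated at each quadrature point
    B = [[0.5 * (1.0 - x), 0.5 * (1.0 + x)] for x in qpts]
    # D: quadrature weight (1 at each point) times the Jacobian h / 2
    D = [h / 2.0 for _ in qpts]

    v = [0.0] * (num_elem + 1)
    for e in range(num_elem):
        u_e = [u[e], u[e + 1]]  # E: restrict to the element
        u_q = [sum(B[q][i] * u_e[i] for i in range(2)) for q in range(2)]  # B
        w_q = [D[q] * u_q[q] for q in range(2)]  # D: pointwise scaling
        v_e = [sum(B[q][i] * w_q[q] for q in range(2)) for i in range(2)]  # B^T
        v[e] += v_e[0]  # E^T: sum element contributions into the global vector
        v[e + 1] += v_e[1]
    return v


v = apply_mass_matrix_free([1.0] * 5, num_elem=4)
print(sum(v))
```

Applying the operator to a vector of ones returns the integral of each basis function, so the entries sum to the domain length up to rounding.
Only the small arrays `B` and `D` are stored, which is the property that makes this kind of representation attractive at high order.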

The repository is part of the [CEED software suite](http://ceed.exascaleproject.org/software/), a collection of software benchmarks, miniapps, libraries and APIs for efficient exascale discretizations based on high-order finite element and spectral element methods.
See <http://github.com/ceed> for more information and source code availability.

The CEED research is supported by the [Exascale Computing Project](https://exascaleproject.org/exascale-computing-project) (17-SC-20-SC), a collaborative effort of two U.S. Department of Energy organizations (Office of Science and the National Nuclear Security Administration) responsible for the planning and preparation of a [capable exascale ecosystem](https://exascaleproject.org/what-is-exascale), including software, applications, hardware, advanced system engineering and early testbed platforms, in support of the nation’s exascale computing imperative.

For more details on the CEED API see the [user manual](https://libceed.org/en/latest/).

% gettingstarted-inclusion-marker

## Building

The CEED library, `libceed`, is a C99 library with no required dependencies, and with Fortran, Python, Julia, and Rust interfaces.
It can be built using:

```
make
```

or, with optimization flags:

```
make OPT='-O3 -march=skylake-avx512 -ffp-contract=fast'
```

These optimization flags are used by all languages (C, C++, Fortran) and this makefile variable can also be set for testing and examples (below).

The library attempts to automatically detect support for the AVX instruction set using gcc-style compiler options for the host.
Support may need to be manually specified via:

```
make AVX=1
```

or:

```
make AVX=0
```

if your compiler does not support gcc-style options, if you are cross compiling, etc.

To enable CUDA support, add `CUDA_DIR=/opt/cuda` or an appropriate directory to your `make` invocation.
To enable HIP support, add `HIP_DIR=/opt/rocm` or an appropriate directory.
To store these or other arguments as defaults for future invocations of `make`, use:

```
make configure CUDA_DIR=/usr/local/cuda HIP_DIR=/opt/rocm OPT='-O3 -march=znver2'
```

which stores these variables in `config.mk`.

## Additional Language Interfaces

The Fortran interface is built alongside the library automatically.

Python users can install using:

```
pip install libceed
```

or in a clone of the repository via `pip install .`.

Julia users can install using:

```
$ julia
julia> ]
pkg> add LibCEED
```

See the [LibCEED.jl documentation](http://ceed.exascaleproject.org/libCEED-julia-docs/dev/) for more information.

Rust users can include libCEED via `Cargo.toml`:

```toml
[dependencies]
libceed = { git = "https://github.com/CEED/libCEED", branch = "main" }
```

See the [Cargo documentation](https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#specifying-dependencies-from-git-repositories) for details.

## Testing

The test suite produces [TAP](https://testanything.org) output and is run by:

```
make test
```

or, using the `prove` tool distributed with Perl (recommended):

```
make prove
```
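
TAP output is line-oriented: each test point is reported on a line starting with `ok` or `not ok`.
As a hedged sketch, a few lines of Python can tally such output when `prove` is not available; the helper `summarize_tap` and the sample test names below are hypothetical and not part of libCEED.

```python
import re


def summarize_tap(lines):
    """Count passing and failing test points in TAP output."""
    passed = failed = 0
    for line in lines:
        # Check "not ok" first, since "ok" alone would not match it at line start
        if re.match(r"^not ok\b", line):
            failed += 1
        elif re.match(r"^ok\b", line):
            passed += 1
    return passed, failed


# Hypothetical sample output; real test names will differ
sample = [
    "1..3",
    "ok 1 - example-test-a",
    "not ok 2 - example-test-b",
    "ok 3 - example-test-c # SKIP backend not available",
]
print(summarize_tap(sample))
```

Skipped tests are reported as `ok` lines with a `# SKIP` directive, so this sketch counts them as passing.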

## Backends

There are multiple supported backends, which can be selected at runtime in the examples:

| CEED resource              | Backend                                           | Deterministic Capable |
| :---                       | :---                                              | :---:                 |
||
| **CPU Native**             |
| `/cpu/self/ref/serial`     | Serial reference implementation                   | Yes                   |
| `/cpu/self/ref/blocked`    | Blocked reference implementation                  | Yes                   |
| `/cpu/self/opt/serial`     | Serial optimized C implementation                 | Yes                   |
| `/cpu/self/opt/blocked`    | Blocked optimized C implementation                | Yes                   |
| `/cpu/self/avx/serial`     | Serial AVX implementation                         | Yes                   |
| `/cpu/self/avx/blocked`    | Blocked AVX implementation                        | Yes                   |
||
| **CPU Valgrind**           |
| `/cpu/self/memcheck/*`     | Memcheck backends, undefined value checks         | Yes                   |
||
| **CPU LIBXSMM**            |
| `/cpu/self/xsmm/serial`    | Serial LIBXSMM implementation                     | Yes                   |
| `/cpu/self/xsmm/blocked`   | Blocked LIBXSMM implementation                    | Yes                   |
||
| **CUDA Native**            |
| `/gpu/cuda/ref`            | Reference pure CUDA kernels                       | Yes                   |
| `/gpu/cuda/shared`         | Optimized pure CUDA kernels using shared memory   | Yes                   |
| `/gpu/cuda/gen`            | Optimized pure CUDA kernels using code generation | No                    |
||
| **HIP Native**             |
| `/gpu/hip/ref`             | Reference pure HIP kernels                        | Yes                   |
| `/gpu/hip/shared`          | Optimized pure HIP kernels using shared memory    | Yes                   |
| `/gpu/hip/gen`             | Optimized pure HIP kernels using code generation  | No                    |
||
| **MAGMA**                  |
| `/gpu/cuda/magma`          | CUDA MAGMA kernels                                | No                    |
| `/gpu/cuda/magma/det`      | CUDA MAGMA kernels                                | Yes                   |
| `/gpu/hip/magma`           | HIP MAGMA kernels                                 | No                    |
| `/gpu/hip/magma/det`       | HIP MAGMA kernels                                 | Yes                   |
||
| **OCCA**                   |
| `/*/occa`                  | Selects backend based on available OCCA modes     | Yes                   |
| `/cpu/self/occa`           | OCCA backend with serial CPU kernels              | Yes                   |
| `/cpu/openmp/occa`         | OCCA backend with OpenMP kernels                  | Yes                   |
| `/cpu/dpcpp/occa`          | OCCA backend with DPC++ kernels                   | Yes                   |
| `/gpu/cuda/occa`           | OCCA backend with CUDA kernels                    | Yes                   |
| `/gpu/hip/occa`            | OCCA backend with HIP kernels                     | Yes                   |

The `/cpu/self/*/serial` backends process one element at a time and are intended for meshes with a smaller number of high-order elements.
The `/cpu/self/*/blocked` backends process blocked batches of eight interlaced elements and are intended for meshes with higher numbers of elements.
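
The idea of interlacing can be sketched in pure Python.
This only illustrates the concept and is not libCEED's internal data structure; the function name `interlace_blocks`, the element-fastest ordering within a block, and the padding of a short final block by repeating its last element are assumptions made for this example.

```python
def interlace_blocks(elem_values, block_size=8):
    """Group per-element node values into blocks with the element index fastest."""
    num_nodes = len(elem_values[0])
    blocks = []
    for start in range(0, len(elem_values), block_size):
        batch = elem_values[start:start + block_size]
        # Assumption: pad a short final batch by repeating its last element
        batch = batch + [batch[-1]] * (block_size - len(batch))
        # For each node, store that node's value from every element in the batch
        blocks.append(
            [batch[e][n] for n in range(num_nodes) for e in range(block_size)]
        )
    return blocks


# Three elements with two nodes each, blocked in pairs to keep the output short
values = [[10 * e + n for n in range(2)] for e in range(3)]
print(interlace_blocks(values, block_size=2))
```

With this layout, the same node of consecutive elements sits in adjacent memory, so one vector instruction can process a whole batch of elements at once.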

The `/cpu/self/ref/*` backends are written in pure C and provide basic functionality.

The `/cpu/self/opt/*` backends are written in pure C and use partial e-vectors to improve performance.

The `/cpu/self/avx/*` backends rely upon AVX instructions to provide vectorized CPU performance.

The `/cpu/self/memcheck/*` backends rely upon the [Valgrind](http://valgrind.org/) Memcheck tool to help verify that user QFunctions have no undefined values.
To use, run your code with Valgrind and the Memcheck backends, e.g. `valgrind ./build/ex1 -ceed /cpu/self/memcheck/serial`.
A 'development' or 'debugging' version of Valgrind with headers is required to use this backend.
This backend can be run in serial or blocked mode and defaults to running in the serial mode if `/cpu/self/memcheck` is selected at runtime.

The `/cpu/self/xsmm/*` backends rely upon the [LIBXSMM](http://github.com/hfp/libxsmm) package to provide vectorized CPU performance.
If linking MKL and LIBXSMM is desired but the Makefile is not detecting `MKLROOT`, linking libCEED against MKL can be forced by setting the environment variable `MKL=1`.

The `/gpu/cuda/*` backends provide GPU performance strictly using CUDA.

The `/gpu/hip/*` backends provide GPU performance strictly using HIP.
They are based on the `/gpu/cuda/*` backends.
ROCm version 4.2 or newer is required.

The `/gpu/*/magma/*` backends rely upon the [MAGMA](https://bitbucket.org/icl/magma) package.
To enable the MAGMA backends, the environment variable `MAGMA_DIR` must point to the top-level MAGMA directory, with the MAGMA library located in `$(MAGMA_DIR)/lib/`.
By default, `MAGMA_DIR` is set to `../magma`; to build the MAGMA backends with a MAGMA installation located elsewhere, create a link to `magma/` in libCEED's parent directory, or set `MAGMA_DIR` to the proper location.
MAGMA version 2.5.0 or newer is required.
Currently, each MAGMA library installation is only built for either CUDA or HIP.
The corresponding set of libCEED backends (`/gpu/cuda/magma/*` or `/gpu/hip/magma/*`) will automatically be built for the version of the MAGMA library found in `MAGMA_DIR`.

Users can specify a device for all CUDA, HIP, and MAGMA backends by appending `:device_id=#` to the resource name.
For example:

> - `/gpu/cuda/gen:device_id=1`
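
A resource string of this form consists of a backend path followed by an optional list of `key=value` options.
The helper below is a hypothetical Python sketch of that split, intended only to make the format concrete; it is not libCEED's own parser, and `split_resource` is a name invented here.

```python
def split_resource(resource):
    """Split 'backend:key=value,...' into (backend, options dict)."""
    backend, _, tail = resource.partition(":")
    options = {}
    # An empty tail yields one empty item, which filter(None, ...) drops
    for item in filter(None, tail.split(",")):
        key, _, value = item.partition("=")
        # Strip surrounding quotes so mode='CUDA' and mode=CUDA read the same
        options[key.strip()] = value.strip().strip("'\"")
    return backend, options


print(split_resource("/gpu/cuda/gen:device_id=1"))
print(split_resource("/cpu/self"))
```

Option values are kept as strings here; converting `device_id` to an integer is left to the caller.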

The `/*/occa` backends rely upon the [OCCA](http://github.com/libocca/occa) package to provide cross platform performance.
To enable the OCCA backend, the environment variable `OCCA_DIR` must point to the top-level OCCA directory, with the OCCA library located in `${OCCA_DIR}/lib`.
By default, `OCCA_DIR` is set to `../occa`.
OCCA version 1.4.0 or newer is required.

Users can pass specific OCCA device properties after setting the CEED resource.
For example:

> - `"/*/occa:mode='CUDA',device_id=0"`

Bit-for-bit reproducibility is important in some applications.
However, some libCEED backends use non-deterministic operations, such as `atomicAdd`, for increased performance.
The backends which are capable of generating reproducible results, with the proper compilation options, are marked "Yes" in the Deterministic Capable column of the table above.
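
The underlying reason is that floating-point addition is not associative, so the order in which concurrent atomic additions complete changes the rounded result.
A minimal Python sketch of the effect, with `ordered_sum` as a hypothetical helper standing in for the accumulation order:

```python
def ordered_sum(values, order):
    """Accumulate values in the given order, as concurrent atomic adds might."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


values = [0.1, 0.2, 0.3]
forward = ordered_sum(values, [0, 1, 2])   # (0.1 + 0.2) + 0.3
backward = ordered_sum(values, [2, 1, 0])  # (0.3 + 0.2) + 0.1
print(forward, backward, forward == backward)
```

The two sums differ only in the last bit, which is harmless for accuracy but enough to break bit-for-bit comparisons between runs.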

## Examples

libCEED comes with several examples of its usage, ranging from standalone C codes in the `/examples/ceed` directory to examples based on external packages, such as MFEM, PETSc, and Nek5000.
Nek5000 v18.0 or greater is required.

To build the examples, set the `MFEM_DIR`, `PETSC_DIR`, and `NEK5K_DIR` variables and run:

```
cd examples/
```

% running-examples-inclusion-marker

```console
# libCEED examples on CPU and GPU
cd ceed/
make
./ex1-volume -ceed /cpu/self
./ex1-volume -ceed /gpu/cuda
./ex2-surface -ceed /cpu/self
./ex2-surface -ceed /gpu/cuda
cd ..

# MFEM+libCEED examples on CPU and GPU
cd mfem/
make
./bp1 -ceed /cpu/self -no-vis
./bp3 -ceed /gpu/cuda -no-vis
cd ..

# Nek5000+libCEED examples on CPU and GPU
cd nek/
make
./nek-examples.sh -e bp1 -ceed /cpu/self -b 3
./nek-examples.sh -e bp3 -ceed /gpu/cuda -b 3
cd ..

# PETSc+libCEED examples on CPU and GPU
cd petsc/
make
./bps -problem bp1 -ceed /cpu/self
./bps -problem bp2 -ceed /gpu/cuda
./bps -problem bp3 -ceed /cpu/self
./bps -problem bp4 -ceed /gpu/cuda
./bps -problem bp5 -ceed /cpu/self
./bps -problem bp6 -ceed /gpu/cuda
cd ..

cd petsc/
make
./bpsraw -problem bp1 -ceed /cpu/self
./bpsraw -problem bp2 -ceed /gpu/cuda
./bpsraw -problem bp3 -ceed /cpu/self
./bpsraw -problem bp4 -ceed /gpu/cuda
./bpsraw -problem bp5 -ceed /cpu/self
./bpsraw -problem bp6 -ceed /gpu/cuda
cd ..

cd petsc/
make
./bpssphere -problem bp1 -ceed /cpu/self
./bpssphere -problem bp2 -ceed /gpu/cuda
./bpssphere -problem bp3 -ceed /cpu/self
./bpssphere -problem bp4 -ceed /gpu/cuda
./bpssphere -problem bp5 -ceed /cpu/self
./bpssphere -problem bp6 -ceed /gpu/cuda
cd ..

cd petsc/
make
./area -problem cube -ceed /cpu/self -degree 3
./area -problem cube -ceed /gpu/cuda -degree 3
./area -problem sphere -ceed /cpu/self -degree 3 -dm_refine 2
./area -problem sphere -ceed /gpu/cuda -degree 3 -dm_refine 2
cd ..

cd fluids/
make
./navierstokes -ceed /cpu/self -degree 1
./navierstokes -ceed /gpu/cuda -degree 1
cd ..

cd solids/
make
./elasticity -ceed /cpu/self -mesh [.exo file] -degree 2 -E 1 -nu 0.3 -problem Linear -forcing mms
./elasticity -ceed /gpu/cuda -mesh [.exo file] -degree 2 -E 1 -nu 0.3 -problem Linear -forcing mms
cd ..
```

For the last example shown, sample meshes to be used in place of `[.exo file]` can be found at <https://github.com/jeremylt/ceedSampleMeshes>.

The above code assumes a GPU-capable machine with the CUDA backends enabled.
Depending on the available backends, other CEED resource specifiers can be provided with the `-ceed` option.
Other command line arguments can be found in [examples/petsc](https://github.com/CEED/libCEED/blob/main/examples/petsc/README.md).

% benchmarks-marker

## Benchmarks

A sequence of benchmarks for all enabled backends can be run using:

```
make benchmarks
```

The results from the benchmarks are stored inside the `benchmarks/` directory and can be viewed using the following commands (requires Python with matplotlib):

```
cd benchmarks
python postprocess-plot.py petsc-bps-bp1-*-output.txt
python postprocess-plot.py petsc-bps-bp3-*-output.txt
```

Using the `benchmarks` target runs a comprehensive set of benchmarks which may take some time to run.
Subsets of the benchmarks can be run using the scripts in the `benchmarks` folder.

For more details about the benchmarks, see the `benchmarks/README.md` file.

## Install

To install libCEED, run:

```
make install prefix=/path/to/install/dir
```

or (e.g., if creating packages):

```
make install prefix=/usr DESTDIR=/packaging/path
```

To build and install in separate steps, run:

```
make for_install=1 prefix=/path/to/install/dir
make install prefix=/path/to/install/dir
```

The usual variables like `CC` and `CFLAGS` are used, and optimization flags for all languages can be set using the likes of `OPT='-O3 -march=native'`.
Use `STATIC=1` to build static libraries (`libceed.a`).

To install libCEED for Python, run:

```
pip install libceed
```

with the desired pip options, such as `--user`.

### pkg-config

In addition to the library and header, libCEED provides a [pkg-config](https://en.wikipedia.org/wiki/Pkg-config) file that can be used to easily compile and link.
[For example](https://people.freedesktop.org/~dbn/pkg-config-guide.html#faq), if `$prefix` is a standard location or you set the environment variable `PKG_CONFIG_PATH`:

```
cc `pkg-config --cflags --libs ceed` -o myapp myapp.c
```

will build `myapp` with libCEED.
This can be used with the source or installed directories.
Most build systems have support for pkg-config.

## Contact

You can reach the libCEED team by emailing [ceed-users@llnl.gov](mailto:ceed-users@llnl.gov) or by leaving a comment in the [issue tracker](https://github.com/CEED/libCEED/issues).

## How to Cite

If you utilize libCEED please cite:

```
@article{libceed-joss-paper,
  author       = {Jed Brown and Ahmad Abdelfattah and Valeria Barra and Natalie Beams and Jean Sylvain Camier and Veselin Dobrev and Yohann Dudouit and Leila Ghaffari and Tzanio Kolev and David Medina and Will Pazner and Thilina Ratnayaka and Jeremy Thompson and Stan Tomov},
  title        = {{libCEED}: Fast algebra for high-order element-based discretizations},
  journal      = {Journal of Open Source Software},
  year         = {2021},
  publisher    = {The Open Journal},
  volume       = {6},
  number       = {63},
  pages        = {2945},
  doi          = {10.21105/joss.02945}
}

@misc{libceed-user-manual,
  author       = {Abdelfattah, Ahmad and
                  Barra, Valeria and
                  Beams, Natalie and
                  Brown, Jed and
                  Camier, Jean-Sylvain and
                  Dobrev, Veselin and
                  Dudouit, Yohann and
                  Ghaffari, Leila and
                  Kolev, Tzanio and
                  Medina, David and
                  Pazner, Will and
                  Ratnayaka, Thilina and
                  Thompson, Jeremy L and
                  Tomov, Stanimire},
  title        = {{libCEED} User Manual},
  month        = jul,
  year         = 2021,
  publisher    = {Zenodo},
  version      = {0.9.0},
  doi          = {10.5281/zenodo.5077489}
}
```

For libCEED's Python interface please cite:

```
@InProceedings{libceed-paper-proc-scipy-2020,
  author    = {{V}aleria {B}arra and {J}ed {B}rown and {J}eremy {T}hompson and {Y}ohann {D}udouit},
  title     = {{H}igh-performance operator evaluations with ease of use: lib{C}{E}{E}{D}'s {P}ython interface},
  booktitle = {{P}roceedings of the 19th {P}ython in {S}cience {C}onference},
  pages     = {85 - 90},
  year      = {2020},
  editor    = {{M}eghann {A}garwal and {C}hris {C}alloway and {D}illon {N}iederhut and {D}avid {S}hupe},
  doi       = {10.25080/Majora-342d178e-00c}
}
```

The BibTeX entries for these references can be found in the `doc/bib/references.bib` file.

## Copyright

The following copyright applies to each file in the CEED software suite, unless otherwise stated in the file:

> Copyright (c) 2017, Lawrence Livermore National Security, LLC. Produced at the
> Lawrence Livermore National Laboratory. LLNL-CODE-734707. All Rights reserved.

See files LICENSE and NOTICE for details.

[github-badge]: https://github.com/CEED/libCEED/workflows/C/Fortran/badge.svg
[github-link]: https://github.com/CEED/libCEED/actions
[gitlab-badge]: https://gitlab.com/libceed/libCEED/badges/main/pipeline.svg?key_text=GitLab-CI
[gitlab-link]: https://gitlab.com/libceed/libCEED/-/pipelines?page=1&scope=all&ref=main
[codecov-badge]: https://codecov.io/gh/CEED/libCEED/branch/main/graphs/badge.svg
[codecov-link]: https://codecov.io/gh/CEED/libCEED/
[license-badge]: https://img.shields.io/badge/License-BSD%202--Clause-orange.svg
[license-link]: https://opensource.org/licenses/BSD-2-Clause
[doc-badge]: https://readthedocs.org/projects/libceed/badge/?version=latest
[doc-link]: https://libceed.org/en/latest/?badge=latest
[joss-badge]: https://joss.theoj.org/papers/10.21105/joss.02945/status.svg
[joss-link]: https://doi.org/10.21105/joss.02945
[binder-badge]: http://mybinder.org/badge_logo.svg
[binder-link]: https://mybinder.org/v2/gh/CEED/libCEED/main?urlpath=lab/tree/examples/python/tutorial-0-ceed.ipynb