# libCEED: Efficient Extensible Discretization

[![GitHub Actions][github-badge]][github-link]
[![GitLab-CI][gitlab-badge]][gitlab-link]
[![Code coverage][codecov-badge]][codecov-link]
[![BSD-2-Clause][license-badge]][license-link]
[![Documentation][doc-badge]][doc-link]
[![User manual][zenodo-badge]][zenodo-link]
[![JOSS paper][joss-badge]][joss-link]
[![Binder][binder-badge]][binder-link]

## Summary and Purpose

libCEED provides fast algebra for element-based discretizations, designed for performance portability, run-time flexibility, and clean embedding in higher level libraries and applications.
It offers a C99 interface as well as bindings for Fortran, Python, Julia, and Rust.
While our focus is on high-order finite elements, the approach is mostly algebraic and thus applicable to other discretizations in factored form, as explained in the [user manual](https://libceed.org/en/latest/) and API implementation portion of the [documentation](https://libceed.org/en/latest/api/).

One of the challenges with high-order methods is that a global sparse matrix is no longer a good representation of a high-order linear operator, both with respect to the FLOPs needed for its evaluation and the memory transfer needed for a matvec.
Thus, high-order methods require a new "format" that still represents a linear (or more generally non-linear) operator, but not through a sparse matrix.
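
To make the cost argument concrete, the following back-of-the-envelope sketch compares, per element, the work of applying a dense element matrix with the work of one sum-factorized (tensor-product) basis application. The function names are hypothetical, and the counts are textbook estimates that assume tensor-product elements of degree `p` in `d` dimensions with `p + 1` points per dimension; they are not measurements of libCEED.

```python
def dense_apply_flops(p: int, d: int) -> int:
    """Flops to apply a dense (p+1)^d x (p+1)^d element matrix.

    One multiply and one add per matrix entry.
    """
    n = (p + 1) ** d
    return 2 * n * n


def sum_factorized_flops(p: int, d: int) -> int:
    """Flops for d one-dimensional contractions.

    Each contraction applies a (p+1) x (p+1) matrix along one axis of a
    (p+1)^d array, costing 2 * (p+1) * (p+1)^d flops.
    """
    return d * 2 * (p + 1) ** (d + 1)


def dense_entries(p: int, d: int) -> int:
    """Number of stored values in a dense element matrix."""
    return (p + 1) ** (2 * d)


if __name__ == "__main__":
    d = 3
    print("degree  dense flops  factored flops  dense entries")
    for p in (1, 2, 4, 8):
        print(p, dense_apply_flops(p, d), sum_factorized_flops(p, d), dense_entries(p, d))
```

Under these assumptions the dense cost grows like `(p + 1)^(2d)` while the factored cost grows like `(p + 1)^(d + 1)`, so the gap widens quickly as the degree increases.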

The goal of libCEED is to propose such a format, as well as supporting implementations and data structures, that enable efficient operator evaluation on a variety of computational device types (CPUs, GPUs, etc.).
This new operator description is based on algebraically [factored form](https://libceed.org/en/latest/libCEEDapi/#finite-element-operator-decomposition), which is easy to incorporate in a wide variety of applications, without significant refactoring of their own discretization infrastructure.

The repository is part of the [CEED software suite](http://ceed.exascaleproject.org/software/), a collection of software benchmarks, miniapps, libraries and APIs for efficient exascale discretizations based on high-order finite element and spectral element methods.
See <http://github.com/ceed> for more information and source code availability.

The CEED research is supported by the [Exascale Computing Project](https://exascaleproject.org/exascale-computing-project) (17-SC-20-SC), a collaborative effort of two U.S. Department of Energy organizations (Office of Science and the National Nuclear Security Administration) responsible for the planning and preparation of a [capable exascale ecosystem](https://exascaleproject.org/what-is-exascale), including software, applications, hardware, advanced system engineering and early testbed platforms, in support of the nation’s exascale computing imperative.

For more details on the CEED API see the [user manual](https://libceed.org/en/latest/).

% gettingstarted-inclusion-marker

## Building

The CEED library, `libceed`, is a C99 library with no required dependencies, and with Fortran, Python, Julia, and Rust interfaces.
It can be built using:

```console
$ make
```

or, with optimization flags:

```console
$ make OPT='-O3 -march=skylake-avx512 -ffp-contract=fast'
```

These optimization flags are used by all languages (C, C++, Fortran) and this makefile variable can also be set for testing and examples (below).

The library attempts to automatically detect support for the AVX instruction set using gcc-style compiler options for the host.
Support may need to be manually specified via:

```console
$ make AVX=1
```

or:

```console
$ make AVX=0
```

if your compiler does not support gcc-style options, if you are cross compiling, etc.

To enable CUDA support, add `CUDA_DIR=/opt/cuda` or an appropriate directory to your `make` invocation.
To enable HIP support, add `HIP_DIR=/opt/rocm` or an appropriate directory.
To store these or other arguments as defaults for future invocations of `make`, use:

```console
$ make configure CUDA_DIR=/usr/local/cuda HIP_DIR=/opt/rocm OPT='-O3 -march=znver2'
```

which stores these variables in `config.mk`.

### WebAssembly

libCEED can be built for WASM using [Emscripten](https://emscripten.org). For example, one can build the library and run a standalone WASM executable using:

```console
$ emmake make build/ex2-surface.wasm
$ wasmer build/ex2-surface.wasm -- -s 200000
```

## Additional Language Interfaces

The Fortran interface is built alongside the library automatically.

Python users can install using:

```console
$ pip install libceed
```

or in a clone of the repository via `pip install .`.

Julia users can install using:

```console
$ julia
julia> ]
pkg> add LibCEED
```

See the [LibCEED.jl documentation](http://ceed.exascaleproject.org/libCEED-julia-docs/dev/) for more information.

Rust users can include libCEED via `Cargo.toml`:

```toml
[dependencies]
libceed = "0.11.0"
```

See the [Cargo documentation](https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#specifying-dependencies-from-git-repositories) for details.

## Testing

The test suite produces [TAP](https://testanything.org) output and is run by:

```console
$ make test
```

or, using the `prove` tool distributed with Perl (recommended):

```console
$ make prove
```

## Backends

There are multiple supported backends, which can be selected at runtime in the examples:

| CEED resource              | Backend                                           | Deterministic Capable |
| :---                       | :---                                              | :---:                 |
||
| **CPU Native**             |
| `/cpu/self/ref/serial`     | Serial reference implementation                   | Yes                   |
| `/cpu/self/ref/blocked`    | Blocked reference implementation                  | Yes                   |
| `/cpu/self/opt/serial`     | Serial optimized C implementation                 | Yes                   |
| `/cpu/self/opt/blocked`    | Blocked optimized C implementation                | Yes                   |
| `/cpu/self/avx/serial`     | Serial AVX implementation                         | Yes                   |
| `/cpu/self/avx/blocked`    | Blocked AVX implementation                        | Yes                   |
||
| **CPU Valgrind**           |
| `/cpu/self/memcheck/*`     | Memcheck backends, undefined value checks         | Yes                   |
||
| **CPU LIBXSMM**            |
| `/cpu/self/xsmm/serial`    | Serial LIBXSMM implementation                     | Yes                   |
| `/cpu/self/xsmm/blocked`   | Blocked LIBXSMM implementation                    | Yes                   |
||
| **CUDA Native**            |
| `/gpu/cuda/ref`            | Reference pure CUDA kernels                       | Yes                   |
| `/gpu/cuda/shared`         | Optimized pure CUDA kernels using shared memory   | Yes                   |
| `/gpu/cuda/gen`            | Optimized pure CUDA kernels using code generation | No                    |
||
| **HIP Native**             |
| `/gpu/hip/ref`             | Reference pure HIP kernels                        | Yes                   |
| `/gpu/hip/shared`          | Optimized pure HIP kernels using shared memory    | Yes                   |
| `/gpu/hip/gen`             | Optimized pure HIP kernels using code generation  | No                    |
||
| **MAGMA**                  |
| `/gpu/cuda/magma`          | CUDA MAGMA kernels                                | No                    |
| `/gpu/cuda/magma/det`      | CUDA MAGMA kernels                                | Yes                   |
| `/gpu/hip/magma`           | HIP MAGMA kernels                                 | No                    |
| `/gpu/hip/magma/det`       | HIP MAGMA kernels                                 | Yes                   |
||
| **OCCA**                   |
| `/*/occa`                  | Selects backend based on available OCCA modes     | Yes                   |
| `/cpu/self/occa`           | OCCA backend with serial CPU kernels              | Yes                   |
| `/cpu/openmp/occa`         | OCCA backend with OpenMP kernels                  | Yes                   |
| `/cpu/dpcpp/occa`          | OCCA backend with DPC++ kernels                   | Yes                   |
| `/gpu/cuda/occa`           | OCCA backend with CUDA kernels                    | Yes                   |
| `/gpu/hip/occa`            | OCCA backend with HIP kernels                     | Yes                   |

The `/cpu/self/*/serial` backends process one element at a time and are intended for meshes with a smaller number of high-order elements.
The `/cpu/self/*/blocked` backends process blocked batches of eight interlaced elements and are intended for meshes with higher numbers of elements.
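
As a rough illustration of what interlacing a batch means, the sketch below rearranges eight element vectors so that the values for the same node in each element of the batch sit next to each other. The function `interlace_block` is hypothetical and only illustrates the idea; the actual blocked backends may order, pad, or group their data differently.

```python
from typing import List, Sequence

BLOCK_SIZE = 8  # batch size quoted above for the blocked backends


def interlace_block(
    elements: Sequence[Sequence[float]], block_size: int = BLOCK_SIZE
) -> List[float]:
    """Interlace one batch of element vectors.

    The result satisfies out[node * block_size + elem] == elements[elem][node],
    so the element index varies fastest.
    """
    if len(elements) != block_size:
        raise ValueError(f"expected {block_size} elements, got {len(elements)}")
    num_nodes = len(elements[0])
    if any(len(e) != num_nodes for e in elements):
        raise ValueError("all elements must have the same number of nodes")
    return [elements[e][n] for n in range(num_nodes) for e in range(block_size)]


if __name__ == "__main__":
    # Eight elements with three nodes each; the value encodes (element, node).
    batch = [[10 * e + n for n in range(3)] for e in range(BLOCK_SIZE)]
    print(interlace_block(batch))
```

With this layout a single vector operation can act on the same node of all eight elements at once, which is why such a layout tends to pay off when there are many elements.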

The `/cpu/self/ref/*` backends are written in pure C and provide basic functionality.

The `/cpu/self/opt/*` backends are written in pure C and use partial e-vectors to improve performance.

The `/cpu/self/avx/*` backends rely upon AVX instructions to provide vectorized CPU performance.

The `/cpu/self/memcheck/*` backends rely upon the [Valgrind](http://valgrind.org/) Memcheck tool to help verify that user QFunctions have no undefined values.
To use, run your code with Valgrind and the Memcheck backends, e.g. `valgrind ./build/ex1 -ceed /cpu/self/memcheck/serial`.
A 'development' or 'debugging' version of Valgrind with headers is required to use this backend.
This backend can be run in serial or blocked mode and defaults to running in the serial mode if `/cpu/self/memcheck` is selected at runtime.

The `/cpu/self/xsmm/*` backends rely upon the [LIBXSMM](http://github.com/hfp/libxsmm) package to provide vectorized CPU performance.
If linking MKL and LIBXSMM is desired but the Makefile is not detecting `MKLROOT`, linking libCEED against MKL can be forced by setting the environment variable `MKL=1`.

The `/gpu/cuda/*` backends provide GPU performance strictly using CUDA.

The `/gpu/hip/*` backends provide GPU performance strictly using HIP.
They are based on the `/gpu/cuda/*` backends.
ROCm version 4.2 or newer is required.

The `/gpu/*/magma/*` backends rely upon the [MAGMA](https://bitbucket.org/icl/magma) package.
To enable the MAGMA backends, the environment variable `MAGMA_DIR` must point to the top-level MAGMA directory, with the MAGMA library located in `$(MAGMA_DIR)/lib/`.
By default, `MAGMA_DIR` is set to `../magma`; to build the MAGMA backends with a MAGMA installation located elsewhere, create a link to `magma/` in libCEED's parent directory, or set `MAGMA_DIR` to the proper location.
MAGMA version 2.5.0 or newer is required.
Currently, each MAGMA library installation is only built for either CUDA or HIP.
The corresponding set of libCEED backends (`/gpu/cuda/magma/*` or `/gpu/hip/magma/*`) will automatically be built for the version of the MAGMA library found in `MAGMA_DIR`.

Users can specify a device for all CUDA, HIP, and MAGMA backends by appending `:device_id=#` to the resource name.
For example:

> - `/gpu/cuda/gen:device_id=1`
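
The structure of such a resource string can be sketched as a backend path followed by optional `key=value` options after a colon. The helper `split_resource` below is hypothetical and written only to illustrate that shape; it is not libCEED's own parser, and the comma-separated handling of several options is an assumption.

```python
from typing import Dict, Tuple


def split_resource(resource: str) -> Tuple[str, Dict[str, str]]:
    """Split a resource string into its backend path and its options.

    Everything before the first ':' is treated as the backend path;
    the remainder is read as comma-separated key=value pairs.
    """
    backend, sep, options = resource.partition(":")
    parsed: Dict[str, str] = {}
    if sep:
        for item in options.split(","):
            key, eq, value = item.partition("=")
            if not key or not eq:
                raise ValueError(f"malformed option {item!r} in {resource!r}")
            # Drop surrounding whitespace and optional quotes around the value.
            parsed[key.strip()] = value.strip().strip("'\"")
    return backend, parsed


if __name__ == "__main__":
    print(split_resource("/gpu/cuda/gen:device_id=1"))
    print(split_resource("/cpu/self"))
```

Values are kept as strings here; a caller that needs an integer device index would convert `device_id` itself.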

The `/*/occa` backends rely upon the [OCCA](http://github.com/libocca/occa) package to provide cross-platform performance.
To enable the OCCA backend, the environment variable `OCCA_DIR` must point to the top-level OCCA directory, with the OCCA library located in `${OCCA_DIR}/lib` (by default, `OCCA_DIR` is set to `../occa`).
OCCA version 1.4.0 or newer is required.

Users can pass specific OCCA device properties after setting the CEED resource.
For example:

> - `"/*/occa:mode='CUDA',device_id=0"`

Bit-for-bit reproducibility is important in some applications.
However, some libCEED backends use non-deterministic operations, such as `atomicAdd`, for increased performance.
The backends which are capable of generating reproducible results, with the proper compilation options, are marked in the Deterministic Capable column of the table above.
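
Unordered accumulation can break bit-for-bit reproducibility because floating-point addition is not associative, so the order in which contributions arrive changes the rounded result. The sketch below shows this in plain Python on the host; it only illustrates the arithmetic and does not model `atomicAdd` or any libCEED backend. Sorting before summing is used here merely as one simple way to fix the order, and the function names are hypothetical.

```python
def naive_sum(values):
    """Accumulate in arrival order, as an unordered reduction would."""
    total = 0.0
    for v in values:
        total += v
    return total


def ordered_sum(values):
    """Accumulate in a fixed (sorted) order, independent of arrival order."""
    total = 0.0
    for v in sorted(values):
        total += v
    return total


# Regrouping three additions already changes the last bit.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# The same three contributions arriving in two different orders.
arrival_a = [1e16, 1.0, -1e16]
arrival_b = [1e16, -1e16, 1.0]

if __name__ == "__main__":
    print(left_to_right == right_to_left)
    print(naive_sum(arrival_a), naive_sum(arrival_b))
    print(ordered_sum(arrival_a), ordered_sum(arrival_b))
```

Fixing the order makes the result repeatable, not more accurate: both arrival orders then give the same rounded value.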

## Examples

libCEED comes with several examples of its usage, ranging from standalone C codes in the `/examples/ceed` directory to examples based on external packages, such as MFEM, PETSc, and Nek5000.
Nek5000 v18.0 or greater is required.

To build the examples, set the `MFEM_DIR`, `PETSC_DIR`, and `NEK5K_DIR` variables and run:

```console
$ cd examples/
```

% running-examples-inclusion-marker

```console
# libCEED examples on CPU and GPU
$ cd ceed/
$ make
$ ./ex1-volume -ceed /cpu/self
$ ./ex1-volume -ceed /gpu/cuda
$ ./ex2-surface -ceed /cpu/self
$ ./ex2-surface -ceed /gpu/cuda
$ cd ..

# MFEM+libCEED examples on CPU and GPU
$ cd mfem/
$ make
$ ./bp1 -ceed /cpu/self -no-vis
$ ./bp3 -ceed /gpu/cuda -no-vis
$ cd ..

# Nek5000+libCEED examples on CPU and GPU
$ cd nek/
$ make
$ ./nek-examples.sh -e bp1 -ceed /cpu/self -b 3
$ ./nek-examples.sh -e bp3 -ceed /gpu/cuda -b 3
$ cd ..

# PETSc+libCEED examples on CPU and GPU
$ cd petsc/
$ make
$ ./bps -problem bp1 -ceed /cpu/self
$ ./bps -problem bp2 -ceed /gpu/cuda
$ ./bps -problem bp3 -ceed /cpu/self
$ ./bps -problem bp4 -ceed /gpu/cuda
$ ./bps -problem bp5 -ceed /cpu/self
$ ./bps -problem bp6 -ceed /gpu/cuda
$ cd ..

$ cd petsc/
$ make
$ ./bpsraw -problem bp1 -ceed /cpu/self
$ ./bpsraw -problem bp2 -ceed /gpu/cuda
$ ./bpsraw -problem bp3 -ceed /cpu/self
$ ./bpsraw -problem bp4 -ceed /gpu/cuda
$ ./bpsraw -problem bp5 -ceed /cpu/self
$ ./bpsraw -problem bp6 -ceed /gpu/cuda
$ cd ..

$ cd petsc/
$ make
$ ./bpssphere -problem bp1 -ceed /cpu/self
$ ./bpssphere -problem bp2 -ceed /gpu/cuda
$ ./bpssphere -problem bp3 -ceed /cpu/self
$ ./bpssphere -problem bp4 -ceed /gpu/cuda
$ ./bpssphere -problem bp5 -ceed /cpu/self
$ ./bpssphere -problem bp6 -ceed /gpu/cuda
$ cd ..

$ cd petsc/
$ make
$ ./area -problem cube -ceed /cpu/self -degree 3
$ ./area -problem cube -ceed /gpu/cuda -degree 3
$ ./area -problem sphere -ceed /cpu/self -degree 3 -dm_refine 2
$ ./area -problem sphere -ceed /gpu/cuda -degree 3 -dm_refine 2
$ cd ..

$ cd fluids/
$ make
$ ./navierstokes -ceed /cpu/self -degree 1
$ ./navierstokes -ceed /gpu/cuda -degree 1
$ cd ..

$ cd solids/
$ make
$ ./elasticity -ceed /cpu/self -mesh [.exo file] -degree 2 -E 1 -nu 0.3 -problem Linear -forcing mms
$ ./elasticity -ceed /gpu/cuda -mesh [.exo file] -degree 2 -E 1 -nu 0.3 -problem Linear -forcing mms
$ cd ..
```

For the last example shown, sample meshes to be used in place of `[.exo file]` can be found at <https://github.com/jeremylt/ceedSampleMeshes>.

The above code assumes a GPU-capable machine with the CUDA backends enabled.
Depending on the available backends, other CEED resource specifiers can be provided with the `-ceed` option.
Other command line arguments can be found in [examples/petsc](https://github.com/CEED/libCEED/blob/main/examples/petsc/README.md).

% benchmarks-marker

## Benchmarks

A sequence of benchmarks for all enabled backends can be run using:

```console
$ make benchmarks
```

The results from the benchmarks are stored inside the `benchmarks/` directory and they can be viewed using the commands (requires python with matplotlib):

```console
$ cd benchmarks
$ python postprocess-plot.py petsc-bps-bp1-*-output.txt
$ python postprocess-plot.py petsc-bps-bp3-*-output.txt
```

Using the `benchmarks` target runs a comprehensive set of benchmarks which may take some time to run.
Subsets of the benchmarks can be run using the scripts in the `benchmarks` folder.

For more details about the benchmarks, see the `benchmarks/README.md` file.

## Install

To install libCEED, run:

```console
$ make install prefix=/path/to/install/dir
```

or (e.g., if creating packages):

```console
$ make install prefix=/usr DESTDIR=/packaging/path
```

To build and install in separate steps, run:

```console
$ make for_install=1 prefix=/path/to/install/dir
$ make install prefix=/path/to/install/dir
```

The usual variables like `CC` and `CFLAGS` are used, and optimization flags for all languages can be set using the likes of `OPT='-O3 -march=native'`.
Use `STATIC=1` to build static libraries (`libceed.a`).

To install libCEED for Python, run:

```console
$ pip install libceed
```

with the desired setuptools options, such as `--user`.

### pkg-config

In addition to library and header, libCEED provides a [pkg-config](https://en.wikipedia.org/wiki/Pkg-config) file that can be used to easily compile and link.
[For example](https://people.freedesktop.org/~dbn/pkg-config-guide.html#faq), if `$prefix` is a standard location or you set the environment variable `PKG_CONFIG_PATH`:

```console
$ cc `pkg-config --cflags --libs ceed` -o myapp myapp.c
```

will build `myapp` with libCEED.
This can be used with the source or installed directories.
Most build systems have support for pkg-config.

## Contact

You can reach the libCEED team by emailing [ceed-users@llnl.gov](mailto:ceed-users@llnl.gov) or by leaving a comment in the [issue tracker](https://github.com/CEED/libCEED/issues).

## How to Cite

If you use libCEED, please cite:

```bibtex
@article{libceed-joss-paper,
  author       = {Jed Brown and Ahmad Abdelfattah and Valeria Barra and Natalie Beams and Jean Sylvain Camier and Veselin Dobrev and Yohann Dudouit and Leila Ghaffari and Tzanio Kolev and David Medina and Will Pazner and Thilina Ratnayaka and Jeremy Thompson and Stan Tomov},
  title        = {{libCEED}: Fast algebra for high-order element-based discretizations},
  journal      = {Journal of Open Source Software},
  year         = {2021},
  publisher    = {The Open Journal},
  volume       = {6},
  number       = {63},
  pages        = {2945},
  doi          = {10.21105/joss.02945}
}

@misc{libceed-user-manual,
  author       = {Abdelfattah, Ahmad and
                  Barra, Valeria and
                  Beams, Natalie and
                  Brown, Jed and
                  Camier, Jean-Sylvain and
                  Dobrev, Veselin and
                  Dudouit, Yohann and
                  Ghaffari, Leila and
                  Kolev, Tzanio and
                  Medina, David and
                  Pazner, Will and
                  Ratnayaka, Thilina and
                  Shakeri, Rezgar and
                  Thompson, Jeremy L and
                  Tomov, Stanimire and
                  Wright III, James},
  title        = {{libCEED} User Manual},
  month        = dec,
  year         = 2022,
  publisher    = {Zenodo},
  version      = {0.11.0},
  doi          = {10.5281/zenodo.7480454}
}
```

For libCEED's Python interface please cite:

```bibtex
@InProceedings{libceed-paper-proc-scipy-2020,
  author    = {{V}aleria {B}arra and {J}ed {B}rown and {J}eremy {T}hompson and {Y}ohann {D}udouit},
  title     = {{H}igh-performance operator evaluations with ease of use: lib{C}{E}{E}{D}'s {P}ython interface},
  booktitle = {{P}roceedings of the 19th {P}ython in {S}cience {C}onference},
  pages     = {85 - 90},
  year      = {2020},
  editor    = {{M}eghann {A}garwal and {C}hris {C}alloway and {D}illon {N}iederhut and {D}avid {S}hupe},
  doi       = {10.25080/Majora-342d178e-00c}
}
```

The BibTeX entries for these references can be found in the `doc/bib/references.bib` file.

## Copyright

The following copyright applies to each file in the CEED software suite, unless otherwise stated in the file:

> Copyright (c) 2017, Lawrence Livermore National Security, LLC. Produced at the
> Lawrence Livermore National Laboratory. LLNL-CODE-734707. All Rights reserved.

See files LICENSE and NOTICE for details.

[github-badge]: https://github.com/CEED/libCEED/workflows/C/Fortran/badge.svg
[github-link]: https://github.com/CEED/libCEED/actions
[gitlab-badge]: https://gitlab.com/libceed/libCEED/badges/main/pipeline.svg?key_text=GitLab-CI
[gitlab-link]: https://gitlab.com/libceed/libCEED/-/pipelines?page=1&scope=all&ref=main
[codecov-badge]: https://codecov.io/gh/CEED/libCEED/branch/main/graphs/badge.svg
[codecov-link]: https://codecov.io/gh/CEED/libCEED/
[license-badge]: https://img.shields.io/badge/License-BSD%202--Clause-orange.svg
[license-link]: https://opensource.org/licenses/BSD-2-Clause
[doc-badge]: https://readthedocs.org/projects/libceed/badge/?version=latest
[doc-link]: https://libceed.org/en/latest/?badge=latest
[joss-badge]: https://joss.theoj.org/papers/10.21105/joss.02945/status.svg
[joss-link]: https://doi.org/10.21105/joss.02945
[binder-badge]: http://mybinder.org/badge_logo.svg
[binder-link]: https://mybinder.org/v2/gh/CEED/libCEED/main?urlpath=lab/tree/examples/python/tutorial-0-ceed.ipynb
[zenodo-badge]: https://zenodo.org/badge/DOI/10.5281/zenodo.svg
[zenodo-link]: https://doi.org/10.5281/zenodo.4302736