# Getting Started

## Building

The CEED library, `libceed`, is a C99 library with no external dependencies. The library
has Fortran and Python interfaces; see `interface/ceed-fortran.c` and
`interface/ceed-python/`. It can be built using

    make

or, with optimization flags

    make OPT='-O3 -march=skylake-avx512 -ffp-contract=fast'

These optimization flags are used by all languages (C, C++, Fortran) and this
makefile variable can also be set for testing and examples (below).

The library attempts to automatically detect support for the AVX
instruction set using gcc-style compiler options for the host.
Support may need to be manually specified via

    make AVX=1

or

    make AVX=0

if your compiler does not support gcc-style options, if you are cross
compiling, etc.

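For native builds where the compiler-based detection is not usable, the setting could instead be derived from the host's CPU feature list. The following is a minimal sketch, assuming a Linux host whose `/proc/cpuinfo` lists CPU flags; the helper name `avx_make_flag` is hypothetical and not part of libCEED, and this approach does not apply when cross compiling.

```sh
#!/bin/sh
# Hypothetical helper (not part of libCEED): read cpuinfo-style text on stdin
# and print the make variable assignment for the AVX setting.
avx_make_flag() {
  # -w matches "avx" as a whole word, so "avx2" alone does not count.
  if grep -qw 'avx'; then
    echo 'AVX=1'
  else
    echo 'AVX=0'
  fi
}

# Example use on Linux (assumes /proc/cpuinfo exists):
#   make "$(avx_make_flag < /proc/cpuinfo)"
```

Reading from standard input keeps the helper easy to check against sample text before pointing it at the real `/proc/cpuinfo`.
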
## Testing

The test suite produces [TAP](https://testanything.org) output and is run by:

    make test

or, using the `prove` tool distributed with Perl (recommended)

    make prove

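TAP output is line oriented: each test reports a line beginning with `ok` or `not ok`. As a rough illustration of what a consumer such as `prove` reads, the sketch below tallies those lines from standard input. The function name `tap_summary` is made up for this example, and piping `make test` into it assumes the target writes its TAP lines to standard output.

```sh
#!/bin/sh
# Hypothetical helper: count passing and failing TAP test lines read on stdin.
tap_summary() {
  awk '
    /^ok( |$)/     { passed++ }   # e.g. "ok 1 - description"
    /^not ok( |$)/ { failed++ }   # e.g. "not ok 2 - description"
    END { printf "passed=%d failed=%d\n", passed, failed }
  '
}

# Example use (assumes TAP lines arrive on stdout):
#   make test | tap_summary
```
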
## Backends

There are multiple supported backends, which can be selected at runtime in the examples:

| CEED resource            | Backend                                           |
| :----------------------- | :------------------------------------------------ |
| `/cpu/self/ref/serial`   | Serial reference implementation                   |
| `/cpu/self/ref/blocked`  | Blocked reference implementation                  |
| `/cpu/self/memcheck`     | Memcheck backend, undefined value checks          |
| `/cpu/self/opt/serial`   | Serial optimized C implementation                 |
| `/cpu/self/opt/blocked`  | Blocked optimized C implementation                |
| `/cpu/self/avx/serial`   | Serial AVX implementation                         |
| `/cpu/self/avx/blocked`  | Blocked AVX implementation                        |
| `/cpu/self/xsmm/serial`  | Serial LIBXSMM implementation                     |
| `/cpu/self/xsmm/blocked` | Blocked LIBXSMM implementation                    |
| `/cpu/occa`              | Serial OCCA kernels                               |
| `/gpu/occa`              | CUDA OCCA kernels                                 |
| `/omp/occa`              | OpenMP OCCA kernels                               |
| `/ocl/occa`              | OpenCL OCCA kernels                               |
| `/gpu/cuda/ref`          | Reference pure CUDA kernels                       |
| `/gpu/cuda/reg`          | Pure CUDA kernels using one thread per element    |
| `/gpu/cuda/shared`       | Optimized pure CUDA kernels using shared memory   |
| `/gpu/cuda/gen`          | Optimized pure CUDA kernels using code generation |
| `/gpu/magma`             | CUDA MAGMA kernels                                |

The `/cpu/self/*/serial` backends process one element at a time and are intended for meshes
with a smaller number of high-order elements. The `/cpu/self/*/blocked` backends process
blocked batches of eight interlaced elements and are intended for meshes with larger numbers
of elements.

The `/cpu/self/ref/*` backends are written in pure C and provide basic functionality.

The `/cpu/self/opt/*` backends are written in pure C and use partial e-vectors to improve performance.

The `/cpu/self/avx/*` backends rely upon AVX instructions to provide vectorized CPU performance.

The `/cpu/self/xsmm/*` backends rely upon the [LIBXSMM](http://github.com/hfp/libxsmm) package
to provide vectorized CPU performance. If linking MKL and LIBXSMM is desired but
the Makefile is not detecting `MKLROOT`, linking libCEED against MKL can be
forced by setting the environment variable `MKL=1`.

The `/cpu/self/memcheck/*` backends rely upon the [Valgrind](http://valgrind.org/) Memcheck tool
to help verify that user QFunctions have no undefined values. To use them, run your code with
Valgrind and a Memcheck backend, e.g. `valgrind ./build/ex1 -ceed /cpu/self/memcheck`. A
'development' or 'debugging' version of Valgrind with headers is required to use this backend.
This backend can be run in serial or blocked mode and defaults to serial mode
if `/cpu/self/memcheck` is selected at runtime.

The `/*/occa` backends rely upon the [OCCA](http://github.com/libocca/occa) package to provide
cross-platform performance.

The `/gpu/cuda/*` backends provide GPU performance strictly using CUDA.

The `/gpu/magma` backend relies upon the [MAGMA](https://bitbucket.org/icl/magma) package.

## Examples

libCEED comes with several examples of its usage, ranging from standalone C
codes in the `/examples/ceed` directory to examples based on external packages,
such as MFEM, PETSc, and Nek5000. Nek5000 v18.0 or greater is required.

To build the examples, set the `MFEM_DIR`, `PETSC_DIR` and `NEK5K_DIR` variables
and run:

```console
# libCEED examples on CPU and GPU
cd examples/ceed
make
./ex1-volume -ceed /cpu/self
./ex1-volume -ceed /gpu/occa
./ex2-surface -ceed /cpu/self
./ex2-surface -ceed /gpu/occa
cd ../..

# MFEM+libCEED examples on CPU and GPU
cd examples/mfem
make
./bp1 -ceed /cpu/self -no-vis
./bp3 -ceed /gpu/occa -no-vis
cd ../..

# Nek5000+libCEED examples on CPU and GPU
cd examples/nek
make
./nek-examples.sh -e bp1 -ceed /cpu/self -b 3
./nek-examples.sh -e bp3 -ceed /gpu/occa -b 3
cd ../..

# PETSc+libCEED examples on CPU and GPU
cd examples/petsc
make
./bps -problem bp1 -ceed /cpu/self
./bps -problem bp2 -ceed /gpu/occa
./bps -problem bp3 -ceed /cpu/self
./bps -problem bp4 -ceed /gpu/occa
./bps -problem bp5 -ceed /cpu/self
./bps -problem bp6 -ceed /gpu/occa
cd ../..

# PETSc+libCEED area examples on CPU and GPU (built by the previous step)
cd examples/petsc
./area -problem cube -ceed /cpu/self -petscspace_degree 3
./area -problem cube -ceed /gpu/occa -petscspace_degree 3
./area -problem sphere -ceed /cpu/self -petscspace_degree 3 -dm_refine 2
./area -problem sphere -ceed /gpu/occa -petscspace_degree 3 -dm_refine 2
cd ../..

# Navier-Stokes example on CPU and GPU
cd examples/navier-stokes
make
./navierstokes -ceed /cpu/self
./navierstokes -ceed /gpu/occa
cd ../..
```

The commands above assume a GPU-capable machine with the OCCA backend
enabled. Depending on the available backends, other CEED resource specifiers can
be provided with the `-ceed` option.

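Because the backend is chosen purely by the `-ceed` argument, one executable can be run against several resources in a loop. The sketch below is one way to do that; the helper name `run_on_backends` is hypothetical, and which resources actually work depends on how libCEED was built.

```sh
#!/bin/sh
# Hypothetical helper: run one example executable against several CEED
# resources and report the ones that fail (e.g. a backend that is not enabled).
run_on_backends() {
  exe=$1
  shift
  for resource in "$@"; do
    echo "== $resource"
    "$exe" -ceed "$resource" || echo "FAILED: $resource"
  done
}

# Example use from examples/ceed, with resources from the Backends table:
#   run_on_backends ./ex1-volume /cpu/self/ref/serial /cpu/self/opt/blocked
```

A failing resource is reported and the loop continues, so a missing GPU backend does not hide the results for the CPU backends.
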
## Benchmarks

A sequence of benchmarks for all enabled backends can be run using

```console
make benchmarks
```

The results from the benchmarks are stored inside the `benchmarks/` directory
and they can be viewed using the commands (requires python with matplotlib):

```console
cd benchmarks
python postprocess-plot.py petsc-bps-bp1-*-output.txt
python postprocess-plot.py petsc-bps-bp3-*-output.txt
```

Using the `benchmarks` target runs a comprehensive set of benchmarks which may
take some time to run. Subsets of the benchmarks can be run using the scripts in the `benchmarks` folder.

For more details about the benchmarks, see the `benchmarks/README.md` file.

## Install

To install libCEED, run

    make install prefix=/usr/local

or (e.g., if creating packages),

    make install prefix=/usr DESTDIR=/packaging/path

Note that along with the library, libCEED installs kernel sources, e.g. OCCA
kernels are installed in `$prefix/lib/okl`. This allows the OCCA backend to
build specialized kernels at run-time. In a normal setting, the kernel sources
will be found automatically (relative to the library file `libceed.so`).
However, if that fails (e.g. if `libceed.so` is moved), one can copy (cache) the
kernel sources inside the user OCCA directory, `~/.occa`, using

    $(OCCA_DIR)/bin/occa cache ceed $(CEED_DIR)/lib/okl/*.okl

This will allow OCCA to find the sources regardless of the location of the CEED
library. One may occasionally need to clear the OCCA cache, which can be accomplished
by removing the `~/.occa` directory or by calling `$(OCCA_DIR)/bin/occa clear -a`.

To install libCEED for Python, run

    python setup.py build install

with the desired setuptools options, such as `--user`.

Alternatively, if libCEED is installed in the directory specified by the
environment variable `CEED_DIR`, then run

    pip install .

### pkg-config

In addition to the library and header, libCEED provides a [pkg-config][pkg-config1]
file that can be used to easily compile and link. [For example][pkg-config2], if
`$prefix` is a standard location or you set the environment variable
`PKG_CONFIG_PATH`,

    cc `pkg-config --cflags --libs ceed` -o myapp myapp.c

will build `myapp` with libCEED. This can be used with the source or
installed directories. Most build systems have support for pkg-config.

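When `$prefix` is not a standard location, `pkg-config` needs to be told where to look through `PKG_CONFIG_PATH`. A minimal sketch follows. It assumes the `.pc` file ends up in a `lib/pkgconfig` directory under the prefix, which should be checked against the actual install tree; the helper name `add_pkg_config_dir` and the `/opt/ceed` prefix are illustrative only.

```sh
#!/bin/sh
# Hypothetical helper: prepend a directory to PKG_CONFIG_PATH unless it is
# already listed, then export the variable for later pkg-config calls.
add_pkg_config_dir() {
  case ":${PKG_CONFIG_PATH:-}:" in
    *":$1:"*) ;;  # already present, nothing to do
    *) PKG_CONFIG_PATH="$1${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}" ;;
  esac
  export PKG_CONFIG_PATH
}

# Example use (the /opt/ceed prefix and pkgconfig subdirectory are assumptions):
#   add_pkg_config_dir /opt/ceed/lib/pkgconfig
#   cc `pkg-config --cflags --libs ceed` -o myapp myapp.c
```
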
## Contact

You can reach the libCEED team by emailing [ceed-users@llnl.gov](mailto:ceed-users@llnl.gov)
or by leaving a comment in the [issue tracker](https://github.com/CEED/libCEED/issues).

## Copyright

The following copyright applies to each file in the CEED software suite, unless
otherwise stated in the file:

> Copyright (c) 2017, Lawrence Livermore National Security, LLC. Produced at the
> Lawrence Livermore National Laboratory. LLNL-CODE-734707. All Rights reserved.

See files LICENSE and NOTICE for details.

[ceed-soft]:   http://ceed.exascaleproject.org/software/
[ecp]:         https://exascaleproject.org/exascale-computing-project
[pkg-config1]: https://en.wikipedia.org/wiki/Pkg-config
[pkg-config2]: https://people.freedesktop.org/~dbn/pkg-config-guide.html#faq