# Getting Started

## Building

The CEED library, `libceed`, is a C99 library with no required dependencies, and
with Fortran and Python interfaces.  It can be built using

    make

or, with optimization flags

    make OPT='-O3 -march=skylake-avx512 -ffp-contract=fast'

These optimization flags are used by all languages (C, C++, Fortran) and this
makefile variable can also be set for testing and examples (below).
Python users can install using

    pip install libceed

or in a clone of the repository via `pip install .`.

The library attempts to automatically detect support for the AVX
instruction set using gcc-style compiler options for the host.
Support may need to be manually specified via

    make AVX=1

or

    make AVX=0

if your compiler does not support gcc-style options, if you are
cross-compiling, etc.
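
When deciding which of the two settings to pass by hand, it can help to check what the build host itself reports. The following is a minimal sketch, assuming a Linux host that exposes `/proc/cpuinfo`; it is not how the makefile performs its own detection, and the function names `avx_make_flag` and `read_cpuinfo` are made up for illustration.

```python
def avx_make_flag(cpuinfo_text):
    """Return 'AVX=1' if a 'flags' line lists the avx feature, else 'AVX=0'."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Feature names are whitespace-separated tokens on the 'flags' line.
        if key.strip() == "flags" and "avx" in value.split():
            return "AVX=1"
    return "AVX=0"


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the CPU description; return '' where the file is unavailable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    # Prints a suggested command line for the machine running this script.
    print("make", avx_make_flag(read_cpuinfo()))
```

This only describes the machine running the script, so when cross-compiling the target's capabilities still have to be supplied by hand.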

## Testing

The test suite produces [TAP](https://testanything.org) output and is run by:

    make test

or, using the `prove` tool distributed with Perl (recommended)

    make prove
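
Because TAP is a plain line-oriented format, the output of `make test` can also be summarized with a few lines of script. The sketch below counts `ok` and `not ok` result lines; the sample test names are hypothetical and the parser ignores everything else in the stream (plan lines, diagnostics), so it is a rough stand-in for `prove`, not a replacement.

```python
import re

# A TAP result line starts with "ok" or "not ok"; try the longer form first.
_RESULT = re.compile(r"^(not ok|ok)\b")


def summarize_tap(text):
    """Return (passed, failed) counts for the result lines in TAP output."""
    passed = failed = 0
    for line in text.splitlines():
        match = _RESULT.match(line)
        if match is None:
            continue  # plan lines ("1..N"), comments, and diagnostics
        if match.group(1) == "ok":
            passed += 1  # note: skipped tests are reported as "ok" in TAP
        else:
            failed += 1
    return passed, failed


# Hypothetical output; real test names will differ.
sample = """1..3
ok 1 - first-test
not ok 2 - second-test
ok 3 - third-test # SKIP backend not available
"""
print(summarize_tap(sample))  # (2, 1)
```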

## Backends

There are multiple supported backends, which can be selected at runtime in the examples:

| CEED resource            | Backend                                           |
| :----------------------- | :------------------------------------------------ |
| `/cpu/self/ref/serial`   | Serial reference implementation                   |
| `/cpu/self/ref/blocked`  | Blocked reference implementation                  |
| `/cpu/self/memcheck`     | Memcheck backend, undefined value checks          |
| `/cpu/self/opt/serial`   | Serial optimized C implementation                 |
| `/cpu/self/opt/blocked`  | Blocked optimized C implementation                |
| `/cpu/self/avx/serial`   | Serial AVX implementation                         |
| `/cpu/self/avx/blocked`  | Blocked AVX implementation                        |
| `/cpu/self/xsmm/serial`  | Serial LIBXSMM implementation                     |
| `/cpu/self/xsmm/blocked` | Blocked LIBXSMM implementation                    |
| `/cpu/occa`              | Serial OCCA kernels                               |
| `/gpu/occa`              | CUDA OCCA kernels                                 |
| `/omp/occa`              | OpenMP OCCA kernels                               |
| `/ocl/occa`              | OpenCL OCCA kernels                               |
| `/gpu/cuda/ref`          | Reference pure CUDA kernels                       |
| `/gpu/cuda/reg`          | Pure CUDA kernels using one thread per element    |
| `/gpu/cuda/shared`       | Optimized pure CUDA kernels using shared memory   |
| `/gpu/cuda/gen`          | Optimized pure CUDA kernels using code generation |
| `/gpu/magma`             | CUDA MAGMA kernels                                |

The `/cpu/self/*/serial` backends process one element at a time and are intended for meshes
with a smaller number of high-order elements. The `/cpu/self/*/blocked` backends process
blocked batches of eight interlaced elements and are intended for meshes with higher numbers
of elements.
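
To make "blocked batches of eight interlaced elements" concrete, the sketch below interlaces the data of eight elements so that the values for one node of all eight elements are contiguous. The exact memory layout used inside libCEED is not described here, so the index formula `node * 8 + element` is an assumption chosen for illustration, as is the helper name `interlace_block`.

```python
BLOCK = 8  # batch size described above for the blocked backends


def interlace_block(elements):
    """Interlace BLOCK per-element value lists into one flat list.

    Assumed layout: the value for node i of element e lands at i * BLOCK + e.
    """
    if len(elements) != BLOCK:
        raise ValueError("expected exactly %d elements" % BLOCK)
    nodes = len(elements[0])
    flat = [0] * (nodes * BLOCK)
    for e, values in enumerate(elements):
        for i, value in enumerate(values):
            flat[i * BLOCK + e] = value
    return flat


# Eight elements with two nodes each; element e holds [10*e, 10*e + 1].
elements = [[10 * e + i for i in range(2)] for e in range(BLOCK)]
interlaced = interlace_block(elements)
print(interlaced[:BLOCK])  # node 0 of every element: [0, 10, 20, 30, 40, 50, 60, 70]
print(interlaced[BLOCK:])  # node 1 of every element: [1, 11, 21, 31, 41, 51, 61, 71]
```

With a layout like this, one vector instruction can apply the same operation to the same node of several elements at once, which is presumably why batching pays off when there are many elements.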

The `/cpu/self/ref/*` backends are written in pure C and provide basic functionality.

The `/cpu/self/opt/*` backends are written in pure C and use partial e-vectors to improve performance.

The `/cpu/self/avx/*` backends rely upon AVX instructions to provide vectorized CPU performance.

The `/cpu/self/xsmm/*` backends rely upon the [LIBXSMM](http://github.com/hfp/libxsmm) package
to provide vectorized CPU performance. If linking MKL and LIBXSMM is desired but
the Makefile is not detecting `MKLROOT`, linking libCEED against MKL can be
forced by setting the environment variable `MKL=1`.

The `/cpu/self/memcheck/*` backends rely upon the [Valgrind](http://valgrind.org/) Memcheck tool
to help verify that user QFunctions have no undefined values. To use, run your code with
Valgrind and the Memcheck backends, e.g. `valgrind ./build/ex1 -ceed /cpu/self/memcheck`. A
'development' or 'debugging' version of Valgrind with headers is required to use this backend.
This backend can be run in serial or blocked mode and defaults to running in the serial mode
if `/cpu/self/memcheck` is selected at runtime.

The `/*/occa` backends rely upon the [OCCA](http://github.com/libocca/occa) package to provide
cross-platform performance.

The `/gpu/cuda/*` backends provide GPU performance strictly using CUDA.

The `/gpu/magma` backend relies upon the [MAGMA](https://bitbucket.org/icl/magma) package.

## Examples

libCEED comes with several examples of its usage, ranging from standalone C
codes in the `/examples/ceed` directory to examples based on external packages,
such as MFEM, PETSc, and Nek5000. Nek5000 v18.0 or greater is required.

To build the examples, set the `MFEM_DIR`, `PETSC_DIR` and `NEK5K_DIR` variables
and run:

```console
# libCEED examples on CPU and GPU
cd examples/ceed
make
./ex1-volume -ceed /cpu/self
./ex1-volume -ceed /gpu/occa
./ex2-surface -ceed /cpu/self
./ex2-surface -ceed /gpu/occa
cd ../..

# MFEM+libCEED examples on CPU and GPU
cd examples/mfem
make
./bp1 -ceed /cpu/self -no-vis
./bp3 -ceed /gpu/occa -no-vis
cd ../..

# Nek5000+libCEED examples on CPU and GPU
cd examples/nek
make
./nek-examples.sh -e bp1 -ceed /cpu/self -b 3
./nek-examples.sh -e bp3 -ceed /gpu/occa -b 3
cd ../..

# PETSc+libCEED examples on CPU and GPU
cd examples/petsc
make
./bps -problem bp1 -ceed /cpu/self
./bps -problem bp2 -ceed /gpu/occa
./bps -problem bp3 -ceed /cpu/self
./bps -problem bp4 -ceed /gpu/occa
./bps -problem bp5 -ceed /cpu/self
./bps -problem bp6 -ceed /gpu/occa
cd ../..

# Surface area examples on CPU and GPU (built by the PETSc step above)
cd examples/petsc
./area -problem cube -ceed /cpu/self -petscspace_degree 3
./area -problem cube -ceed /gpu/occa -petscspace_degree 3
./area -problem sphere -ceed /cpu/self -petscspace_degree 3 -dm_refine 2
./area -problem sphere -ceed /gpu/occa -petscspace_degree 3 -dm_refine 2
cd ../..

# Navier-Stokes examples on CPU and GPU
cd examples/navier-stokes
make
./navierstokes -ceed /cpu/self
./navierstokes -ceed /gpu/occa
cd ../..
```

The above code assumes a GPU-capable machine with the OCCA backend
enabled. Depending on the available backends, other Ceed resource specifiers can
be provided with the `-ceed` option.
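
When trying the same example against several resource specifiers, the command lines can be generated instead of typed. The following is a small sketch that only builds argument lists in the `-ceed <resource>` form used above; the helper name `example_commands` is made up, and nothing is executed until the lists are handed to something like `subprocess.run`.

```python
import shlex


def example_commands(executable, resources, extra_args=()):
    """Build one argument list per resource: <executable> -ceed <resource> ..."""
    return [[executable, "-ceed", resource, *extra_args] for resource in resources]


# Two CPU resources from the backend table; substitute whichever are enabled.
commands = example_commands(
    "./ex1-volume", ["/cpu/self/ref/serial", "/cpu/self/opt/blocked"]
)
for command in commands:
    print(shlex.join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` from within `examples/ceed` once the examples have been built.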

## Benchmarks

A sequence of benchmarks for all enabled backends can be run using

```console
make benchmarks
```

The results from the benchmarks are stored inside the `benchmarks/` directory
and they can be viewed using the commands (requires Python with matplotlib):

```console
cd benchmarks
python postprocess-plot.py petsc-bps-bp1-*-output.txt
python postprocess-plot.py petsc-bps-bp3-*-output.txt
```

Using the `benchmarks` target runs a comprehensive set of benchmarks which may
take some time to run. Subsets of the benchmarks can be run using the scripts in the `benchmarks` folder.

For more details about the benchmarks, see the `benchmarks/README.md` file.

## Install

To install libCEED, run

    make install prefix=/usr/local

or (e.g., if creating packages),

    make install prefix=/usr DESTDIR=/packaging/path
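
The difference between the two variables is easiest to see by computing where a file ends up. The sketch below assumes the usual convention that `DESTDIR` is prepended to `prefix` at install time while `prefix` remains the final run-time location; whether every install rule in the libCEED makefile follows this exactly is an assumption, and `staged_path` is a hypothetical helper.

```python
import posixpath


def staged_path(prefix, relative, destdir=""):
    """Return where a file is written during 'make install'.

    prefix:   final installation root, e.g. /usr
    relative: path below the prefix, e.g. lib/okl
    destdir:  optional staging root prepended for packaging
    """
    installed = posixpath.join(prefix, relative)
    if not destdir:
        return installed
    # DESTDIR is a plain string prefix, so avoid a doubled slash at the seam.
    return destdir.rstrip("/") + installed


print(staged_path("/usr/local", "lib/okl"))
print(staged_path("/usr", "lib/okl", destdir="/packaging/path"))
```

In the packaging case the files are staged under `/packaging/path/usr/...`, but the installed software still expects to live under `/usr`.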

Note that along with the library, libCEED installs kernel sources, e.g. OCCA
kernels are installed in `$prefix/lib/okl`. This allows the OCCA backend to
build specialized kernels at run-time. In a normal setting, the kernel sources
will be found automatically (relative to the library file `libceed.so`).
However, if that fails (e.g. if `libceed.so` is moved), one can copy (cache) the
kernel sources inside the user OCCA directory, `~/.occa`, using

    $(OCCA_DIR)/bin/occa cache ceed $(CEED_DIR)/lib/okl/*.okl

This will allow OCCA to find the sources regardless of the location of the CEED
library. One may occasionally need to clear the OCCA cache, which can be accomplished
by removing the `~/.occa` directory or by calling `$(OCCA_DIR)/bin/occa clear -a`.

To install libCEED for Python, run

    pip install .

with the desired setuptools options, such as `--user`.

### pkg-config

In addition to the library and header, libCEED provides a [pkg-config][pkg-config1]
file that can be used to easily compile and link. [For example][pkg-config2], if
`$prefix` is a standard location or you set the environment variable
`PKG_CONFIG_PATH`,

    cc `pkg-config --cflags --libs ceed` -o myapp myapp.c

will build `myapp` with libCEED.  This can be used with the source or
installed directories.  Most build systems have support for pkg-config.
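
For build scripts that are not shell-based, the same query can be made programmatically. The following sketch runs `pkg-config --cflags --libs ceed` when the tool is on the `PATH` and assembles the compile command shown above as an argument list; the helper names are made up for illustration, and the flags in the usage example are placeholders, not what a given installation will report.

```python
import shlex
import shutil
import subprocess


def pkg_config_flags(package="ceed"):
    """Return the flags reported by pkg-config, or None if they are unavailable."""
    if shutil.which("pkg-config") is None:
        return None
    result = subprocess.run(
        ["pkg-config", "--cflags", "--libs", package],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None  # e.g. the .pc file is not on PKG_CONFIG_PATH
    return shlex.split(result.stdout)


def compile_command(flags, source="myapp.c", output="myapp", compiler="cc"):
    """Mirror the command line above: cc <flags> -o myapp myapp.c"""
    return [compiler, *flags, "-o", output, source]


# Placeholder flags for illustration only.
print(shlex.join(compile_command(["-I/usr/local/include", "-L/usr/local/lib", "-lceed"])))
```

If `pkg_config_flags()` returns `None`, setting `PKG_CONFIG_PATH` as described above is the first thing to check.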

## Contact

You can reach the libCEED team by emailing [ceed-users@llnl.gov](mailto:ceed-users@llnl.gov)
or by leaving a comment in the [issue tracker](https://github.com/CEED/libCEED/issues).

## Copyright

The following copyright applies to each file in the CEED software suite, unless
otherwise stated in the file:

> Copyright (c) 2017, Lawrence Livermore National Security, LLC. Produced at the
> Lawrence Livermore National Laboratory. LLNL-CODE-734707. All Rights reserved.

See files LICENSE and NOTICE for details.

[ceed-soft]:   http://ceed.exascaleproject.org/software/
[ecp]:         https://exascaleproject.org/exascale-computing-project
[pkg-config1]: https://en.wikipedia.org/wiki/Pkg-config
[pkg-config2]: https://people.freedesktop.org/~dbn/pkg-config-guide.html#faq
245