# Getting Started

## Building

The CEED library, `libceed`, is a C99 library with no external dependencies. The library
has Fortran and Python interfaces; see `interface/ceed-fortran.c` and
`interface/ceed-python/`. It can be built using

    make

or, with optimization flags

    make OPT='-O3 -march=skylake-avx512 -ffp-contract=fast'

These optimization flags are used by all languages (C, C++, Fortran) and this
makefile variable can also be set for testing and examples (below).

The library attempts to automatically detect support for the AVX
instruction set using gcc-style compiler options for the host.
Support may need to be manually specified via

    make AVX=1

or

    make AVX=0

if your compiler does not support gcc-style options, if you are
cross-compiling, etc.

## Testing

The test suite produces [TAP](https://testanything.org) output and is run by:

    make test

or, using the `prove` tool distributed with Perl (recommended)

    make prove

## Backends

There are multiple supported backends, which can be selected at runtime in the examples:

| CEED resource            | Backend                                           |
| :----------------------- | :------------------------------------------------ |
| `/cpu/self/ref/serial`   | Serial reference implementation                   |
| `/cpu/self/ref/blocked`  | Blocked reference implementation                  |
| `/cpu/self/memcheck`     | Memcheck backend, undefined value checks          |
| `/cpu/self/opt/serial`   | Serial optimized C implementation                 |
| `/cpu/self/opt/blocked`  | Blocked optimized C implementation                |
| `/cpu/self/avx/serial`   | Serial AVX implementation                         |
| `/cpu/self/avx/blocked`  | Blocked AVX implementation                        |
| `/cpu/self/xsmm/serial`  | Serial LIBXSMM implementation                     |
| `/cpu/self/xsmm/blocked` | Blocked LIBXSMM implementation                    |
| `/cpu/occa`              | Serial OCCA kernels                               |
| `/gpu/occa`              | CUDA OCCA kernels                                 |
| `/omp/occa`              | OpenMP OCCA kernels                               |
| `/ocl/occa`              | OpenCL OCCA kernels                               |
| `/gpu/cuda/ref`          | Reference pure CUDA kernels                       |
| `/gpu/cuda/reg`          | Pure CUDA kernels using one thread per element    |
| `/gpu/cuda/shared`       | Optimized pure CUDA kernels using shared memory   |
| `/gpu/cuda/gen`          | Optimized pure CUDA kernels using code generation |
| `/gpu/magma`             | CUDA MAGMA kernels                                |

The `/cpu/self/*/serial` backends process one element at a time and are intended for meshes
with a smaller number of high-order elements. The `/cpu/self/*/blocked` backends process
blocked batches of eight interlaced elements and are intended for meshes with higher numbers
of elements.

The `/cpu/self/ref/*` backends are written in pure C and provide basic functionality.

The `/cpu/self/opt/*` backends are written in pure C and use partial e-vectors to improve performance.

The `/cpu/self/avx/*` backends rely upon AVX instructions to provide vectorized CPU performance.

The `/cpu/self/xsmm/*` backends rely upon the [LIBXSMM](http://github.com/hfp/libxsmm) package
to provide vectorized CPU performance. If linking MKL and LIBXSMM is desired but
the Makefile is not detecting `MKLROOT`, linking libCEED against MKL can be
forced by setting the environment variable `MKL=1`.

The `/cpu/self/memcheck/*` backends rely upon the [Valgrind](http://valgrind.org/) Memcheck tool
to help verify that user QFunctions have no undefined values. To use, run your code with
Valgrind and the Memcheck backends, e.g. `valgrind ./build/ex1 -ceed /cpu/self/memcheck`. A
'development' or 'debugging' version of Valgrind with headers is required to use this backend.
This backend can be run in serial or blocked mode and defaults to running in the serial mode
if `/cpu/self/memcheck` is selected at runtime.

The `/*/occa` backends rely upon the [OCCA](http://github.com/libocca/occa) package to provide
cross-platform performance.

The `/gpu/cuda/*` backends provide GPU performance strictly using CUDA.

The `/gpu/magma` backend relies upon the [MAGMA](https://bitbucket.org/icl/magma) package.

## Examples

libCEED comes with several examples of its usage, ranging from standalone C
codes in the `/examples/ceed` directory to examples based on external packages,
such as MFEM, PETSc, and Nek5000. Nek5000 v18.0 or greater is required.

To build the examples, set the `MFEM_DIR`, `PETSC_DIR` and `NEK5K_DIR` variables
and run:

```console
# libCEED examples on CPU and GPU
cd examples/ceed
make
./ex1-volume -ceed /cpu/self
./ex1-volume -ceed /gpu/occa
./ex2-surface -ceed /cpu/self
./ex2-surface -ceed /gpu/occa
cd ../..

# MFEM+libCEED examples on CPU and GPU
cd examples/mfem
make
./bp1 -ceed /cpu/self -no-vis
./bp3 -ceed /gpu/occa -no-vis
cd ../..

# Nek5000+libCEED examples on CPU and GPU
cd examples/nek
make
./nek-examples.sh -e bp1 -ceed /cpu/self -b 3
./nek-examples.sh -e bp3 -ceed /gpu/occa -b 3
cd ../..

# PETSc+libCEED examples on CPU and GPU
cd examples/petsc
make
./bps -problem bp1 -ceed /cpu/self
./bps -problem bp2 -ceed /gpu/occa
./bps -problem bp3 -ceed /cpu/self
./bps -problem bp4 -ceed /gpu/occa
./bps -problem bp5 -ceed /cpu/self
./bps -problem bp6 -ceed /gpu/occa
cd ../..

# PETSc+libCEED area examples on CPU and GPU (built by the previous step)
cd examples/petsc
./area -problem cube -ceed /cpu/self -petscspace_degree 3
./area -problem cube -ceed /gpu/occa -petscspace_degree 3
./area -problem sphere -ceed /cpu/self -petscspace_degree 3 -dm_refine 2
./area -problem sphere -ceed /gpu/occa -petscspace_degree 3 -dm_refine 2
cd ../..

# Navier-Stokes example on CPU and GPU
cd examples/navier-stokes
make
./navierstokes -ceed /cpu/self
./navierstokes -ceed /gpu/occa
cd ../..
```

The above code assumes a GPU-capable machine with the OCCA backend
enabled. Depending on the available backends, other CEED resource specifiers can
be provided with the `-ceed` option.

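Because the set of usable resources depends on how libCEED was built, it can be convenient to try one example against several resource specifiers in a loop. The following is a minimal shell sketch, not something shipped with libCEED: the function name `run_with_resources` and the `DRY_RUN` variable are hypothetical, and the resource strings are simply taken from the commands and table in this document.

```shell
#!/bin/sh
# Hypothetical helper (not part of libCEED): run one example binary against
# each listed CEED resource. With DRY_RUN=1 the commands are only printed.
run_with_resources() {
  example=$1
  shift
  for resource in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Print the command line instead of executing it
      echo "$example -ceed $resource"
    else
      # Keep going if one resource is unavailable in this build
      "$example" -ceed "$resource" || echo "failed: $resource" >&2
    fi
  done
}

# Resource names come from the backend table; adjust to the backends
# that are enabled in your build.
DRY_RUN=1
run_with_resources ./ex1-volume /cpu/self/ref/serial /cpu/self/opt/blocked
```

Unsetting `DRY_RUN` (or setting it to `0`) and calling the function from `examples/ceed` would run the example once per resource and report any resource that fails on standard error.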

## Benchmarks

A sequence of benchmarks for all enabled backends can be run using

```console
make benchmarks
```

The results from the benchmarks are stored inside the `benchmarks/` directory
and they can be viewed using the commands (requires Python with matplotlib):

```console
cd benchmarks
python postprocess-plot.py petsc-bps-bp1-*-output.txt
python postprocess-plot.py petsc-bps-bp3-*-output.txt
```

Using the `benchmarks` target runs a comprehensive set of benchmarks which may
take some time to run. Subsets of the benchmarks can be run using the scripts in the `benchmarks` folder.

For more details about the benchmarks, see the `benchmarks/README.md` file.


## Install

To install libCEED, run

    make install prefix=/usr/local

or (e.g., if creating packages),

    make install prefix=/usr DESTDIR=/packaging/path

Note that along with the library, libCEED installs kernel sources, e.g. OCCA
kernels are installed in `$prefix/lib/okl`. This allows the OCCA backend to
build specialized kernels at run-time. In a normal setting, the kernel sources
will be found automatically (relative to the library file `libceed.so`).
However, if that fails (e.g. if `libceed.so` is moved), one can copy (cache) the
kernel sources inside the user OCCA directory, `~/.occa`, using

    $(OCCA_DIR)/bin/occa cache ceed $(CEED_DIR)/lib/okl/*.okl

This will allow OCCA to find the sources regardless of the location of the CEED
library. One may occasionally need to clear the OCCA cache, which can be accomplished
by removing the `~/.occa` directory or by calling `$(OCCA_DIR)/bin/occa clear -a`.

To install libCEED for Python, run

    python setup.py build install

with the desired setuptools options, such as `--user`.

Alternatively, if libCEED is installed in the directory specified by the
environment variable `CEED_DIR`, then run

    pip install .

### pkg-config

In addition to the library and header, libCEED provides a [pkg-config][pkg-config1]
file that can be used to easily compile and link. [For example][pkg-config2], if
`$prefix` is a standard location or you set the environment variable
`PKG_CONFIG_PATH`,

    cc `pkg-config --cflags --libs ceed` -o myapp myapp.c

will build `myapp` with libCEED. This can be used with the source or
installed directories. Most build systems have support for pkg-config.

## Contact

You can reach the libCEED team by emailing [ceed-users@llnl.gov](mailto:ceed-users@llnl.gov)
or by leaving a comment in the [issue tracker](https://github.com/CEED/libCEED/issues).

## Copyright

The following copyright applies to each file in the CEED software suite, unless
otherwise stated in the file:

> Copyright (c) 2017, Lawrence Livermore National Security, LLC. Produced at the
> Lawrence Livermore National Laboratory. LLNL-CODE-734707. All Rights reserved.

See files LICENSE and NOTICE for details.

[ceed-soft]: http://ceed.exascaleproject.org/software/
[ecp]: https://exascaleproject.org/exascale-computing-project
[pkg-config1]: https://en.wikipedia.org/wiki/Pkg-config
[pkg-config2]: https://people.freedesktop.org/~dbn/pkg-config-guide.html#faq