# Getting Started

## Building

The CEED library, `libceed`, is a C99 library with no required dependencies and
with Fortran and Python interfaces. It can be built using

    make

or, with optimization flags

    make OPT='-O3 -march=skylake-avx512 -ffp-contract=fast'

These optimization flags are used by all languages (C, C++, Fortran) and this
makefile variable can also be set for testing and examples (below).
Python users can install using

    pip install libceed

or in a clone of the repository via `pip install .`.

The library attempts to automatically detect support for the AVX
instruction set using gcc-style compiler options for the host.
Support may need to be manually specified via

    make AVX=1

or

    make AVX=0

if your compiler does not support gcc-style options, if you are cross
compiling, etc.

## Testing

The test suite produces [TAP](https://testanything.org) output and is run by:

    make test

or, using the `prove` tool distributed with Perl (recommended)

    make prove

## Backends

There are multiple supported backends, which can be selected at runtime in the examples:

| CEED resource            | Backend                                           |
| :----------------------- | :------------------------------------------------ |
| `/cpu/self/ref/serial`   | Serial reference implementation                   |
| `/cpu/self/ref/blocked`  | Blocked reference implementation                  |
| `/cpu/self/memcheck`     | Memcheck backend, undefined value checks          |
| `/cpu/self/opt/serial`   | Serial optimized C implementation                 |
| `/cpu/self/opt/blocked`  | Blocked optimized C implementation                |
| `/cpu/self/avx/serial`   | Serial AVX implementation                         |
| `/cpu/self/avx/blocked`  | Blocked AVX implementation                        |
| `/cpu/self/xsmm/serial`  | Serial LIBXSMM implementation                     |
| `/cpu/self/xsmm/blocked` | Blocked LIBXSMM implementation                    |
| `/cpu/occa`              | Serial OCCA kernels                               |
| `/gpu/occa`              | CUDA OCCA kernels                                 |
| `/omp/occa`              | OpenMP OCCA kernels                               |
| `/ocl/occa`              | OpenCL OCCA kernels                               |
| `/gpu/cuda/ref`          | Reference pure CUDA kernels                       |
| `/gpu/cuda/reg`          | Pure CUDA kernels using one thread per element    |
| `/gpu/cuda/shared`       | Optimized pure CUDA kernels using shared memory   |
| `/gpu/cuda/gen`          | Optimized pure CUDA kernels using code generation |
| `/gpu/magma`             | CUDA MAGMA kernels                                |

The `/cpu/self/*/serial` backends process one element at a time and are intended for meshes
with a smaller number of high order elements. The `/cpu/self/*/blocked` backends process
blocked batches of eight interlaced elements and are intended for meshes with higher numbers
of elements.

The `/cpu/self/ref/*` backends are written in pure C and provide basic functionality.

The `/cpu/self/opt/*` backends are written in pure C and use partial e-vectors to improve performance.

The `/cpu/self/avx/*` backends rely upon AVX instructions to provide vectorized CPU performance.

The `/cpu/self/xsmm/*` backends rely upon the [LIBXSMM](http://github.com/hfp/libxsmm) package
to provide vectorized CPU performance. If linking MKL and LIBXSMM is desired but
the Makefile is not detecting `MKLROOT`, linking libCEED against MKL can be
forced by setting the environment variable `MKL=1`.

The `/cpu/self/memcheck/*` backends rely upon the [Valgrind](http://valgrind.org/) Memcheck tool
to help verify that user QFunctions have no undefined values. To use, run your code with
Valgrind and the Memcheck backends, e.g. `valgrind ./build/ex1 -ceed /cpu/self/ref/memcheck`. A
'development' or 'debugging' version of Valgrind with headers is required to use this backend.
This backend can be run in serial or blocked mode and defaults to running in the serial mode
if `/cpu/self/memcheck` is selected at runtime.

The `/*/occa` backends rely upon the [OCCA](http://github.com/libocca/occa) package to provide
cross platform performance.
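The resource patterns used in this section (`/cpu/self/*/serial`, `/*/occa`, and so on) read like shell-style globs. As a rough illustration only, the Python sketch below matches a resource string from the table above against a subset of those patterns using the standard `fnmatch` module. The `BACKEND_FAMILIES` list and the `describe_resource` helper are hypothetical names invented for this example; they are not part of libCEED, and libCEED's own resource resolution may work differently.

```python
from fnmatch import fnmatchcase

# Patterns and descriptions taken from the backend table and notes in this
# section. The list itself is illustrative and is NOT part of libCEED.
BACKEND_FAMILIES = [
    ("/cpu/self/ref/*", "pure C reference"),
    ("/cpu/self/opt/*", "pure C with partial e-vectors"),
    ("/cpu/self/avx/*", "AVX"),
    ("/cpu/self/xsmm/*", "LIBXSMM"),
    ("/*/occa", "OCCA"),
    ("/gpu/cuda/*", "pure CUDA"),
    ("/gpu/magma", "MAGMA"),
]


def describe_resource(resource):
    """Return the family of the first matching pattern, or None.

    Note that fnmatch's `*` also matches `/`, so these patterns are looser
    than path-style globs.
    """
    for pattern, family in BACKEND_FAMILIES:
        if fnmatchcase(resource, pattern):
            return family
    return None


if __name__ == "__main__":
    for name in ("/cpu/self/avx/blocked", "/gpu/occa", "/gpu/cuda/gen"):
        print(name, "->", describe_resource(name))
```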

The `/gpu/cuda/*` backends provide GPU performance strictly using CUDA.

The `/gpu/magma` backend relies upon the [MAGMA](https://bitbucket.org/icl/magma) package.

## Examples

libCEED comes with several examples of its usage, ranging from standalone C
codes in the `/examples/ceed` directory to examples based on external packages,
such as MFEM, PETSc, and Nek5000. Nek5000 v18.0 or greater is required.

To build the examples, set the `MFEM_DIR`, `PETSC_DIR` and `NEK5K_DIR` variables
and run:

```console
# libCEED examples on CPU and GPU
cd examples/ceed
make
./ex1-volume -ceed /cpu/self
./ex1-volume -ceed /gpu/occa
./ex2-surface -ceed /cpu/self
./ex2-surface -ceed /gpu/occa
cd ../..

# MFEM+libCEED examples on CPU and GPU
cd examples/mfem
make
./bp1 -ceed /cpu/self -no-vis
./bp3 -ceed /gpu/occa -no-vis
cd ../..

# Nek5000+libCEED examples on CPU and GPU
cd examples/nek
make
./nek-examples.sh -e bp1 -ceed /cpu/self -b 3
./nek-examples.sh -e bp3 -ceed /gpu/occa -b 3
cd ../..

# PETSc+libCEED examples on CPU and GPU
cd examples/petsc
make
./bps -problem bp1 -ceed /cpu/self
./bps -problem bp2 -ceed /gpu/occa
./bps -problem bp3 -ceed /cpu/self
./bps -problem bp4 -ceed /gpu/occa
./bps -problem bp5 -ceed /cpu/self
./bps -problem bp6 -ceed /gpu/occa
cd ../..

cd examples/petsc
./area -problem cube -ceed /cpu/self -petscspace_degree 3
./area -problem cube -ceed /gpu/occa -petscspace_degree 3
./area -problem sphere -ceed /cpu/self -petscspace_degree 3 -dm_refine 2
./area -problem sphere -ceed /gpu/occa -petscspace_degree 3 -dm_refine 2
cd ../..

cd examples/navier-stokes
make
./navierstokes -ceed /cpu/self
./navierstokes -ceed /gpu/occa
cd ../..
```

The above code assumes a GPU-capable machine with the OCCA backend
enabled.
Depending on the available backends, other Ceed resource specifiers can
be provided with the `-ceed` option.

## Benchmarks

A sequence of benchmarks for all enabled backends can be run using

```console
make benchmarks
```

The results from the benchmarks are stored inside the `benchmarks/` directory
and they can be viewed using the commands (requires Python with matplotlib):

```console
cd benchmarks
python postprocess-plot.py petsc-bps-bp1-*-output.txt
python postprocess-plot.py petsc-bps-bp3-*-output.txt
```

Using the `benchmarks` target runs a comprehensive set of benchmarks which may
take some time to run. Subsets of the benchmarks can be run using the scripts in the `benchmarks` folder.

For more details about the benchmarks, see the `benchmarks/README.md` file.

## Install

To install libCEED, run

    make install prefix=/usr/local

or (e.g., if creating packages),

    make install prefix=/usr DESTDIR=/packaging/path

Note that along with the library, libCEED installs kernel sources, e.g. OCCA
kernels are installed in `$prefix/lib/okl`. This allows the OCCA backend to
build specialized kernels at run-time. In a normal setting, the kernel sources
will be found automatically (relative to the library file `libceed.so`).
However, if that fails (e.g. if `libceed.so` is moved), one can copy (cache) the
kernel sources inside the user OCCA directory, `~/.occa` using

    $(OCCA_DIR)/bin/occa cache ceed $(CEED_DIR)/lib/okl/*.okl

This will allow OCCA to find the sources regardless of the location of the CEED
library. One may occasionally need to clear the OCCA cache, which can be accomplished
by removing the `~/.occa` directory or by calling `$(OCCA_DIR)/bin/occa clear -a`.

To install libCEED for Python, run

    pip install .

with the desired setuptools options, such as `--user`.
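For the `prefix` and `DESTDIR` variables used earlier in this section, the sketch below shows where files would be staged during `make install`, assuming the Makefile follows the common GNU convention of writing to `$(DESTDIR)$(prefix)`. The `staged_path` helper is hypothetical and only illustrates the path arithmetic; `lib/okl` is the kernel directory named above.

```python
from pathlib import PurePosixPath


def staged_path(prefix, destdir="", relative="lib/okl"):
    """Return the on-disk location of `relative` during `make install`.

    Assumes the GNU convention $(DESTDIR)$(prefix)/... and an absolute
    prefix; this is an illustration, not code taken from libCEED.
    """
    final = PurePosixPath(prefix) / relative
    if not destdir:
        return final
    # DESTDIR is prepended to the full path, so drop the leading "/" first.
    return PurePosixPath(destdir) / final.relative_to("/")


if __name__ == "__main__":
    print(staged_path("/usr/local"))
    print(staged_path("/usr", destdir="/packaging/path"))
```

Under that assumption, `prefix=/usr DESTDIR=/packaging/path` would stage the kernels under `/packaging/path/usr/lib/okl`, while the packaged library would still expect them at `/usr/lib/okl` on the target system.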

### pkg-config

In addition to the library and header, libCEED provides a [pkg-config][pkg-config1]
file that can be used to easily compile and link. [For example][pkg-config2], if
`$prefix` is a standard location or you set the environment variable
`PKG_CONFIG_PATH`,

    cc `pkg-config --cflags --libs ceed` -o myapp myapp.c

will build `myapp` with libCEED. This can be used with the source or
installed directories. Most build systems have support for pkg-config.

## Contact

You can reach the libCEED team by emailing [ceed-users@llnl.gov](mailto:ceed-users@llnl.gov)
or by leaving a comment in the [issue tracker](https://github.com/CEED/libCEED/issues).

## Copyright

The following copyright applies to each file in the CEED software suite, unless
otherwise stated in the file:

> Copyright (c) 2017, Lawrence Livermore National Security, LLC. Produced at the
> Lawrence Livermore National Laboratory. LLNL-CODE-734707. All Rights reserved.

See files LICENSE and NOTICE for details.

[ceed-soft]: http://ceed.exascaleproject.org/software/
[ecp]: https://exascaleproject.org/exascale-computing-project
[pkg-config1]: https://en.wikipedia.org/wiki/Pkg-config
[pkg-config2]: https://people.freedesktop.org/~dbn/pkg-config-guide.html#faq