sphinx/source/libCEEDapi.md

5 (theoretical-framework)=
10 In particular, when high-order finite elements or spectral elements are used, the resulting sparse …
11 libCEED provides an interface for matrix-free operator description that enables efficient evaluatio…
15 We first define the $L^2$ inner product between real-valued functions
30 …$ represents all terms in {eq}`residual` which multiply the (possibly vector-valued) test function…
31 For an n-component problems in $d$ dimensions, $\bm f_0 \in \mathbb{R}^n$ and $\bm f_1 \in \mathbb{…
34 …dot represents contraction in just one, which should be clear from context, e.g., $\bm v \cdot \bm…
47 …er the mesh elements, mapping each element to a simple *reference* element (e.g. the unit square) …
51 …inuous ($H^1$) elements, where we use the notions **T-vector**, **L-vector**, **E-vector** and **Q…
55 - Subdomain restriction $\bm{P}$
56 - Element restriction $\bm{\mathcal{E}}$
57 - Basis (Dofs-to-Qpts) evaluator $\bm{B}$
58 - Operator at quadrature points $\bm{D}$
60 …t and trial space differ, they get their own versions of $\bm{P}$, $\bm{\mathcal{E}}$ and $\bm{B}$.
62 (fig-operator-decomp)=
64 :::{figure} ../../img/libCEED-decomposition.svg
68 …ent (AMR), the restrictions $\bm{P}$ and $\bm{\mathcal{E}}$ will involve not just extracting sub-v…
75 - True degrees of freedom/unknowns, **T-vector**:
77   > - each unknown $i$ has exactly one copy, on exactly one processor, $rank(i)$
78   > - this is a non-overlapping vector decomposition
79   > - usually includes any essential (fixed) DoFs.
81   > ```{image} ../../img/T-vector.svg
84 - Local (w.r.t. processors) degrees of freedom/unknowns, **L-vector**:
86   > - each unknown $i$ has exactly one copy on each processor that owns an element containing $i$
87 …> - this is an overlapping vector decomposition with overlaps only across different processors---t…
88 …> - the shared DoFs/unknowns are the overlapping DoFs, i.e. the ones that have more than one copy,…
90   > ```{image} ../../img/L-vector.svg
93 - Per element decomposition, **E-vector**:
95   > - each unknown $i$ has as many copies as the number of elements that contain $i$
96   > - usually, the copies of the unknowns are grouped by the element they belong to.
98   > ```{image} ../../img/E-vector.svg
101 - In the case of AMR with hanging nodes (giving rise to hanging DoFs):
103   > - the **L-vector** is enhanced with the hanging/dependent DoFs
104 …> - the additional hanging/dependent DoFs are duplicated when they are shared by multiple processo…
105 …> - this way, an **E-vector** can be derived from an **L-vector** without any communications and w…
106 …- in other words, an entry in an **E-vector** is obtained by copying an entry from the correspondi…
108   > ```{image} ../../img/L-vector-AMR.svg
111 - In the case of variable order spaces:
113 …> - the dependent DoFs (usually on the higher-order side of a face/edge) can be treated just like …
115 - Quadrature point vector, **Q-vector**:
117 …> - this is similar to **E-vector** where instead of DoFs, the vector represents values at quadrat…
119 - In many cases it is useful to distinguish two types of vectors:
121   > - **X-vector**, or **primal X-vector**, and **X'-vector**, or **dual X-vector**
122   > - here X can be any of the T, L, E, or Q categories
123   > - for example, the mass matrix operator maps a **T-vector** to a **T'-vector**
124   > - the solutions vector is a **T-vector**, and the RHS vector is a **T'-vector**
125 …> - using the parallel prolongation operator, one can map the solution **T-vector** to a solution …
129 - Full true-DoF parallel assembly, **TA**, or **A**:
131   > - ParCSR or similar format
132 …> - the T in TA indicates that the data format represents an operator from a **T-vector** to a **T…
134 - Full local assembly, **LA**:
136   > - CSR matrix on each rank
137 …> - the parallel prolongation operator, $\bm{P}$, (and its transpose) should use optimized matrix-…
138   > - note that $\bm{P}$ is the operator mapping T-vectors to L-vectors.
140 - Element matrix assembly, **EA**:
142   > - each element matrix is stored as a dense matrix
143   > - optimized element and parallel prolongation operators
144 …> - note that the element prolongation operator is the mapping from an **L-vector** to an **E-vect…
146 - Quadrature-point/partial assembly, **QA** or **PA**:
148   > - precompute and store $w\det(J)$ at all quadrature points in all mesh elements
149   > - the stored data can be viewed as a **Q-vector**.
151 - Unassembled option,  **UA** or **U**:
153   > - no assembly step
154 …- the action uses directly the mesh node coordinates, and assumes specific form of the coefficient…
158 …E}}$ and $\bm{P}$, starting from its point-wise kernel $\bm{D}$, a "matvec" with $\bm{A}$ can be p…
159 For example, one can compute and store a global matrix on **T-vector** level.
160 …n (**L-vector**) or element (**E-vector**) matrices and perform the action of $\bm{A}$ using matve…
161 While these options are natural for low-order discretizations, they are not a good fit for high-ord…
163 …r portions of it) and evaluate the actions of $\bm{P}$, $\bm{\mathcal{E}}$ and $\bm{B}$ on-the-fly.
164 Critically for performance, we take advantage of the tensor-product structure of the degrees of fre…
166 …l amount of memory transfers (with respect to the polynomial order) and near-optimal FLOPs for ope…
168 When desired, the setup phase may be done as a side-effect of evaluating a different operator, such…
173 …mathcal{E}}$ and $\bm{B}$, the operator evaluation is decoupled  on their ranges, so $\bm{P}$, $\b…
175 …pping of $\bm{A}$ on a parallel computer is to split the **T-vector** over MPI ranks (a non-overla…
177 …E}}$, $\bm{B}$ and $\bm{D}$ clearly separate the MPI parallelism in the operator ($\bm{P}$) from t…
178 …-- parallel (multi-device) linear algebra for $\bm{P}$, sparse (on-device) linear algebra for $\bm…
180 Currently in libCEED, it is assumed that the host application manages the global **T-vectors** and …
181 Our API is thus focused on the **L-vector** level, where the logical devices, which in the library …
186 The communications among the devices, e.g. required for applying the action of $\bm{P}$, are curren…
187 The interface is non-blocking for all operations involving more than O(1) data, allowing operations…
191 …E}}}$, $\bm{B}$, and $\bm{D}$ and the library provides *backend* implementations and coordinates t…
194 (fig-operator-schematic)=
197 …ide libCEED Operators, through backend implementations of $\bm{\bm{\mathcal{E}}}$, $\bm{B}$, and $…
202 The separation of the front- and backends enables applications to easily switch/try different backe…
205 …-term vision is to include a variety of backend implementations in libCEED, ranging from reference…
207 [ceed-ref.c](https://github.com/CEED/libCEED/blob/main/backends/ref/ceed-ref.c).
212 - **L-**, **E-** and **Q-vector** are represented as variables of type {ref}`CeedVector`.
213   (A backend may choose to operate incrementally without forming explicit **E-** or **Q-vectors**.)
214 - $\bm{\mathcal{E}}$ is represented as variable of type {ref}`CeedElemRestriction`.
215 - $\bm{B}$ is represented as variable of type {ref}`CeedBasis`.
216 - the action of $\bm{D}$ is represented as variable of type {ref}`CeedQFunction`.
217 - the overall operator $\bm{\mathcal{E}}^T \bm{B}^T \bm{D} \bm{B} \bm{\mathcal{E}}$ is represented …
219 …of a simple 1D mass matrix (cf. [tests/t500-operator.c](https://github.com/CEED/libCEED/blob/main/…
221 ```{literalinclude} ../../../tests/t500-operator.c
229 (fig-operator-schematic-mass)=
232 …ion of $\bm{\bm{\mathcal{E}}}$, $\bm{B}$, $\bm{D}$, and input/output vectors corresponding to the …
237 ```{literalinclude} ../../../tests/t500-operator.c
238 :end-at: CeedInit
240 :start-at: CeedInit
250 ```{literalinclude} ../../../tests/t500-operator.c
251 :end-before: //! [QFunction Create]
253 :start-after: //! [QFunction Create]
262 This is used by backends that support Just-In-Time (JIT) compilation (i.e., CUDA and HIP) to compil…
264 …tible arguments for {code}`math` library functions is required, and variable-length array (VLA) sy…
270 The evaluation mode (see {ref}`CeedBasis-Typedefs and Enumerations`) `CEED_EVAL_INTERP` for both in…
276 where $v$ are test functions (see the {ref}`theoretical-framework`).
285 For fields with derivatives, such as with the basis evaluation mode (see {ref}`CeedBasis-Typedefs a…
286 A 3-dimensional gradient on four components would therefore mean the field has a size of 12\.
290 Both basis operators use the same integration rule, which is Gauss-Legendre with 8 points (the `Q` …
292 ```{literalinclude} ../../../tests/t500-operator.c
293 :end-before: //! [Basis Create]
295 :start-after: //! [Basis Create]
301 The $\bm{\mathcal{E}}$ operators for the mesh nodes, `elem_restr_x`, and the unknown field, `elem_r…
304 ```{literalinclude} ../../../tests/t500-operator.c
305 :end-before: //! [ElemRestr Create]
307 :start-after: //! [ElemRestr Create]
310 ```{literalinclude} ../../../tests/t500-operator.c
311 :end-before: //! [ElemRestrU Create]
313 :start-after: //! [ElemRestrU Create]
317 This technique is used to provide no-copy interfaces in all contexts that involve problem-sized dat…
319 … store **E-vectors** (inter-element continuity has been subsumed by the parallel restriction $\bm{…
320 We plan to support other structured representations of $\bm{\mathcal{E}}$ which will be added accor…
321 …-conforming elements: applying the node constraints via $\bm P$ so that the **L-vector** can be pr…
324 These operations, $\bm{\mathcal{E}}$, $\bm{B}$, and $\bm{D}$, are combined with a {ref}`CeedOperato…
326 … matching field name, basis ($\bm{B}$), element restriction ($\bm{\mathcal{E}}$), and **L-vector**.
328 Otherwise the input/output will be read from/written to the specified **L-vector**.
334 ```{literalinclude} ../../../tests/t500-operator.c
335 :end-before: //! [Setup Create]
337 :start-after: //! [Setup Create]
340 ```{literalinclude} ../../../tests/t500-operator.c
341 :end-before: //! [Setup Set]
343 :start-after: //! [Setup Set]
346 ```{literalinclude} ../../../tests/t500-operator.c
347 :end-before: //! [Setup Apply]
349 :start-after: //! [Setup Apply]
352 …by operator `op_mass` and its {c:func}`CeedOperatorApply()` to the input **L-vector** `U` with out…
354 ```{literalinclude} ../../../tests/t500-operator.c
355 :end-before: //! [Operator Create]
357 :start-after: //! [Operator Create]
360 ```{literalinclude} ../../../tests/t500-operator.c
361 :end-before: //! [Operator Set]
363 :start-after: //! [Operator Set]
366 ```{literalinclude} ../../../tests/t500-operator.c
367 :end-before: //! [Operator Apply]
369 :start-after: //! [Operator Apply]
373 …r) for the operation to complete before returning from the function call, i.e. to make sure that t…
379 LibCEED provides a gallery of built-in {ref}`CeedQFunction`s in the {file}`gallery/` directory.
381 …th a simple 1D mass matrix (cf. [tests/t410-qfunction.c](https://github.com/CEED/libCEED/blob/main…
383 ```{literalinclude} ../../../tests/t410-qfunction.c
393 ```{literalinclude} ../../../backends/ref/ceed-ref.c
394 :end-before: //! [Register]
396 :start-after: //! [Register]