(ch_regressor)=

# PetscRegressor: Regression Solvers

The `PetscRegressor` library provides a framework for the scalable solution of
regression and classification problems. Methods are available for

- {any}`sec_regressor_linear`

Note that by regressor, we mean an algorithm or implementation
used to fit a regression model, following the terminology of the machine-learning community.
Regressor here does NOT mean an independent (or predictor) variable, as it often does in the
statistics community.

(sec_regressor_usage)=

## Basic Regressor Usage

Here, we introduce a simple example to demonstrate `PetscRegressor` usage.
Please read {any}`sec_regressor_solvers` for a more in-depth discussion.
The code presented {any}`below <regressor-ex3>` solves an ordinary linear
regression problem, with various regularization options.

In the simplest usage of the regressor solver, the user only needs to
provide a training data matrix (`Mat`) and a target vector (`Vec`) to fit
the regressor against. With the fitted regressor, the user can then obtain
a vector of predicted values.

PETSc's default method for solving regression problems is ordinary least squares,
`REGRESSOR_LINEAR_OLS`, which is a sub-type of the linear regressor,
`PETSCREGRESSORLINEAR`.
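
To make concrete what an ordinary least-squares fit computes, the following plain C sketch solves a tiny two-feature problem through the normal equations $X^T X w = X^T y$. It is an illustration only and does not use or reproduce the PETSc implementation; the helper name `ols_fit_2` is hypothetical, and no intercept is fitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of PETSc): fit y ~ w[0]*x0 + w[1]*x1 by
   ordinary least squares. X is row-major with n rows and 2 columns.
   Returns 0 on success, -1 if the normal matrix is (numerically) singular. */
int ols_fit_2(const double *X, const double *y, size_t n, double w[2])
{
  double a00 = 0.0, a01 = 0.0, a11 = 0.0; /* entries of X^T X (symmetric) */
  double b0 = 0.0, b1 = 0.0;              /* entries of X^T y */
  double det;

  assert(X && y && w && n >= 2);
  for (size_t i = 0; i < n; i++) {
    const double x0 = X[2 * i], x1 = X[2 * i + 1];
    a00 += x0 * x0;
    a01 += x0 * x1;
    a11 += x1 * x1;
    b0 += x0 * y[i];
    b1 += x1 * y[i];
  }
  det = a00 * a11 - a01 * a01;
  if (det < 1e-14 && det > -1e-14) return -1;
  /* Cramer's rule for the 2x2 system (X^T X) w = X^T y */
  w[0] = (b0 * a11 - a01 * b1) / det;
  w[1] = (a00 * b1 - a01 * b0) / det;
  return 0;
}
```

For the rows (1, 0), (0, 1), (1, 1) with targets 1, 2, 3, the fit recovers `w = (1, 2)`. At scale one would rely on the `Tao` or `KSP` solvers described in this chapter rather than forming $X^T X$ by hand.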

Note that the data creation, option parsing, and cleanup stages are omitted for
display purposes. The complete code is available in {ref}`ex3.c <regressor-ex3>`.

(regressor-ex3)=
:::{admonition} Listing: `src/ml/regressor/tests/ex3.c`
```{literalinclude} /../src/ml/regressor/tests/ex3.c
:prepend: '#include <petscregressor.h>'
:start-at: int main
:end-at: PetscFinalize
:append: return 0;}
```
:::

To create a `PetscRegressor` solver, one must first call `PetscRegressorCreate()`
as follows:

```
PetscRegressorCreate(MPI_Comm comm, PetscRegressor *regressor);
```

To choose a solver type, the user can either call

```
PetscRegressorSetType(PetscRegressor regressor, PetscRegressorType type);
```

or use the option `-regressor_type <method>`; details regarding the
available methods are presented in {any}`sec_regressor_solvers`.
The application code can defer the choice of solver and its parameters
to runtime by calling

```
PetscRegressorSetFromOptions(regressor);
```

This routine provides an interface to the PETSc options database, so
that at runtime the user can select a particular regression solver and set
various parameters and customized routines. With this routine the user
can also control all inner solver options in the `KSP` and `Tao`
modules, as discussed in {any}`ch_ksp` and {any}`ch_tao`.

After having set these routines and options, the user can fit the regressor
by calling

```
PetscRegressorFit(PetscRegressor regressor, Mat X, Vec y);
```

where `X` is the training data and `y` is the vector of target values.
After fitting the regressor, the user can compute
predictions, that is, perform inference, using the fitted regressor:

```
PetscRegressorPredict(PetscRegressor regressor, Mat X, Vec y_predicted);
```

Finally, once the regressor is no longer needed,
the user should destroy the `PetscRegressor` context with

```
PetscRegressorDestroy(PetscRegressor *regressor);
```

Note that the user should not destroy the `y_predicted` vector from the prediction step above,
as this is done internally.

(sec_regressor_solvers)=

## Regression Solvers

The available regressor solver types are listed in Table
{any}`tab-regressordefaults`. Currently, only one type is supported,
`PETSCREGRESSORLINEAR`.

```{eval-rst}
.. list-table:: PETSc Regressor
   :name: tab-regressordefaults
   :header-rows: 1

   * - Method
     - PetscRegressorType
     - Options Name
   * - Linear
     - ``PETSCREGRESSORLINEAR``
     - ``linear``
```

If the method in use supports a regularizer,
the user can set the regularizer's weight via

```
PetscRegressorSetRegularizerWeight(PetscRegressor regressor, PetscReal weight);
```

or with the option `-regressor_regularizer_weight <weight>`.
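
The effect of the regularizer weight can be seen on a one-feature problem without an intercept. The plain C sketch below assumes a ridge-style objective $\sum_i (y_i - w x_i)^2 + \lambda w^2$, whose minimizer is $w = \sum_i x_i y_i / (\sum_i x_i^2 + \lambda)$. The exact scaling PETSc applies to the weight is not specified in this chapter, so this form is an assumption, and `ridge_fit_1d` is a hypothetical helper rather than a PETSc routine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of PETSc): minimize
   sum_i (y[i] - w*x[i])^2 + lambda*w^2 over the scalar w. */
double ridge_fit_1d(const double *x, const double *y, size_t n, double lambda)
{
  double sxy = 0.0, sxx = 0.0;

  assert(x && y && n > 0 && lambda >= 0.0);
  for (size_t i = 0; i < n; i++) {
    sxy += x[i] * y[i];
    sxx += x[i] * x[i];
  }
  assert(sxx + lambda > 0.0); /* otherwise the minimizer is not unique */
  return sxy / (sxx + lambda);
}
```

With `x = (1, 2, 3)` and `y = (2, 4, 6)`, a weight of 0 gives the unregularized coefficient 2, while a weight of 14 shrinks it to 1. Larger weights pull the coefficient further toward zero.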

(sec_regressor_linear)=

## Linear regressor

The method `PETSCREGRESSORLINEAR` (`-regressor_type linear`)
constructs a linear model that minimizes the sum of squared differences
between the actual target values in the dataset and the target
values estimated by the linear approximation. By default,
this method uses the bound-constrained regularized Gauss-Newton solver,
`TAOBRGN`, to solve the regression problem.
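
In symbols, using notation introduced here only for illustration ($X$ for the training data matrix with one sample per row, $y$ for the target vector, and $w$ for the coefficient vector), the unregularized problem described above can be written as

$$
\min_{w} \; \lVert X w - y \rVert_2^2 \;=\; \min_{w} \sum_{i=1}^{m} \left( x_i^T w - y_i \right)^2,
$$

where $x_i^T$ is row $i$ of $X$ and $m$ is the number of samples. A constant positive scaling factor that an implementation may apply to this sum (such as $1/2$ or $1/m$) does not change the minimizer.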

Currently, the linear regressor has three types, which are described
in Table {any}`tab-lineartypes`.

```{eval-rst}
.. list-table:: Linear Regressor types
   :name: tab-lineartypes
   :header-rows: 1

   * - Linear method
     - ``PetscRegressorLinearType``
     - Options Name
   * - Ordinary least squares
     - ``REGRESSOR_LINEAR_OLS``
     - ``ols``
   * - Lasso
     - ``REGRESSOR_LINEAR_LASSO``
     - ``lasso``
   * - Ridge
     - ``REGRESSOR_LINEAR_RIDGE``
     - ``ridge``
```
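
As a hedged summary, the three variants are shown below in their standard textbook form. The precise scaling of each term inside PETSc is an assumption not confirmed by this chapter. Here $X$ is the training data matrix, $y$ the target vector, $w$ the coefficient vector, and $\lambda \ge 0$ the regularizer weight:

$$
\begin{aligned}
\text{OLS:} \quad & \min_{w} \; \lVert X w - y \rVert_2^2 \\
\text{Lasso:} \quad & \min_{w} \; \lVert X w - y \rVert_2^2 + \lambda \lVert w \rVert_1 \\
\text{Ridge:} \quad & \min_{w} \; \lVert X w - y \rVert_2^2 + \lambda \lVert w \rVert_2^2
\end{aligned}
$$

The $\ell_1$ penalty of lasso tends to drive some coefficients exactly to zero, while the squared $\ell_2$ penalty of ridge shrinks all coefficients toward zero without generally eliminating any.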

When appropriate, the user can use `KSP` instead of `Tao` to solve the problem
via

```
PetscRegressorLinearSetUseKSP(PetscRegressor regressor, PetscBool flg);
```

or with the option `-regressor_linear_use_ksp <true,false>`.

The user can also choose whether to fit the intercept (also known as the bias or offset) via

```
PetscRegressorLinearSetFitIntercept(PetscRegressor regressor, PetscBool flg);
```

or with the option `-regressor_linear_fit_intercept <true,false>`.
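
To illustrate why fitting the intercept matters, the plain C sketch below fits a one-feature model both ways. It is not PETSc code: `ols_fit_1d` and its `fit_intercept` argument are hypothetical names chosen to mirror the option above, and the closed-form formulas are the standard ones for simple linear regression.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of PETSc): one-feature least squares.
   If fit_intercept is nonzero, fits y ~ slope*x + intercept; otherwise
   fits y ~ slope*x and sets *intercept to 0. */
void ols_fit_1d(const double *x, const double *y, size_t n, int fit_intercept, double *slope, double *intercept)
{
  double xbar = 0.0, ybar = 0.0, sxy = 0.0, sxx = 0.0;

  assert(x && y && slope && intercept && n > 0);
  if (fit_intercept) {
    /* Center the data about their means */
    for (size_t i = 0; i < n; i++) {
      xbar += x[i];
      ybar += y[i];
    }
    xbar /= (double)n;
    ybar /= (double)n;
  }
  /* Without centering (xbar = ybar = 0) the line is forced through the origin */
  for (size_t i = 0; i < n; i++) {
    sxy += (x[i] - xbar) * (y[i] - ybar);
    sxx += (x[i] - xbar) * (x[i] - xbar);
  }
  assert(sxx > 0.0);
  *slope     = sxy / sxx;
  *intercept = ybar - (*slope) * xbar;
}
```

For data generated by `y = 2x + 1` at `x = 0, 1, 2, 3`, fitting the intercept recovers slope 2 and intercept 1. Forcing the line through the origin instead yields a slope of 34/14 (about 2.43), which matches none of the data exactly.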

After the regressor has been fitted, one can obtain the intercept and
a vector of the fitted coefficients from the linear regression model:

```
PetscRegressorLinearGetCoefficients(PetscRegressor regressor, Vec *coefficients);
PetscRegressorLinearGetIntercept(PetscRegressor regressor, PetscScalar *intercept);
```
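
For a linear model, the coefficients and intercept fully determine the predictions: each predicted value is the dot product of a data row with the coefficient vector, plus the intercept. The plain C sketch below illustrates that relationship on dense row-major data. `linear_predict` is a hypothetical helper for illustration and makes no claim about how `PetscRegressorPredict()` is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of PETSc):
   y_pred[i] = intercept + sum_j X[i][j] * w[j],
   with X stored row-major as rows x cols. */
void linear_predict(const double *X, size_t rows, size_t cols, const double *w, double intercept, double *y_pred)
{
  assert(X && w && y_pred);
  for (size_t i = 0; i < rows; i++) {
    double sum = intercept;
    for (size_t j = 0; j < cols; j++) sum += X[i * cols + j] * w[j];
    y_pred[i] = sum;
  }
}
```

With rows (1, 2) and (3, 4), coefficients (0.5, -1), and intercept 2, the predicted values are 0.5 and -0.5.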