(ch_regressor)=

# PetscRegressor: Regression Solvers

The ``PetscRegressor`` library provides a framework for the scalable solution of
regression and classification problems. Methods are available for

- {any}`sec_regressor_linear`

Note that by regressor, we mean an algorithm or implementation
used to fit a regression model, following the notation of the machine-learning community.
Regressor here does NOT mean independent (or predictor) variable, as it often does in the
statistics community.

(sec_regressor_usage)=

## Basic Regressor Usage

Here, we introduce a simple example to demonstrate `PetscRegressor` usage.
Please read {any}`sec_regressor_solvers` for a more in-depth discussion.
The code presented {any}`below <regressor-ex3>` solves an ordinary linear
regression problem, with various regularization options.

In the simplest usage of the regressor solver, the user only needs to
provide a matrix of training data (`Mat`) and a vector of target values (`Vec`) to fit
the regressor against. With the fitted regressor, the user can then obtain
a vector of predicted values.

PETSc's default method for solving regression problems is ordinary least squares,
`REGRESSOR_LINEAR_OLS`, which is a sub-type of the linear regressor,
`PETSCREGRESSORLINEAR`.

Note that the data creation, option parsing, and cleanup stages are omitted for
display purposes. The complete code is available in {ref}`ex3.c <regressor-ex3>`.

(regressor-ex3)=
:::{admonition} Listing: `src/ml/regressor/tests/ex3.c`
```{literalinclude} /../src/ml/regressor/tests/ex3.c
:prepend: '#include <petscregressor.h>'
:start-at: int main
:end-at: PetscFinalize
:append: return 0;}
```
:::

To create a `PetscRegressor` solver, one must first call `PetscRegressorCreate()`
as follows:

```
PetscRegressorCreate(MPI_Comm comm, PetscRegressor *regressor);
```

To choose a solver type, the user can either call

```
PetscRegressorSetType(PetscRegressor regressor, PetscRegressorType type);
```

or use the option `-regressor_type <method>`, where details regarding the
available methods are presented in {any}`sec_regressor_solvers`.
The application code can take complete control of the regressor and the
solvers it uses by calling

```
PetscRegressorSetFromOptions(regressor);
```

This routine provides an interface to the PETSc options database, so
that at runtime the user can select a particular regression solver and set
various parameters and customized routines. With this routine the user
can also control all inner solver options in the `KSP` and `Tao`
modules, as discussed in {any}`ch_ksp` and {any}`ch_tao`.

After having set these routines and options, the user can fit the model
by calling

```
PetscRegressorFit(PetscRegressor regressor, Mat X, Vec y);
```

where `X` is the training data and `y` contains the target values.
After fitting, the user can compute predictions, that is, perform inference,
using the fitted regressor:

```
PetscRegressorPredict(PetscRegressor regressor, Mat X, Vec y_predicted);
```

Finally, once the regressor is no longer needed,
the user should destroy the `PetscRegressor` context with

```
PetscRegressorDestroy(PetscRegressor *regressor);
```

Note that the user should not destroy `y_predicted` from the previous step,
as this is done internally.

(sec_regressor_solvers)=

## Regression Solvers

The available regressor solver types are listed in Table
{any}`tab-regressordefaults`. Currently, only one type is supported,
`PETSCREGRESSORLINEAR`.

```{eval-rst}
.. list-table:: PETSc Regressor
   :name: tab-regressordefaults
   :header-rows: 1

   * - Method
     - PetscRegressorType
     - Options Name
   * - Linear
     - ``PETSCREGRESSORLINEAR``
     - ``linear``
```

If the chosen method supports a regularizer,
the user can set the regularizer's weight via

```
PetscRegressorSetRegularizerWeight(PetscRegressor regressor, PetscReal weight);
```

or with the option `-regressor_regularizer_weight <weight>`.

(sec_regressor_linear)=

## Linear regressor

The method `PETSCREGRESSORLINEAR` (`-regressor_type linear`)
constructs a linear model to reduce the sum of squared differences
between the actual target values in the dataset and the target
values estimated by the linear approximation. By default,
this method uses the bound-constrained regularized Gauss-Newton solver
`TAOBRGN` to solve the regression problem.

Currently, the linear regressor has three types, which are described
in Table {any}`tab-lineartypes`.

```{eval-rst}
.. list-table:: Linear Regressor types
   :name: tab-lineartypes
   :header-rows: 1

   * - Linear method
     - ``PetscRegressorLinearType``
     - Options Name
   * - Ordinary
     - ``REGRESSOR_LINEAR_OLS``
     - ``ols``
   * - Lasso
     - ``REGRESSOR_LINEAR_LASSO``
     - ``lasso``
   * - Ridge
     - ``REGRESSOR_LINEAR_RIDGE``
     - ``ridge``
```

If desired, the user can (when appropriate) use `KSP` instead of `Tao` to solve the problem,
via

```
PetscRegressorLinearSetUseKSP(PetscRegressor regressor, PetscBool flg);
```

or with the option `-regressor_linear_use_ksp <true,false>`.

The user can also request that the intercept (also known as the bias or offset) be fitted, via

```
PetscRegressorLinearSetFitIntercept(PetscRegressor regressor, PetscBool flg);
```

or with the option `-regressor_linear_fit_intercept <true,false>`.
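
For orientation, the three linear types in Table {any}`tab-lineartypes` correspond to the
following textbook objectives. This is a sketch under stated assumptions: the symbols
`w` (coefficient vector), `b` (intercept), `lambda` (regularizer weight), `x_i` (the `i`-th row of `X`),
and `n` (the number of rows) are introduced here for illustration only, and the exact scaling
that PETSc applies to each term (for example, a factor of 1/2) may differ.

```{math}
\begin{aligned}
\text{OLS:}   \quad & \min_{w,\,b} \; \sum_{i=1}^{n} \left( y_i - x_i^T w - b \right)^2 \\
\text{Ridge:} \quad & \min_{w,\,b} \; \sum_{i=1}^{n} \left( y_i - x_i^T w - b \right)^2 + \lambda \, \| w \|_2^2 \\
\text{Lasso:} \quad & \min_{w,\,b} \; \sum_{i=1}^{n} \left( y_i - x_i^T w - b \right)^2 + \lambda \, \| w \|_1
\end{aligned}
```

In the usual convention, when no intercept is fitted, `b` is held at zero and only `w` is optimized.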

After the regressor has been fitted, one can obtain the intercept and
the vector of fitted coefficients from the linear regression model:

```
PetscRegressorLinearGetCoefficients(PetscRegressor regressor, Vec *coefficients);
PetscRegressorLinearGetIntercept(PetscRegressor regressor, PetscScalar *intercept);
```
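
To make concrete what a fitted coefficient and intercept represent, the following plain-C sketch
fits a one-feature model `y ≈ w*x + b` with the closed-form least-squares formulas.
It is independent of PETSc and is not how `PetscRegressor` solves the problem; the names
`LineFit` and `fit_line` are hypothetical and exist only for this illustration. The `fit_intercept`
argument plays the same role as the flag passed to `PetscRegressorLinearSetFitIntercept()`.

```c
#include <stddef.h>

/* Hypothetical result type for this illustration; not part of PETSc. */
typedef struct {
  double coefficient; /* slope w */
  double intercept;   /* offset b (zero when no intercept is fitted) */
} LineFit;

/*
  Fit y ~ w*x + b by ordinary least squares for a single feature.
  If fit_intercept is zero, b is fixed at 0 and only w is fitted.
  Returns 0 on success, or -1 if the fit is undefined
  (no samples, or the x values carry no variation to fit against).
*/
int fit_line(const double *x, const double *y, size_t n, int fit_intercept, LineFit *fit)
{
  double mean_x = 0.0, mean_y = 0.0, sxx = 0.0, sxy = 0.0;
  size_t i;

  if (n == 0) return -1;

  /* With an intercept, the data are centered about their means;
     without one, the means are left at zero. */
  if (fit_intercept) {
    for (i = 0; i < n; i++) {
      mean_x += x[i];
      mean_y += y[i];
    }
    mean_x /= (double)n;
    mean_y /= (double)n;
  }

  /* Accumulate the (centered) sums of squares and cross products. */
  for (i = 0; i < n; i++) {
    sxx += (x[i] - mean_x) * (x[i] - mean_x);
    sxy += (x[i] - mean_x) * (y[i] - mean_y);
  }
  if (sxx == 0.0) return -1;

  fit->coefficient = sxy / sxx;
  fit->intercept   = mean_y - fit->coefficient * mean_x;
  return 0;
}
```

For example, with `x = {0, 1, 2, 3}` and `y = {1, 3, 5, 7}`, calling `fit_line(x, y, 4, 1, &fit)`
recovers a coefficient of 2 and an intercept of 1, because the data lie exactly on the line `y = 2*x + 1`.
Passing `0` for `fit_intercept` forces the line through the origin, which changes the fitted coefficient.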