Estimators
Online GAMLSS
rolch.OnlineGamlss
Bases: Estimator
The online/incremental GAMLSS class.
__init__
__init__(distribution: Distribution, equation: Dict, forget: float = 0.0, method: str = 'ols', scale_inputs: bool = True, fit_intercept: Union[bool, Dict] = True, beta_bounds: Dict[int, Tuple] = None, estimation_kwargs: Optional[Dict] = None, max_it_outer: int = 30, max_it_inner: int = 30, abs_tol_outer: float = 0.001, abs_tol_inner: float = 0.001, rel_tol_outer: float = 1e-05, rel_tol_inner: float = 1e-05, rss_tol_inner: float = 1.5)
Initialise the online GAMLSS Model
Parameters:
Name | Type | Description | Default |
---|---|---|---|
distribution | Distribution | The parametric distribution. | required |
equation | Dict | The modelling equation. Follows the schema | required |
forget | float | The forget factor. Defaults to 0.0. | 0.0 |
method | str | The estimation method. Defaults to 'ols'. | 'ols' |
scale_inputs | bool | Whether to scale the input matrices. Defaults to True. | True |
beta_bounds | Dict[int, Tuple] | Dictionary of bounds for the different parameters. | None |
estimation_kwargs | Optional[Dict] | Dictionary of estimation method kwargs. Defaults to None. | None |
max_it_outer | int | Maximum outer iterations for the RS algorithm. Defaults to 30. | 30 |
max_it_inner | int | Maximum inner iterations for the RS algorithm. Defaults to 30. | 30 |
abs_tol_outer | float | Absolute tolerance on the deviance in the outer fit. Defaults to 1e-3. | 0.001 |
abs_tol_inner | float | Absolute tolerance on the deviance in the inner fit. Defaults to 1e-3. | 0.001 |
rel_tol_outer | float | Relative tolerance on the deviance in the outer fit. Defaults to 1e-5. | 1e-05 |
rel_tol_inner | float | Relative tolerance on the deviance in the inner fit. Defaults to 1e-5. | 1e-05 |
rss_tol_inner | float | Tolerance for increasing RSS in the inner fit. Defaults to 1.5. | 1.5 |
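The table describes forget only as "the forget factor". A minimal sketch of what exponential forgetting usually means is given below, assuming (this page does not confirm it) that an observation that is k steps old receives the weight (1 - forget)^k. The helper name forgetting_weights is hypothetical and not part of rolch.

```python
import numpy as np


def forgetting_weights(n_obs: int, forget: float) -> np.ndarray:
    """Weights of n_obs observations, oldest first, under exponential forgetting.

    Assumption: an observation that is k steps old is weighted by (1 - forget)**k.
    """
    # Age of each observation: the oldest has age n_obs - 1, the newest has age 0.
    age = np.arange(n_obs - 1, -1, -1)
    return (1.0 - forget) ** age


# With forget = 0.0 every observation keeps full weight.
weights_no_forget = forgetting_weights(5, forget=0.0)

# With forget = 0.1 the weights decay geometrically towards the past.
weights_forget = forgetting_weights(5, forget=0.1)
```

Under this assumption, forget = 0.0 reproduces the ordinary equally weighted fit, and larger values make the model adapt faster to recent data.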
fit
Fit the online GAMLSS model.
Note
The user is only required to provide the design matrix \(X\) for the first distribution parameter. If no design matrix is provided for some distribution parameter, ROLCH will model that parameter using an intercept only.
Note
The provision of bounds for the coefficient vectors is only possible for LASSO/coordinate descent estimation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | ndarray | Data matrix. Currently only numpy is supported; pandas and polars support is planned. | required |
y | ndarray | Response variable \(Y\). | required |
sample_weight | Optional[ndarray] | User-defined sample weights. Defaults to None. | None |
beta_bounds | Dict[int, Tuple] | Bounds for the \(\beta\) in the coordinate descent algorithm. The user needs to provide a | required |
predict
Predict the distribution parameters given input data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | ndarray | Design matrix. | required |
what | str | Predict the response or the link. Defaults to "response". Remember the GAMLSS models \(g(\theta) = X^T\beta\). Predict | 'response' |
return_contributions | bool | Whether to return a | False |
Raises:
Type | Description |
---|---|
ValueError | Raises if |
Returns:
Type | Description |
---|---|
ndarray | np.ndarray: Predicted values for the distribution. |
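The difference between the link scale and the response scale can be illustrated with plain numpy. The sketch below assumes a log link, \(g(\theta) = \log(\theta)\), for a single distribution parameter; the coefficient values are made up for illustration and no rolch call is involved.

```python
import numpy as np

# Hypothetical design matrix (intercept column plus one regressor) and coefficients.
X = np.array([[1.0, 0.5], [1.0, -1.0], [1.0, 2.0]])
beta = np.array([0.2, 0.7])

# Link scale: the linear predictor, one value per row of X.
eta = X @ beta

# Response scale: apply the inverse link. For a log link this is exp(),
# which guarantees a strictly positive parameter (for example a scale).
theta = np.exp(eta)
```

Predictions on the link scale are unbounded, while predictions on the response scale respect the parameter's support.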
update
Update the fit for the online GAMLSS Model.
Warning
Currently, the algorithm only takes single-step updates. Batch updates are planned for the first stable version.
Note
The beta_bounds from the initial fit are still valid for the update.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | ndarray | Data matrix. Currently only numpy is supported; pandas and polars support is planned. | required |
y | ndarray | Response variable \(Y\). | required |
sample_weight | Optional[ndarray] | User-defined sample weights. Defaults to None. | None |
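Because updates are single-step, a batch of new observations has to be fed one row at a time. The helper below sketches that loop for any object exposing update(X, y, sample_weight). It assumes each row is passed as a 2-D array of shape 1 x J; RecordingEstimator is a hypothetical stand-in used only to show the shapes, not a rolch class.

```python
import numpy as np


class RecordingEstimator:
    """Hypothetical stand-in that only records the shapes passed to update()."""

    def __init__(self):
        self.calls = []

    def update(self, X, y, sample_weight=None):
        self.calls.append((X.shape, y.shape))


def update_row_by_row(estimator, X, y, sample_weight=None):
    """Feed a batch to a single-step update() method, one observation at a time."""
    for i in range(X.shape[0]):
        # Slicing with i : i + 1 keeps the row 2-D (1 x J) instead of flattening it.
        w = None if sample_weight is None else sample_weight[i : i + 1]
        estimator.update(X[i : i + 1], y[i : i + 1], sample_weight=w)


recorder = RecordingEstimator()
update_row_by_row(recorder, np.ones((3, 2)), np.zeros(3))
```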
rolch.OnlineLasso
Simple Online Lasso regression for the expected value.
__init__
__init__(forget: float = 0, ic: Literal['aic', 'bic', 'hqc', 'max'] = 'bic', scale_inputs: bool = True, intercept_in_design: bool = True, lambda_n: int = 100, lambda_eps: float = 0.0001, start_value: str = 'previous_fit', tolerance: float = 0.0001, max_iterations: int = 1000, selection: Literal['cyclic', 'random'] = 'cyclic')
Online LASSO estimator class.
This class initializes the online linear regression fitted using LASSO. The estimator object provides three main methods: estimator.fit(X, y), estimator.update(X, y) and estimator.predict(X).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forget | float | Exponential discounting of old observations. Defaults to 0. | 0 |
ic | Literal['aic', 'bic', 'hqc', 'max'] | The information criterion for model selection. Defaults to "bic". | 'bic' |
scale_inputs | bool | Whether to scale the \(X\) matrix. Defaults to True. | True |
intercept_in_design | bool | Whether the first column of \(X\) corresponds to the intercept. In this case, the first beta will not be regularized. Defaults to True. | True |
lambda_n | int | Length of the regularization path. Defaults to 100. | 100 |
lambda_eps | float | The largest regularization is determined automatically such that the solution is fully regularized. The smallest regularization is taken as \(\varepsilon \lambda^{\max}\), and the path uses an exponential grid between the two. Defaults to 1e-4. | 0.0001 |
start_value | str | Whether to choose the previous fit or the previous regularization as start value. Defaults to 'previous_fit'. | 'previous_fit' |
tolerance | float | Convergence tolerance for stopping the coordinate descent (CD). Defaults to 1e-4. | 0.0001 |
max_iterations | int | Maximum number of CD iterations. Defaults to 1000. | 1000 |
selection | Literal['cyclic', 'random'] | Whether to cycle through all coordinates in order or at random. For large problems, random selection might speed up convergence. Defaults to 'cyclic'. | 'cyclic' |
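The interplay of lambda_n and lambda_eps can be sketched as follows. The example assumes the common lasso objective \(\frac{1}{2n}\lVert y - X\beta \rVert^2 + \lambda \lVert \beta \rVert_1\) without an unpenalized intercept, for which \(\lambda^{\max} = \max_j |x_j^T y| / n\) sets all coefficients to zero. Whether rolch computes \(\lambda^{\max}\) in exactly this way is not stated on this page; lambda_grid is a hypothetical helper.

```python
import numpy as np


def lambda_grid(X, y, lambda_n=100, lambda_eps=1e-4):
    """Exponential (geometric) grid from lambda_max down to lambda_eps * lambda_max."""
    n_obs = X.shape[0]
    # Smallest penalty for which every coefficient is zero (assumed objective, see text).
    lambda_max = np.max(np.abs(X.T @ y)) / n_obs
    return np.geomspace(lambda_max, lambda_eps * lambda_max, num=lambda_n)


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
y_demo = X_demo @ np.array([1.0, 0.0, -2.0]) + rng.normal(size=50)
grid = lambda_grid(X_demo, y_demo, lambda_n=100, lambda_eps=1e-4)
```

A geometric grid places equally many values in each order of magnitude, which is why a small lambda_eps still yields a usable path with only lambda_n points.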
fit
fit(X: np.ndarray, y: np.ndarray, sample_weight: Optional[np.ndarray] = None, beta_bounds: Optional[Tuple[np.ndarray, np.ndarray]] = None) -> None
Initial fit of the online LASSO.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | ndarray | The design matrix \(X\). | required |
y | ndarray | The response vector \(y\). | required |
sample_weight | Optional[ndarray] | The sample weights. Defaults to None. | None |
beta_bounds | Optional[Tuple[ndarray, ndarray]] | Lower and upper bounds on the coefficient vector. | None |
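How lower and upper bounds fit into coordinate descent can be sketched in a few lines. Each coordinate update solves a one-dimensional convex problem, so the bounded solution is the soft-thresholded update clipped to its interval. This is an illustrative implementation under the objective \(\frac{1}{2n}\lVert y - X\beta \rVert^2 + \lambda \lVert \beta \rVert_1\), not rolch's own code; all function names are hypothetical.

```python
import numpy as np


def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold (the lasso proximal operator)."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)


def bounded_lasso_cd(X, y, lam, lower, upper, max_iterations=1000, tolerance=1e-4):
    """Cyclic coordinate descent for the lasso with box constraints on beta."""
    n_obs, n_features = X.shape
    beta = np.zeros(n_features)
    col_sq = (X**2).sum(axis=0) / n_obs
    for _ in range(max_iterations):
        beta_old = beta.copy()
        for j in range(n_features):
            # Residual with the contribution of coordinate j added back in.
            partial_residual = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ partial_residual / n_obs
            unconstrained = soft_threshold(rho, lam) / col_sq[j]
            # Project the one-dimensional minimizer onto [lower_j, upper_j].
            beta[j] = np.clip(unconstrained, lower[j], upper[j])
        if np.max(np.abs(beta - beta_old)) < tolerance:
            break
    return beta


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
y_demo = X_demo @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=200)

# Non-negativity constraint: the negative coefficient is held at its lower bound 0.
beta_nonneg = bounded_lasso_cd(
    X_demo, y_demo, lam=0.01, lower=np.zeros(2), upper=np.full(2, np.inf)
)
```

The bounds are passed as a pair of arrays with one entry per coefficient, mirroring the Tuple[ndarray, ndarray] type in the table above.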
predict
Predict using the optimal IC selection.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | ndarray | The design matrix \(X\). | required |
Returns:
Type | Description |
---|---|
ndarray | np.ndarray: The predictions for the optimal IC. |
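Selecting the "optimal IC" means picking one point on the regularization path by an information criterion. The sketch below uses the Gaussian BIC, \(n \log(\mathrm{RSS}/n) + k \log(n)\), with \(k\) the number of non-zero coefficients. The exact formula used by rolch is not given on this page, so treat this as an assumption; select_by_bic is a hypothetical helper.

```python
import numpy as np


def select_by_bic(X, y, beta_path):
    """Index of the coefficient vector on the path with the lowest Gaussian BIC.

    beta_path has shape (n_lambda, n_features): one coefficient vector per penalty.
    """
    n_obs = X.shape[0]
    # Residual sum of squares for every coefficient vector on the path.
    rss = ((y[:, None] - X @ beta_path.T) ** 2).sum(axis=0)
    # Model size: number of non-zero coefficients per path entry.
    n_nonzero = np.count_nonzero(beta_path, axis=1)
    bic = n_obs * np.log(rss / n_obs) + n_nonzero * np.log(n_obs)
    return int(np.argmin(bic))


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
y_demo = 2.0 * X_demo[:, 0] + 0.1 * rng.normal(size=200)

# Three hand-written candidates: empty model, the sparse truth, a slightly denser model.
beta_path = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 0.001]])
best = select_by_bic(X_demo, y_demo, beta_path)
```

The denser candidate fits almost identically to the sparse one, so the \(k \log(n)\) penalty decides in favour of the sparse model.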
predict_path
Predict the full regularization path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | ndarray | The design matrix \(X\). | required |
Returns:
Type | Description |
---|---|
ndarray | np.ndarray: The predictions for the full path. |
update
Update the regression model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | ndarray | The new row of the design matrix \(X\). Needs to be of shape 1 x J. | required |
y | ndarray | The new observation of \(y\). | required |
sample_weight | Optional[ndarray] | The weight for the new observation. | None |
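One way to see why a single 1 x J row is enough for an update: the quantities a least-squares or lasso fit needs, \(X^T X\) and \(X^T y\), can be updated recursively. The sketch below assumes the discounting convention that old information is multiplied by (1 - forget) at each step; it illustrates the idea and is not taken from rolch's source. The function names are hypothetical.

```python
import numpy as np


def update_gram(gram, x_row, forget=0.0):
    """Discount the old Gram matrix and add the new 1 x J row's outer product."""
    return (1.0 - forget) * gram + x_row.T @ x_row


def update_xy(xy, x_row, y_new, forget=0.0):
    """Same recursion for the cross-product X^T y; y_new is a scalar."""
    return (1.0 - forget) * xy + x_row.ravel() * y_new


X_demo = np.arange(12.0).reshape(4, 3)
y_demo = np.array([1.0, 2.0, 3.0, 4.0])

gram_online = np.zeros((3, 3))
xy_online = np.zeros(3)
for i in range(4):
    row = X_demo[i : i + 1]  # shape (1, 3), i.e. 1 x J
    gram_online = update_gram(gram_online, row, forget=0.1)
    xy_online = update_xy(xy_online, row, y_demo[i], forget=0.1)

# Batch equivalent: weight observation i by 0.9 ** (age of observation i).
weights = 0.9 ** np.arange(3, -1, -1)
gram_batch = X_demo.T @ (weights[:, None] * X_demo)
xy_batch = X_demo.T @ (weights * y_demo)
```

The recursive and the batch results coincide, so the estimator never has to store past observations.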