Estimators

Estimator classes provide an sklearn-like API to fit, predict and update models through the correspondingly named methods fit, predict and update.

Online GAMLSS

rolch.OnlineGamlss

Bases: Estimator

The online/incremental GAMLSS class.

__init__

__init__(distribution: Distribution, equation: Dict, forget: float | Dict[int, float] = 0.0, method: Union[str, EstimationMethod, Dict[int, str], Dict[int, EstimationMethod]] = 'ols', scale_inputs: bool | ndarray = True, fit_intercept: Union[bool, Dict[int, bool]] = True, regularize_intercept: Union[bool, Dict[int, bool]] = False, ic: Union[str, Dict] = 'aic', max_it_outer: int = 30, max_it_inner: int = 30, abs_tol_outer: float = 0.001, abs_tol_inner: float = 0.001, rel_tol_outer: float = 1e-05, rel_tol_inner: float = 1e-05, rss_tol_inner: float = 1.5, verbose: int = 0, debug: bool = False)

The OnlineGamlss() provides the fit, update and predict methods for linear parametric GAMLSS models.

For a response variable \(Y\) which is distributed according to the distribution \(\mathcal{F}(\theta)\) with the distribution parameters \(\theta\), we model:

\[g_k(\theta_k) = \eta_k = X_k\beta_k\]

where \(g_k(\cdot)\) is a link function, which ensures that the predicted distribution parameters are in a sensible range (we don't want, e.g., negative standard deviations), and \(\eta_k\) is the predictor (on the space of the link function). The model is fitted using iteratively reweighted least squares (IRLS).
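
As a minimal sketch of the relation above (plain Python, not ROLCH's internals), assume a scale parameter modelled with a log link. The helper names `linear_predictor` and `inverse_log_link` and all numbers are hypothetical and only illustrate why the link keeps the parameter in a valid range.

```python
import math

def linear_predictor(X, beta):
    """Compute eta = X @ beta for a design matrix given as a list of rows."""
    return [sum(x_j * b_j for x_j, b_j in zip(row, beta)) for row in X]

def inverse_log_link(eta):
    """Map the predictor back to the parameter space: theta = exp(eta) > 0."""
    return [math.exp(e) for e in eta]

# Design matrix with an intercept column and one covariate (illustrative values).
X = [[1.0, -2.0], [1.0, 0.0], [1.0, 3.0]]
beta_scale = [0.5, -1.0]

eta = linear_predictor(X, beta_scale)  # the predictor can be negative
sigma = inverse_log_link(eta)          # the scale parameter is always positive
```

The predictor is unconstrained, while the back-transformed parameter cannot become negative, which is exactly the role of \(g_k(\cdot)\).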

Note

If you're facing issues with non-convergence and/or matrix inversion problems, please enable the debug mode and increase the logging level by increasing verbose. In debug mode, the estimator saves the weights, working vectors and derivatives of each iteration in a corresponding dictionary, e.g. self._debug_weights. The keys are tuples of ints of the form (parameter, outer_iteration, inner_iteration). Very small and/or very large weights (implicitly second derivatives) can be a sign that either the start values are not chosen appropriately or that the distributional assumption does not fit the data well.
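
One way to scan such a dictionary for suspicious weights is sketched below. The dictionary here is a hand-made stand-in for self._debug_weights, and the function name `find_extreme_weights` as well as the thresholds are assumptions chosen for illustration, not part of ROLCH.

```python
def find_extreme_weights(debug_weights, low=1e-8, high=1e8):
    """Return the (parameter, outer_iteration, inner_iteration) keys whose
    weights fall outside [low, high]. The thresholds are arbitrary choices."""
    flagged = []
    for key in sorted(debug_weights):
        weights = debug_weights[key]
        if min(weights) < low or max(weights) > high:
            flagged.append(key)
    return flagged

# Hand-made stand-in for the debug dictionary (values are illustrative).
debug_weights = {
    (0, 0, 0): [0.9, 1.1, 1.0],
    (1, 0, 0): [1e-12, 0.8, 1.2],  # suspiciously small weight
    (1, 0, 1): [0.7, 2.5e9, 1.0],  # suspiciously large weight
}
flagged = find_extreme_weights(debug_weights)
```

The flagged keys point to the distribution parameter and iteration where the fit starts to degenerate.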

Warning

Please don't use debug mode for production models: it saves the X matrix and its scaled counterpart, so you will get large estimator objects.

Parameters:

Name Type Description Default
distribution Distribution

The parametric distribution.

required
equation Dict

The modelling equation. Follows the schema {parameter[int]: column_identifier}, where column_identifier can be one of the strings 'all' or 'intercept', or an np.array of ints indicating the columns.

required
forget Union[float, Dict[int, float]]

The forget factor. Defaults to 0.0.

0.0
method Union[str, EstimationMethod, Dict[int, str], Dict[int, EstimationMethod]]

The estimation method. Defaults to "ols".

'ols'
scale_inputs bool | ndarray

Whether to scale the input matrices. Defaults to True.

True
fit_intercept Union[bool, Dict[int, bool]]

Whether to fit an intercept. Defaults to True.

True
regularize_intercept Union[bool, Dict[int, bool]]

Whether to regularize the intercept. Defaults to False.

False
ic Union[str, Dict]

Information criterion for model selection. Defaults to "aic".

'aic'
max_it_outer int

Maximum outer iterations for the RS algorithm. Defaults to 30.

30
max_it_inner int

Maximum inner iterations for the RS algorithm. Defaults to 30.

30
abs_tol_outer float

Absolute tolerance on the deviance in the outer fit. Defaults to 1e-3.

0.001
abs_tol_inner float

Absolute tolerance on the deviance in the inner fit. Defaults to 1e-3.

0.001
rel_tol_outer float

Relative tolerance on the deviance in the outer fit. Defaults to 1e-5.

1e-05
rel_tol_inner float

Relative tolerance on the deviance in the inner fit. Defaults to 1e-5.

1e-05
rss_tol_inner float

Tolerance for increasing RSS in the inner fit. Defaults to 1.5.

1.5
verbose int

Verbosity level. Level 0 will print no messages. Level 1 will print messages at the start and end of each fit / update call and on finished outer iterations. Level 2 will print messages on each parameter fit in each outer iteration. Level 3 will print messages on each inner iteration. Defaults to 0.

0
debug bool

Enable debug mode. Debug mode will save additional data to the estimator object. Currently, we save

* self._debug_X_dict
* self._debug_X_scaled
* self._debug_weights
* self._debug_working_vectors
* self._debug_dl1dlp1
* self._debug_dl2dlp2
* self._debug_eta

to the estimator. Debug mode works in batch and online settings. Note that debug mode is not recommended for production use. Defaults to False.

False
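
To make the equation schema from the parameter list above concrete, the sketch below resolves each column_identifier into explicit column indices. The function `resolve_columns` is hypothetical and not ROLCH's implementation, and a plain list stands in for the np.array of ints to keep the example dependency-free.

```python
def resolve_columns(column_identifier, n_columns):
    """Translate one equation entry into a list of column indices."""
    if isinstance(column_identifier, str):
        if column_identifier == "all":
            return list(range(n_columns))
        if column_identifier == "intercept":
            return []  # no covariates: the parameter is modelled by an intercept only
        raise ValueError(f"Unknown column identifier: {column_identifier!r}")
    return [int(j) for j in column_identifier]

# Keys are distribution parameters, values are column identifiers.
equation = {
    0: "all",        # use every column of X
    1: [0, 2],       # use the first and third column
    2: "intercept",  # intercept-only model
}
resolved = {
    param: resolve_columns(cols, n_columns=4) for param, cols in equation.items()
}
```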

_add_lags staticmethod

_add_lags(y: ndarray, x: ndarray, lags: Union[int, ndarray]) -> Tuple[np.ndarray, np.ndarray]

Add lagged variables to the response and covariate matrices.

Parameters:

Name Type Description Default
y ndarray

Response variable.

required
x ndarray

Covariate matrix.

required
lags Union[int, ndarray]

Number of lags to add.

required

Returns:

Type Description
Tuple[ndarray, ndarray]

Tuple[np.ndarray, np.ndarray]: Tuple containing the updated response and covariate matrices.
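
The docstring does not spell out how the lags are constructed. One plausible reading, sketched here under that assumption, appends lagged values of the response as extra covariate columns and drops the leading rows for which no lagged value exists. The function `add_lags` below is a hypothetical stand-in written with plain lists, not the library's code.

```python
def add_lags(y, x, lags):
    """Append lagged responses as columns of x and trim the first max(lags) rows."""
    if isinstance(lags, int):
        lags = list(range(1, lags + 1))  # an int n means lags 1..n
    else:
        lags = [int(lag) for lag in lags]
    if not lags:
        return list(y), [list(row) for row in x]
    max_lag = max(lags)
    new_y = list(y[max_lag:])
    new_x = [
        list(x[t]) + [y[t - lag] for lag in lags]
        for t in range(max_lag, len(y))
    ]
    return new_y, new_x

y = [1, 2, 3, 4, 5]
x = [[10], [20], [30], [40], [50]]
new_y, new_x = add_lags(y, x, lags=2)
```

With two lags, the first two observations are lost because their lagged values are not available.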

_is_intercept_only

_is_intercept_only(param: int)

Check in the equation whether the given distribution parameter is modelled by an intercept only.

_make_intercept staticmethod

_make_intercept(n_observations: int) -> np.ndarray

Make the intercept series as N x 1 array.

Parameters:

Name Type Description Default
n_observations int

Number of observations \(N\), i.e. the number of rows of the intercept array.

required

Returns:

Type Description
ndarray

np.ndarray: Intercept array.

_process_equation

_process_equation(equation: Dict)

Preprocess the equation object and validate inputs.

fit

fit(X: ndarray, y: ndarray, sample_weight: Optional[ndarray] = None)

Fit the online GAMLSS model.

Note

The user is only required to provide the design matrix \(X\) for the first distribution parameter. If no design matrix is provided for some distribution parameter, ROLCH will model that parameter using an intercept only.

Note

The provision of bounds for the coefficient vectors is only possible for LASSO/coordinate descent estimation.
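
Bounds combine naturally with coordinate descent because each coefficient is updated on its own, so the constrained update can be obtained by clipping the unconstrained one. The sketch below shows this textbook single-coordinate LASSO update; it is an illustration under that assumption, not ROLCH's code, and the names `soft_threshold` and `bounded_coordinate_update` are hypothetical.

```python
def soft_threshold(value, threshold):
    """Soft-thresholding operator used in LASSO coordinate descent."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def bounded_coordinate_update(rho, x_sq_norm, lam, lower, upper):
    """Update one coefficient and clip it to [lower, upper].

    rho is the inner product of the j-th column with the partial residual,
    x_sq_norm is the squared norm of the j-th column, lam is the penalty.
    """
    unconstrained = soft_threshold(rho, lam) / x_sq_norm
    return min(max(unconstrained, lower), upper)

free = bounded_coordinate_update(3.0, 2.0, 1.0, float("-inf"), float("inf"))
capped = bounded_coordinate_update(3.0, 2.0, 1.0, 0.0, 0.5)
non_negative = bounded_coordinate_update(-3.0, 2.0, 1.0, 0.0, 0.5)
```

Because the one-dimensional subproblem is convex, clipping the unconstrained minimizer yields the constrained minimizer for that coordinate.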

Parameters:

Name Type Description Default
X ndarray

Data Matrix. Currently supporting only numpy, will support pandas and polars in the future.

required
y ndarray

Response variable \(Y\).

required
sample_weight Optional[ndarray]

User-defined sample weights. Defaults to None.

None
beta_bounds Dict[int, Tuple]

Bounds for the coefficients \(\beta\) in the coordinate descent algorithm. The user needs to provide a dict that maps the distribution parameters (0, 1, 2 and potentially 3) to tuples of bounds. Defaults to None.

required

predict

predict(X: ndarray, what: str = 'response', return_contributions: bool = False) -> np.ndarray

Predict the distribution parameters given input data.

Parameters:

Name Type Description Default
X ndarray

Design matrix.

required
what str

Predict the response or the link. Defaults to "response". Remember that GAMLSS models \(g(\theta) = X^T\beta\). With what="link", the output is \(X^T\beta\); with what="response", the output is \(g^{-1}(X^T\beta)\). Usually, you want what="response".

'response'
return_contributions bool

Whether to return a Tuple[prediction, contributions] where the contributions of the individual covariates for each distribution parameter's predicted value is specified. Defaults to False.

False

Raises:

Type Description
ValueError

Raises if what is not in ["link", "response"].

Returns:

Type Description
ndarray

np.ndarray: Predicted values for the distribution.
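
The difference between the two values of what, and the idea behind return_contributions, can be sketched for a single distribution parameter with a log link. The function `predict_sketch` is hypothetical, and treating contributions as the per-covariate terms of the linear predictor is an assumption about their meaning, not a statement about ROLCH's output format.

```python
import math

def predict_sketch(X, beta, what="response", return_contributions=False):
    """Predict one distribution parameter under a log link (illustrative only)."""
    if what not in ("link", "response"):
        raise ValueError("what must be 'link' or 'response'.")
    # Each covariate contributes x_j * beta_j to the linear predictor.
    contributions = [[x_j * b_j for x_j, b_j in zip(row, beta)] for row in X]
    eta = [sum(row) for row in contributions]
    prediction = eta if what == "link" else [math.exp(e) for e in eta]
    if return_contributions:
        return prediction, contributions
    return prediction

X = [[1.0, 2.0]]
beta = [0.5, 0.25]
on_link_scale = predict_sketch(X, beta, what="link")
on_response_scale = predict_sketch(X, beta, what="response")
_, contributions = predict_sketch(X, beta, what="link", return_contributions=True)
```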

update

update(X: ndarray, y: ndarray, sample_weight: Optional[ndarray] = None)

Update the fit for the online GAMLSS Model.

Warning

Currently, the algorithm only takes single-step updates. Batch updates are planned for the first stable version.

Note

The beta_bounds from the initial fit are still valid for the update.

Parameters:

Name Type Description Default
X ndarray

Data Matrix. Currently supporting only numpy, will support pandas and polars in the future.

required
y ndarray

Response variable \(Y\).

required
sample_weight Optional[ndarray]

User-defined sample weights. Defaults to None.

None

Linear Models

rolch.OnlineLinearModel

Bases: Estimator

Simple Online Linear Regression for the expected value.

__init__

__init__(forget: float = 0, scale_inputs: bool | ndarray = True, fit_intercept: bool = True, regularize_intercept: bool = False, method: EstimationMethod | str = 'ols', ic: Literal['aic', 'bic', 'hqc', 'max'] = 'bic')

The basic linear model for many different estimation techniques.

Parameters:

Name Type Description Default
forget float

Exponential discounting of old observations. Defaults to 0.

0
scale_inputs bool | ndarray

Whether to scale the \(X\) matrix. Defaults to True.

True
fit_intercept bool

Whether to add an intercept in the estimation. Defaults to True.

True
regularize_intercept bool

Whether to regularize the intercept. Defaults to False.

False
method EstimationMethod | str

The estimation method. Can be a string or EstimationMethod class. Defaults to "ols".

'ols'
ic Literal['aic', 'bic', 'hqc', 'max']

The information criterion for model selection. Defaults to "bic".

'bic'

Raises:

Type Description
ValueError

Raised if you try to regularize the intercept but do not fit it.

fit

fit(X: ndarray, y: ndarray, sample_weight: Optional[ndarray] = None) -> None

Initial fit of the online regression model.

Parameters:

Name Type Description Default
X ndarray

The design matrix \(X\).

required
y ndarray

The response vector \(y\).

required
sample_weight Optional[ndarray]

The sample weights. Defaults to None.

None

predict

predict(X: ndarray) -> np.ndarray

Predict using the optimal IC selection.

Parameters:

Name Type Description Default
X ndarray

The design matrix \(X\).

required

Returns:

Type Description
ndarray

np.ndarray: The predictions for the optimal IC.
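
Selecting by information criterion can be pictured as scoring every candidate model and keeping the one with the smallest score. The sketch below uses one common Gaussian formulation based on the residual sum of squares; ROLCH's exact formula may differ, the "max" option is not covered, and the function names and numbers are hypothetical.

```python
import math

def information_criterion(n, rss, n_nonzero, ic="bic"):
    """Gaussian information criterion from the residual sum of squares."""
    fit_term = n * math.log(rss / n)
    if ic == "aic":
        penalty = 2.0 * n_nonzero
    elif ic == "bic":
        penalty = math.log(n) * n_nonzero
    elif ic == "hqc":
        penalty = 2.0 * math.log(math.log(n)) * n_nonzero
    else:
        raise ValueError(f"Unsupported criterion in this sketch: {ic!r}")
    return fit_term + penalty

def select_best(n, rss_path, nonzero_path, ic="bic"):
    """Return the index of the candidate model with the smallest criterion."""
    scores = [
        information_criterion(n, rss, k, ic)
        for rss, k in zip(rss_path, nonzero_path)
    ]
    return min(range(len(scores)), key=scores.__getitem__)

# Illustrative path: more coefficients reduce the RSS, with diminishing returns.
rss_path = [250.0, 120.0, 100.0, 99.5]
nonzero_path = [0, 1, 2, 5]
best = select_best(100, rss_path, nonzero_path, ic="bic")
```

In this toy path the last model barely improves the fit, so its extra coefficients are not worth the penalty.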

predict_path

predict_path(X: ndarray) -> np.ndarray

Predict the full regularization path.

Parameters:

Name Type Description Default
X ndarray

The design matrix \(X\).

required

Returns:

Type Description
ndarray

np.ndarray: The predictions for the full path.

update

update(X: ndarray, y: ndarray, sample_weight: Optional[ndarray] = None) -> None

Update the regression model.

Parameters:

Name Type Description Default
X ndarray

The new row of the design matrix \(X\). Needs to be of shape 1 x J.

required
y ndarray

The new observation of \(y\).

required
sample_weight Optional[ndarray]

The weight for the new observations. None implies all observations have weight 1. Defaults to None.

None
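
A standard way to realise such single-row updates with exponential discounting is to keep the Gram matrix \(X^TX\) and the cross product \(X^Ty\) and to down-weight them by (1 - forget) before adding the new observation. The sketch below assumes this formulation for illustration; it is not taken from ROLCH's source, and `update_gram` and `update_cross` are hypothetical names.

```python
def update_gram(gram, x_row, forget=0.0, weight=1.0):
    """Discount the old Gram matrix and add the outer product of the new row."""
    size = len(x_row)
    return [
        [(1.0 - forget) * gram[i][j] + weight * x_row[i] * x_row[j]
         for j in range(size)]
        for i in range(size)
    ]

def update_cross(cross, x_row, y_new, forget=0.0, weight=1.0):
    """Discount the old cross product X^T y and add the new observation."""
    return [
        (1.0 - forget) * c + weight * x_i * y_new
        for c, x_i in zip(cross, x_row)
    ]

rows = [[1.0, 2.0], [1.0, 3.0], [1.0, 5.0]]  # each row has shape 1 x J
targets = [1.0, 2.0, 4.0]
forget = 0.1
gram = [[0.0, 0.0], [0.0, 0.0]]
cross = [0.0, 0.0]
for x_row, y_new in zip(rows, targets):
    gram = update_gram(gram, x_row, forget)
    cross = update_cross(cross, x_row, y_new, forget)
```

After three updates, the oldest observation enters with weight (1 - forget)^2, so a larger forget factor makes the model adapt faster to recent data.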

rolch.OnlineLasso

Bases: OnlineLinearModel

__init__

__init__(forget: float = 0, scale_inputs: bool = True, fit_intercept: bool = True, regularize_intercept: bool = False, ic: Literal['aic', 'bic', 'hqc', 'max'] = 'bic', beta_lower_bound: ndarray | None = None, beta_upper_bound: ndarray | None = None, lambda_n: int = 100, lambda_eps: float = 0.0001, start_value: str = 'previous_fit', tolerance: float = 0.0001, max_iterations: int = 1000, selection: Literal['cyclic', 'random'] = 'cyclic')

Online LASSO estimator class.

This class initializes the online linear regression fitted using LASSO. The estimator object provides three main methods, estimator.fit(X, y), estimator.update(X, y) and estimator.predict(X).

Parameters:

Name Type Description Default
forget float

Exponential discounting of old observations. Defaults to 0.

0
scale_inputs bool

Whether to scale the \(X\) matrix. Defaults to True.

True
fit_intercept bool

Whether to add an intercept in the estimation. Defaults to True.

True
regularize_intercept bool

Whether to regularize the intercept. Defaults to False.

False
ic Literal['aic', 'bic', 'hqc', 'max']

The information criterion for model selection. Defaults to "bic".

'bic'
beta_lower_bound ndarray | None

Lower bounds for beta. Keep in mind the size of X and whether you want to fit an intercept. None corresponds to unconstrained estimation. Defaults to None.

None
beta_upper_bound ndarray | None

Upper bounds for beta. Keep in mind the size of X and whether you want to fit an intercept. None corresponds to unconstrained estimation. Defaults to None.

None
lambda_n int

Length of the regularization path. Defaults to 100.

100
lambda_eps float

The largest regularization \(\lambda^{\max}\) is determined automatically such that the solution is fully regularized. The smallest regularization is taken as \(\varepsilon \lambda^{\max}\), and the values in between lie on an exponential grid. Defaults to 1e-4.

0.0001
start_value str

Whether to choose the previous fit or the previous regularization as start value. Defaults to "previous_fit".

'previous_fit'
tolerance float

Convergence tolerance for stopping the coordinate descent (CD). Defaults to 1e-4.

0.0001
max_iterations int

Max number of CD iterations. Defaults to 1000.

1000
selection Literal['cyclic', 'random']

Whether to cycle through all coordinates in order or at random. For large problems, random selection might speed up convergence. Defaults to "cyclic".

'cyclic'
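
The interplay of lambda_n and lambda_eps can be illustrated with a small sketch of an exponential grid that runs from the largest regularization down to lambda_eps times that value. The function `lambda_grid` is hypothetical, and the largest regularization is passed in by hand here, whereas the estimator determines it automatically.

```python
def lambda_grid(lambda_max, lambda_eps=1e-4, lambda_n=100):
    """Exponentially spaced grid from lambda_max down to lambda_eps * lambda_max."""
    if lambda_n < 2:
        raise ValueError("lambda_n must be at least 2 for a grid.")
    return [
        lambda_max * lambda_eps ** (i / (lambda_n - 1))
        for i in range(lambda_n)
    ]

# Five grid points between 10 and 10 * 1e-4 (illustrative values).
grid = lambda_grid(lambda_max=10.0, lambda_eps=1e-4, lambda_n=5)
```

Consecutive grid values differ by a constant factor, so the path is dense where the coefficients typically change quickly, i.e. for small regularization.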