Reference#

Package `remodels`#

The remodels package provides a set of tools and modules for working with electricity market data. This section outlines the main components of the package: data retrieval, transformers, and pipelines.

Data Module#

class remodels.data.EntsoeApi.EntsoeApi(security_token)#

The EntsoeApi class provides an interface to the ENTSO-E Transparency Platform API, offering access to data about the European electricity market.

This class simplifies fetching and handling data from the ENTSO-E Transparency Platform API, making it easier to integrate into energy analysis projects. It requires a security token for authentication and provides methods to query data such as electricity prices and load.

Initializes the EntsoeApi class with the provided security token.

Parameters:: security_token (str) – The security token for accessing the API.

get_day_ahead_pricing(start_date, end_date, in_domain, resolution_preference=None)#

Retrieves day-ahead pricing data.

Retrieves day-ahead pricing data from the API for a given domain and date range.

Parameters:

start_date (datetime) – The start date for the data retrieval.
end_date (datetime) – The end date for the data retrieval.
in_domain (str) – The market domain for which to retrieve pricing data.
resolution_preference (int, optional) – The preferred resolution of the pricing data, in minutes.

Returns:

A DataFrame containing day-ahead pricing data.

Return type:

pd.DataFrame

get_forecast_load(start_date, end_date, out_domain)#

Retrieves forecasted load data.

Retrieves forecasted load data from the API for a given domain and date range.

Parameters:

start_date (datetime) – The start date for the data retrieval.
end_date (datetime) – The end date for the data retrieval.
out_domain (str) – The market domain for which to retrieve load forecast data.

Returns:

A DataFrame containing forecasted load data.

Return type:

pd.DataFrame

Transformers Module#

class remodels.transformers.BaseScaler.BaseScaler#

Custom scaler base class following scikit-learn’s conventions.

fit(X, y=None)#

Fit the scaler to the data. Placeholder that does nothing.

Parameters:

X (array-like) – Input data.
y (array-like, optional) – Target values. Defaults to None.

Returns:

Returns self.

Return type:

BaseScaler

fit_transform(X, y=None)#

Fit to data, then transform it.

Parameters:

X (np.ndarray) – Features to fit and transform.
y (np.ndarray, optional) – Optional target to fit and transform.

Returns:

Transformed features and optionally transformed target.

Return type:

tuple or np.ndarray

inverse_transform(X, y=None)#

Inverse transform the data. Placeholder that should be overridden by subclasses.

Parameters:

X (array-like) – Input data to transform.
y (array-like, optional) – Target values. Defaults to None.

Returns:

Transformed data.

Return type:

array-like

transform(X, y=None)#

Transforms the data. Placeholder that should be overridden by subclasses.

Parameters:

X (array-like) – Input data to transform.
y (array-like, optional) – Target values. Defaults to None.

Returns:

Transformed data.

Return type:

pd.DataFrame or Tuple[pd.DataFrame, pd.DataFrame]

class remodels.transformers.StandardizingScaler.StandardizingScaler(method='median')#

A custom scaler for standardizing data using either the median or mean method.

This scaler is suitable for preprocessing datasets in preparation for machine learning models. It standardizes the data, bringing it to a common scale without distorting differences in the range of values. The scaler can operate using either the median or mean to calculate the center and scale of the data.

Initialize the StandardizingScaler with the chosen method of centering and scaling.

Parameters:: method (str) – Method to use for centering (‘median’ or ‘mean’).
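
As a rough illustration of the two centering methods, the sketch below standardizes a list of numbers in plain Python. It is not the library's implementation: the function names are hypothetical, and pairing the mean with the standard deviation and the median with the median absolute deviation is an assumption, since the reference above does not say how the scale is computed.

```python
import statistics


def standardize(values, method="median"):
    """Center and scale a list of numbers; returns (scaled, center, scale)."""
    if method == "mean":
        center = statistics.fmean(values)
        scale = statistics.pstdev(values)
    elif method == "median":
        center = statistics.median(values)
        # Assumption: median absolute deviation as the robust scale estimate.
        scale = statistics.median(abs(v - center) for v in values)
    else:
        raise ValueError("method must be 'median' or 'mean'")
    scaled = [(v - center) / scale for v in values]
    return scaled, center, scale


def unstandardize(scaled, center, scale):
    """Undo standardize() using the stored center and scale."""
    return [s * scale + center for s in scaled]


prices = [30.0, 32.0, 35.0, 31.0, 250.0]  # one price spike
scaled_median, c_med, s_med = standardize(prices, "median")
scaled_mean, c_mean, s_mean = standardize(prices, "mean")
restored = unstandardize(scaled_median, c_med, s_med)
```

With the median method the spike barely moves the center (32.0 here), whereas the mean is pulled up to 75.6.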

fit(X, y=None)#

Fit the scaler to the features X and optionally to the target y.

Parameters:

X (pd.DataFrame) – Features to fit.
y (pd.DataFrame, optional) – Optional target to fit.

Returns:

The fitted scaler.

Return type:

StandardizingScaler

fit_transform(X, y=None)#

Fit to data, then transform it.

Parameters:

X (np.ndarray) – Features to fit and transform.
y (np.ndarray, optional) – Optional target to fit and transform.

Returns:

Transformed features and optionally transformed target.

Return type:

tuple or np.ndarray

inverse_transform(X, y=None)#

Apply the inverse transformation to the features X and optionally the target y.

Parameters:

X (array-like, optional) – Transformed features to inverse transform.
y (array-like, optional) – Transformed target to inverse transform.

Returns:

The original features and target.

Return type:

tuple

transform(X, y=None)#

Transform the features X and optionally the target y using the fitted scaler.

Parameters:

X (pd.DataFrame) – Features to transform.
y (pd.DataFrame, optional) – Optional target to transform.

Returns:

The transformed features and optionally the transformed target.

Return type:

tuple

class remodels.transformers.DSTAdjuster.DSTAdjuster#

A transformer for adjusting time series data to account for Daylight Saving Time (DST) changes.

This class provides functionality to modify time series data by removing timezone information and resampling to an hourly frequency. It’s designed to handle potential issues arising from DST transitions, such as duplicate or missing timestamps. The transformer can be used with any time series data that includes timezone-aware datetime indices.

Initialize the DSTAdjuster.
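
The sketch below shows the kind of problem this transformer addresses, using only the standard library. It is not the library's implementation: the function name is hypothetical, and the policy of averaging a repeated hour and carrying the previous value into a skipped hour is an assumption made for illustration.

```python
from datetime import datetime, timedelta


def to_regular_hourly(observations):
    """Return an unbroken hourly series from (naive_local_time, value) pairs."""
    if not observations:
        return []
    # A repeated local hour (autumn clock change) is averaged into one value.
    grouped = {}
    for ts, value in observations:
        grouped.setdefault(ts, []).append(value)
    averaged = {ts: sum(vs) / len(vs) for ts, vs in grouped.items()}

    # Walk a gap-free hourly grid; a skipped local hour (spring clock change)
    # reuses the most recent value (assumed fill policy).
    current, end = min(averaged), max(averaged)
    last_value = averaged[current]
    result = []
    while current <= end:
        last_value = averaged.get(current, last_value)
        result.append((current, last_value))
        current += timedelta(hours=1)
    return result


spring = [(datetime(2024, 3, 31, 1), 40.0), (datetime(2024, 3, 31, 3), 44.0)]
autumn = [
    (datetime(2024, 10, 27, 2), 10.0),
    (datetime(2024, 10, 27, 2), 20.0),
    (datetime(2024, 10, 27, 3), 30.0),
]
spring_hourly = to_regular_hourly(spring)
autumn_hourly = to_regular_hourly(autumn)
```

Both outputs have exactly one row per hour, which is the property downstream steps usually rely on.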

fit(X, y=None)#

Fit the transformer to the data.

This transformer does not learn anything from the data and hence the fit method is a placeholder that returns self.

Parameters:

X (pd.DataFrame) – Features to fit.
y (pd.Series, optional) – Optional target to fit. Not used in this transformer.

Returns:

The fitted transformer.

Return type:

DSTAdjuster

fit_transform(X, y=None)#

Fit to data, then transform it.

Parameters:

X (np.ndarray) – Features to fit and transform.
y (np.ndarray, optional) – Optional target to fit and transform.

Returns:

Transformed features and optionally transformed target.

Return type:

tuple or np.ndarray

inverse_transform(X, y=None)#

Inverse transform the data. Placeholder that should be overridden by subclasses.

Parameters:

X (array-like) – Input data to transform.
y (array-like, optional) – Target values. Defaults to None.

Returns:

Transformed data.

Return type:

array-like

transform(X, y=None)#

Transform the time series data to adjust for DST changes.

Parameters:

X (pd.DataFrame) – Time series data to transform.
y (pd.Series, optional) – Optional target series corresponding to the time series data.

Returns:

Adjusted time series data, and optionally the target series.

Return type:

pd.DataFrame, pd.Series

VSTransformers SubModule#

class remodels.transformers.VSTransformers.ArcsinhScaler.ArcsinhScaler#

A scaler that applies an arcsinh (inverse hyperbolic sine) transformation to the data.

This scaler is useful for handling data with skewed distributions and can help in stabilizing the variance of the data. The transformation is stateless and does not depend on the data itself, meaning no fitting is required.

The scaler also provides an inverse transformation function, which applies the sinh (hyperbolic sine) transformation to revert the data back to its original scale.
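
The round trip described above can be sketched with the standard library alone. The function names are hypothetical and operate on plain lists rather than DataFrames; the point is only that sinh exactly undoes arcsinh and that no fitted state is involved.

```python
import math


def arcsinh_transform(values):
    """Apply the inverse hyperbolic sine element-wise."""
    return [math.asinh(v) for v in values]


def arcsinh_inverse(values):
    """Apply the hyperbolic sine element-wise, undoing arcsinh_transform."""
    return [math.sinh(v) for v in values]


prices = [-50.0, 0.0, 1.0, 40.0, 3000.0]  # negative values are allowed
transformed = arcsinh_transform(prices)
restored = arcsinh_inverse(transformed)
```

Unlike a plain logarithm, arcsinh is defined for zero and negative inputs, which is why the sample includes both.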

fit(X, y=None)#

Fit the scaler to the data. Placeholder that does nothing.

Parameters:

X (array-like) – Input data.
y (array-like, optional) – Target values. Defaults to None.

Returns:

Returns self.

Return type:

BaseScaler

fit_transform(X, y=None)#

Fit to data, then transform it.

Parameters:

X (np.ndarray) – Features to fit and transform.
y (np.ndarray, optional) – Optional target to fit and transform.

Returns:

Transformed features and optionally transformed target.

Return type:

tuple or np.ndarray

inverse_transform(X, y=None)#

Apply the inverse arcsinh transformation to the features and optionally the target.

Parameters:

X (pd.DataFrame) – Transformed features to inverse transform.
y (pd.DataFrame, optional) – Transformed target to inverse transform.

Returns:

Inverse transformed features and optionally inverse transformed target.

Return type:

Tuple[DataFrame, DataFrame]

transform(X, y=None)#

Apply the arcsinh transformation to the features and optionally the target.

Parameters:

X (pd.DataFrame) – Features to transform.
y (pd.DataFrame, optional) – Optional target to transform.

Returns:

Transformed features and optionally transformed target.

Return type:

pd.DataFrame or Tuple[pd.DataFrame, pd.DataFrame]

class remodels.transformers.VSTransformers.BoxCoxScaler.BoxCoxScaler(lamb=0.5)#

A scaler that applies a Box-Cox transformation to the data.

The Box-Cox transformation is a statistical technique used to stabilize variance, make the data more normally distributed, and improve the validity of measures of association. It’s particularly effective for transforming non-normal dependent variables into a normal shape. The transformation is defined as:

Y(λ) = (X^λ - 1) / λ, if λ ≠ 0
Y(λ) = log(X), if λ = 0

where X is the original data and λ is the transformation parameter. In general, λ is chosen so that the transformed data are as close to normally distributed as possible; this scaler takes it from the lamb argument. A λ of 0 gives a log transformation, while other values give power transformations of varying strength.

This scaler includes both the Box-Cox transformation and its inverse, enabling reversible scaling of data.

Initialize the scaler with a lambda parameter for the Box-Cox transformation.

Parameters:: lamb (float) – Lambda parameter for the Box-Cox transformation.
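
The formula above and its inverse can be written out directly. This is a minimal scalar sketch with hypothetical function names, restricted to positive inputs; how the library itself treats zero or negative values is not stated in this reference and is not modeled here.

```python
import math


def box_cox(x, lamb=0.5):
    """Box-Cox transform of a single positive number."""
    if x <= 0:
        raise ValueError("this sketch only handles positive inputs")
    if lamb == 0:
        return math.log(x)
    return (x ** lamb - 1.0) / lamb


def box_cox_inverse(y, lamb=0.5):
    """Invert box_cox for the same lambda."""
    if lamb == 0:
        return math.exp(y)
    return (lamb * y + 1.0) ** (1.0 / lamb)


value = 4.0
forward = box_cox(value)            # (4 ** 0.5 - 1) / 0.5
backward = box_cox_inverse(forward)
log_case = box_cox(1.0, lamb=0)     # the lambda = 0 branch is a plain log
```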

fit(X, y=None)#

Fit the scaler to the data. Placeholder that does nothing.

Parameters:

X (array-like) – Input data.
y (array-like, optional) – Target values. Defaults to None.

Returns:

Returns self.

Return type:

BaseScaler

fit_transform(X, y=None)#

Fit to data, then transform it.

Parameters:

X (np.ndarray) – Features to fit and transform.
y (np.ndarray, optional) – Optional target to fit and transform.

Returns:

Transformed features and optionally transformed target.

Return type:

tuple or np.ndarray

inverse_transform(X, y=None)#

Apply the inverse Box-Cox transformation to the features and optionally the target.

Parameters:

X (pd.DataFrame) – Transformed features to inverse transform.
y (pd.DataFrame, optional) – Transformed target to inverse transform.

Returns:

Inverse transformed features and optionally inverse transformed target.

Return type:

pd.DataFrame or Tuple[pd.DataFrame, pd.DataFrame]

transform(X, y=None)#

Apply the transformation to the features and optionally the target.

Parameters:

X (pd.DataFrame) – Features to transform.
y (pd.DataFrame, optional) – Optional target to transform.

Returns:

Transformed features and optionally transformed target.

Return type:

pd.DataFrame or Tuple[pd.DataFrame, pd.DataFrame]

class remodels.transformers.VSTransformers.ClippingScaler.ClippingScaler(k=3)#

Scaler that clips feature and target values to within a specified number of standard deviations from the mean.

This scaler limits extreme values in the data by clipping them to a defined range based on a multiple of standard deviations. It is particularly useful for mitigating the effect of outliers in the data, making it more robust for various statistical analyses or machine learning models.

The scaler also provides an inverse transformation function to revert the data back to its original scale.

Initialize the ClippingScaler with a clipping threshold.

Parameters:: k (int or float) – The number of standard deviations to use as the clipping threshold.
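
A minimal sketch of the clipping step follows. It assumes the input has already been standardized to mean 0 and standard deviation 1, so that "k standard deviations from the mean" is simply the interval [-k, k]; the function name is hypothetical.

```python
def clip_values(values, k=3):
    """Limit each value to the interval [-k, k]."""
    return [max(-k, min(k, v)) for v in values]


standardized = [-4.2, -1.0, 0.3, 2.9, 7.5]
clipped = clip_values(standardized, k=3)
```

Clipping discards how far a value lay beyond the threshold, so an inverse step can reproduce the input exactly only for values that were already inside [-k, k].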

fit(X, y=None)#

Fit the scaler to the data.

This scaler does not learn anything from the data and hence the fit method is a placeholder that returns self.

Parameters:

X (np.ndarray or pd.DataFrame) – Features to fit.
y (np.ndarray or pd.Series, optional) – Optional target to fit. Not used in this scaler.

Returns:

The fitted scaler.

Return type:

ClippingScaler

fit_transform(X, y=None)#

Fit to data, then transform it.

Parameters:

X (np.ndarray) – Features to fit and transform.
y (np.ndarray, optional) – Optional target to fit and transform.

Returns:

Transformed features and optionally transformed target.

Return type:

tuple or np.ndarray

inverse_transform(X=None, y=None)#

Inverse transform the features and optionally the target by unclipping their values.

This method assumes the original data was within the range [-k, k].

Parameters:

X (np.ndarray or pd.DataFrame) – Transformed features to inverse transform.
y (np.ndarray or pd.Series, optional) – Transformed target to inverse transform.

Returns:

The original features and target.

Return type:

tuple

transform(X, y=None)#

Transform the features and optionally the target by clipping their values.

Parameters:

X (np.ndarray or pd.DataFrame) – Features to transform.
y (np.ndarray or pd.Series, optional) – Optional target to transform.

Returns:

The transformed features and optionally the transformed target.

Return type:

tuple

class remodels.transformers.VSTransformers.LogClippingScaler.LogClippingScaler(k=3)#

Scaler that applies a logarithmic transformation to values exceeding a specified threshold.

This scaler is designed to transform features by applying a logarithmic transformation, but only to values that exceed a certain threshold, ‘k’. This approach can be particularly useful in reducing the impact of outliers or extreme values in the data, while maintaining the scale of the rest of the data.

The scaler also provides an inverse transformation function to revert the data back to its original scale.

Initialize the scaler with a clipping threshold.

Parameters:: k (float) – Clipping threshold.
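
The reference does not give the exact formula, so the sketch below uses one common form of log clipping as an assumption: values with |x| <= k pass through unchanged, and larger values become sign(x) * (log(|x| - k + 1) + k), which is continuous at the threshold. The function names are hypothetical.

```python
import math


def log_clip(x, k=3.0):
    """Leave |x| <= k untouched; compress larger magnitudes logarithmically."""
    if abs(x) <= k:
        return x
    return math.copysign(math.log(abs(x) - k + 1.0) + k, x)


def log_clip_inverse(y, k=3.0):
    """Invert log_clip for the same threshold k."""
    if abs(y) <= k:
        return y
    return math.copysign(math.exp(abs(y) - k) + k - 1.0, y)


inside = log_clip(2.0)        # below the threshold, returned as-is
spike = log_clip(50.0)        # compressed, but still larger than k
restored = log_clip_inverse(spike)
```

Unlike hard clipping, this form keeps the ordering of extreme values, so the original data can be recovered.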

fit(X, y=None)#

Fit the scaler to the data. Placeholder that does nothing.

Parameters:

X (array-like) – Input data.
y (array-like, optional) – Target values. Defaults to None.

Returns:

Returns self.

Return type:

BaseScaler

fit_transform(X, y=None)#

Fit to data, then transform it.

Parameters:

X (np.ndarray) – Features to fit and transform.
y (np.ndarray, optional) – Optional target to fit and transform.

Returns:

Transformed features and optionally transformed target.

Return type:

tuple or np.ndarray

inverse_transform(X=None, y=None)#

Apply the inverse log clipping transformation to the features and optionally the target.

Parameters:

X (pd.DataFrame) – Transformed features to inverse transform.
y (pd.DataFrame, optional) – Transformed target to inverse transform.

Returns:

Original features and optionally original target.

Return type:

Tuple[pd.DataFrame, pd.DataFrame]

transform(X, y=None)#

Apply the transformation to the features and optionally the target.

Parameters:

X (pd.DataFrame) – Features to transform.
y (pd.DataFrame, optional) – Optional target to transform.

Returns:

Transformed features and optionally transformed target.

Return type:

pd.DataFrame or Tuple[pd.DataFrame, pd.DataFrame]

class remodels.transformers.VSTransformers.LogisticScaler.LogisticScaler#

Scaler that applies a logistic transformation to the data. This transformation converts each feature using the logistic function, which maps any real-valued number into the range (0, 1).

The transformation is particularly useful in preparing data for algorithms that expect input values to be in a bounded range. It can also help in dealing with features that have skewed distributions.

The logistic transformation is defined as:

1 / (1 + exp(-x))

where ‘x’ is the feature value.

The scaler provides both the transformation and its inverse, allowing the original scale of the data to be recovered.
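
The logistic function above and its inverse, the logit, can be sketched for a single value as follows. The function names are hypothetical, and the sketch sticks to moderate inputs where exp(-x) does not overflow.

```python
import math


def logistic(x):
    """Map a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Inverse of logistic: map a value in (0, 1) back to the real line."""
    return math.log(p / (1.0 - p))


inputs = [-5.0, -1.0, 0.0, 2.0, 5.0]
squashed = [logistic(x) for x in inputs]
restored = [logit(p) for p in squashed]
```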

fit(X, y=None)#

Fit the scaler to the data. Placeholder that does nothing.

Parameters:

X (array-like) – Input data.
y (array-like, optional) – Target values. Defaults to None.

Returns:

Returns self.

Return type:

BaseScaler

fit_transform(X, y=None)#

Fit to data, then transform it.

Parameters:

X (np.ndarray) – Features to fit and transform.
y (np.ndarray, optional) – Optional target to fit and transform.

Returns:

Transformed features and optionally transformed target.

Return type:

tuple or np.ndarray

inverse_transform(X=None, y=None)#

Apply the inverse logistic transformation to the features and optionally the target.

Parameters:

X (pd.DataFrame) – Transformed features to inverse transform.
y (pd.DataFrame, optional) – Transformed target to inverse transform.

Returns:

Original features and optionally original target.

Return type:

Tuple[pd.DataFrame, pd.DataFrame]

transform(X, y=None)#

Apply the logistic transformation to the features and optionally the target.

Parameters:

X (pd.DataFrame) – Features to transform.
y (pd.DataFrame, optional) – Optional target to transform.

Returns:

Transformed features and optionally transformed target.

Return type:

pd.DataFrame or Tuple[pd.DataFrame, pd.DataFrame]

class remodels.transformers.VSTransformers.MLogScaler.MLogScaler(c=0.3333333333333333)#

Scaler that applies a modified logarithmic transformation to the data. This transformation is designed to handle zero and negative values effectively by incorporating a small constant.

The transformation is defined as:

sign(x) * (log(|x| + 1/c) + log(c))

where ‘c’ is a positive constant. The 1/c offset keeps the logarithm defined at x = 0, and the log(c) term makes the transformation equal to 0 there. This transformation helps in stabilizing variance and normalizing distributions, and is especially useful for skewed data.

The scaler also provides an inverse transformation function to revert the data back to its original scale.

Initialize the scaler with a constant used in the transformation.

Parameters:: c (float) – Positive constant used in the transformation; defaults to 1/3.
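
A scalar sketch of the formula above, with hypothetical function names. The inverse is derived here rather than taken from the library: since log(|x| + 1/c) + log(c) = log(c * |x| + 1), solving for |x| gives (exp(|y|) - 1) / c.

```python
import math


def mlog(x, c=1.0 / 3.0):
    """Modified log transform: sign(x) * (log(|x| + 1/c) + log(c))."""
    return math.copysign(math.log(abs(x) + 1.0 / c) + math.log(c), x)


def mlog_inverse(y, c=1.0 / 3.0):
    """Invert mlog: sign(y) * (exp(|y|) - 1) / c."""
    return math.copysign((math.exp(abs(y)) - 1.0) / c, y)


inputs = [-120.0, -1.0, 0.0, 2.5, 300.0]
transformed = [mlog(x) for x in inputs]
restored = [mlog_inverse(y) for y in transformed]
```

The transform is odd, so negative inputs are handled by symmetry, and zero maps to zero up to floating-point rounding.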

fit(X, y=None)#

Fit the scaler to the data. Placeholder that does nothing.

Parameters:

X (array-like) – Input data.
y (array-like, optional) – Target values. Defaults to None.

Returns:

Returns self.

Return type:

BaseScaler

fit_transform(X, y=None)#

Fit to data, then transform it.

Parameters:

X (np.ndarray) – Features to fit and transform.
y (np.ndarray, optional) – Optional target to fit and transform.

Returns:

Transformed features and optionally transformed target.

Return type:

tuple or np.ndarray

inverse_transform(X=None, y=None)#

Inverse transform the features and optionally the target.

Parameters:

X (pd.DataFrame) – Transformed features to inverse transform.
y (pd.DataFrame, optional) – Transformed target to inverse transform.

Returns:

Original features and optionally original target.

Return type:

Tuple[pd.DataFrame, pd.DataFrame]

transform(X, y=None)#

Transform the features and optionally the target.

Parameters:

X (pd.DataFrame) – Features to transform.
y (pd.DataFrame, optional) – Optional target to transform.

Returns:

Transformed features and optionally transformed target.

Return type:

pd.DataFrame or Tuple[pd.DataFrame, pd.DataFrame]

class remodels.transformers.VSTransformers.PITScaler.PITScaler(distribution='normal', nu=8)#

Probability Integral Transform (PIT) Scaler applies a transformation to data based on a specified probability distribution.

This scaler transforms each feature using the cumulative distribution function (CDF) of the specified distribution, effectively mapping the empirical CDF of the data to the target distribution. This technique is often used in statistical modeling and forecasting to normalize data or make it conform to a certain distribution.

The scaler also provides an inverse transformation function to revert the data back to its original scale.

Initialize the PIT-Scaler.

Parameters:

distribution (str, optional) – Name of the target distribution. Defaults to “normal”.
nu (int, optional) – Parameter of the target distribution. Defaults to 8.
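
To make the idea concrete, the sketch below maps a sample to a standard normal target using only the standard library. It is an illustration under stated assumptions, not the library's code: the function name is hypothetical, the empirical CDF is taken as rank / (n + 1), and ties are broken by position.

```python
from statistics import NormalDist


def pit_to_normal(values):
    """Map values to normal scores via their empirical CDF."""
    n = len(values)
    target = NormalDist()  # standard normal target distribution
    order = sorted(range(n), key=lambda i: values[i])
    result = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u = rank / (n + 1)           # empirical CDF value, strictly in (0, 1)
        result[i] = target.inv_cdf(u)
    return result


prices = [10.0, 200.0, 30.0, 40.0, 35.0]
scores = pit_to_normal(prices)
```

The output keeps the ordering of the input while giving it the shape of the target distribution. Inverting the mapping requires keeping the fitted sample, which is why this scaler, unlike the stateless ones, has a meaningful fit step.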

fit(X, y=None)#

Fit the scaler to the data.

Parameters:

X (pd.DataFrame) – Input data.
y (pd.DataFrame, optional) – Target values. Defaults to None.

Returns:

Returns self.

Return type:

PITScaler

fit_transform(X, y=None)#

Fit to data, then transform it.

Parameters:

X (np.ndarray) – Features to fit and transform.
y (np.ndarray, optional) – Optional target to fit and transform.

Returns:

Transformed features and optionally transformed target.

Return type:

tuple or np.ndarray

inverse_transform(X=None, y=None)#

Inverse transform the features and optionally the target.

Parameters:

X (pd.DataFrame) – Transformed features to inverse transform.
y (pd.DataFrame, optional) – Transformed target to inverse transform.

Returns:

Original features and optionally original target.

Return type:

Tuple[pd.DataFrame, pd.DataFrame]

transform(X, y=None)#

Transforms the data.

Parameters:

X (pd.DataFrame) – Input data to transform.
y (pd.DataFrame, optional) – Target values. Defaults to None.

Returns:

Transformed data.

Return type:

pd.DataFrame or Tuple[pd.DataFrame, pd.DataFrame]

class remodels.transformers.VSTransformers.PolyScaler.PolyScaler(lamb=0.125, c=0.05)#

Scaler that applies a polynomial transformation to data, transforming each feature according to a polynomial function.

The transformation is defined as:

sign(x) * ((|x| + (c / lamb)^(1 / (lamb - 1)))^lamb - (c / lamb)^(lamb / (lamb - 1)))

where ‘lamb’ is the exponent parameter, and ‘c’ is a constant determining the curvature of the polynomial. This transformation can be particularly useful for stabilizing variance and making skewed distributions more symmetric.

The scaler also provides an inverse transformation function to revert the data back to its original scale.

Initialize the scaler with parameters for the polynomial transformation.

Parameters:

lamb (float) – Exponent used in the polynomial transformation.
c (float) – Constant that defines the curvature of the polynomial transformation.
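
The formula above can be sketched for a single value as follows. The function names are hypothetical, and the inverse is derived here rather than taken from the library: writing a = (c / lamb)^(1 / (lamb - 1)) and b = (c / lamb)^(lamb / (lamb - 1)), we have b = a^lamb, so |x| = (|y| + b)^(1 / lamb) - a.

```python
import math


def poly(x, lamb=0.125, c=0.05):
    """Polynomial transform: sign(x) * ((|x| + a)^lamb - b)."""
    a = (c / lamb) ** (1.0 / (lamb - 1.0))
    b = (c / lamb) ** (lamb / (lamb - 1.0))
    return math.copysign((abs(x) + a) ** lamb - b, x)


def poly_inverse(y, lamb=0.125, c=0.05):
    """Invert poly: sign(y) * ((|y| + b)^(1/lamb) - a)."""
    a = (c / lamb) ** (1.0 / (lamb - 1.0))
    b = (c / lamb) ** (lamb / (lamb - 1.0))
    return math.copysign((abs(y) + b) ** (1.0 / lamb) - a, y)


inputs = [-100.0, -2.0, 0.0, 5.0, 100.0]
transformed = [poly(x) for x in inputs]
restored = [poly_inverse(y) for y in transformed]
```

Because b equals a^lamb, the two constants cancel at x = 0, so the transform passes through the origin.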

fit(X, y=None)#

Fit the scaler to the data. Placeholder that does nothing.

Parameters:

X (array-like) – Input data.
y (array-like, optional) – Target values. Defaults to None.

Returns:

Returns self.

Return type:

BaseScaler

fit_transform(X, y=None)#

Fit to data, then transform it.

Parameters:

X (np.ndarray) – Features to fit and transform.
y (np.ndarray, optional) – Optional target to fit and transform.

Returns:

Transformed features and optionally transformed target.

Return type:

tuple or np.ndarray

inverse_transform(X, y=None)#

Inverse transform the features and optionally the target.

Parameters:

X (np.ndarray) – Transformed features to inverse transform.
y (np.ndarray, optional) – Transformed target to inverse transform.

Returns:

Original features and optionally original target.

Return type:

Tuple[pd.DataFrame, pd.DataFrame]

transform(X, y=None)#

Transform the features and optionally the target.

Parameters:

X (pd.DataFrame) – Features to transform.
y (pd.DataFrame, optional) – Optional target to transform.

Returns:

Transformed features and optionally transformed target.

Return type:

pd.DataFrame or Tuple[pd.DataFrame, pd.DataFrame]

Pipeline Module#

class remodels.pipelines.RePipeline.RePipeline(steps, *, memory=None, verbose=False)#

Custom implementation of the scikit-learn Pipeline class for additional functionality.

This class extends the standard scikit-learn Pipeline by adding specialized handling of steps that involve both features and target data, as well as inverse transformations.

Parameters:: steps (List[Any]) – The pipeline steps, given as a list of (name, estimator) tuples as in scikit-learn’s Pipeline.

property classes_#: The class labels. Only exists if the last step is a classifier.

decision_function(X, **params)#

Transform the data, and apply decision_function with the final estimator.

Call transform of each transformer in the pipeline. The transformed data are finally passed to the final estimator that calls decision_function method. Only valid if the final estimator implements decision_function.

Parameters:

X (iterable) – Data to predict on. Must fulfill input requirements of first step of the pipeline.
**params (dict of string -> object) –
Parameters requested and accepted by steps. Each step must have requested certain metadata for these parameters to be forwarded to them.

New in version 1.4: Only available if enable_metadata_routing=True. See Metadata Routing User Guide for more details.

Returns:

y_score – Result of calling decision_function on the final estimator.

Return type:

ndarray of shape (n_samples, n_classes)

property feature_names_in_#: Names of features seen during first step fit method.

fit(X, y=None, **fit_params)#

Fit the pipeline with the input and target data.

Parameters:

X (pd.DataFrame) – Input data to fit.
y (pd.DataFrame) – Target values.
fit_params – Additional fitting parameters.

Returns:

The fitted pipeline.

Return type:

RePipeline

fit_predict(X, y=None, **params)#

Transform the data, and apply fit_predict with the final estimator.

Call fit_transform of each transformer in the pipeline. The transformed data are finally passed to the final estimator that calls fit_predict method. Only valid if the final estimator implements fit_predict.

Parameters:

X (iterable) – Training data. Must fulfill input requirements of first step of the pipeline.
y (iterable, default=None) – Training targets. Must fulfill label requirements for all steps of the pipeline.
**params (dict of str -> object) –
- If enable_metadata_routing=False (default):
  
  Parameters to the predict called at the end of all transformations in the pipeline.
- If enable_metadata_routing=True:
  
  Parameters requested and accepted by steps. Each step must have requested certain metadata for these parameters to be forwarded to them.
New in version 0.20.

Changed in version 1.4: Parameters are now passed to the transform method of the intermediate steps as well, if requested, and if enable_metadata_routing=True.

See Metadata Routing User Guide for more details.

Note that while this may be used to return uncertainties from some models with return_std or return_cov, uncertainties that are generated by the transformations in the pipeline are not propagated to the final estimator.

Returns:

y_pred – Result of calling fit_predict on the final estimator.

Return type:

ndarray

fit_transform(X, y=None, **fit_params)#

Fit the pipeline and transform the data.

Parameters:

X (pd.DataFrame) – Input data to fit.
y (pd.DataFrame) – Target values.
fit_params (dict) – Additional fitting parameters.

Returns:

The transformed feature data, and optionally target data.

Return type:

pd.DataFrame or Tuple[pd.DataFrame, pd.DataFrame]

get_feature_names_out(input_features=None)#

Get output feature names for transformation.

Transform input features using the pipeline.

Parameters:: input_features (array-like of str or None, default=None) – Input features.
Returns:: feature_names_out – Transformed feature names.
Return type:: ndarray of str objects

get_metadata_routing()#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:: routing – A MetadataRouter encapsulating routing information.
Return type:: MetadataRouter

get_params(deep=True)#

Get parameters for this estimator.

Returns the parameters given in the constructor as well as the estimators contained within the steps of the Pipeline.

Parameters:: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:: params – Parameter names mapped to their values.
Return type:: mapping of string to any

inverse_transform(Xt=None, yt=None)#

Apply the inverse transformations of the pipeline steps to the data, in reverse step order.

Parameters:

Xt (pd.DataFrame) – Transformed feature data to inverse transform.
yt (pd.DataFrame) – Transformed target values.

Returns:

Original feature data and target values.

Return type:

Tuple[pd.DataFrame, pd.DataFrame]
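
Why the order is reversed can be seen in a toy sketch that does not use the library at all. The Shift and Scale classes and the two helper functions are hypothetical stand-ins for pipeline steps that transform features and target together.

```python
class Shift:
    """Toy step: adds a constant to features and target."""

    def __init__(self, amount):
        self.amount = amount

    def transform(self, X, y):
        return [v + self.amount for v in X], [v + self.amount for v in y]

    def inverse_transform(self, X, y):
        return [v - self.amount for v in X], [v - self.amount for v in y]


class Scale:
    """Toy step: multiplies features and target by a constant."""

    def __init__(self, factor):
        self.factor = factor

    def transform(self, X, y):
        return [v * self.factor for v in X], [v * self.factor for v in y]

    def inverse_transform(self, X, y):
        return [v / self.factor for v in X], [v / self.factor for v in y]


def run_forward(steps, X, y):
    """Apply each step in the order given."""
    for step in steps:
        X, y = step.transform(X, y)
    return X, y


def run_inverse(steps, Xt, yt):
    """Undo the steps, last one first."""
    for step in reversed(steps):
        Xt, yt = step.inverse_transform(Xt, yt)
    return Xt, yt


steps = [Shift(10.0), Scale(2.0)]
Xt, yt = run_forward(steps, [1.0, 2.0], [3.0])
X_back, y_back = run_inverse(steps, Xt, yt)
```

Undoing the steps in forward order would subtract the shift before dividing by the scale and would not return the original values.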

property n_features_in_#: Number of features seen during first step fit method.

property named_steps#

Access the steps by name.

Read-only attribute to access any step by given name. Keys are steps names and values are the steps objects.

predict(X, **params)#

Transform the data, and apply predict with the final estimator.

Call transform of each transformer in the pipeline. The transformed data are finally passed to the final estimator that calls predict method. Only valid if the final estimator implements predict.

Parameters:

X (iterable) – Data to predict on. Must fulfill input requirements of first step of the pipeline.
**params (dict of str -> object) –
- If enable_metadata_routing=False (default):
  
  Parameters to the predict called at the end of all transformations in the pipeline.
- If enable_metadata_routing=True:
  
  Parameters requested and accepted by steps. Each step must have requested certain metadata for these parameters to be forwarded to them.
New in version 0.20.

Changed in version 1.4: Parameters are now passed to the transform method of the intermediate steps as well, if requested, and if enable_metadata_routing=True is set via set_config().

See Metadata Routing User Guide for more details.

Note that while this may be used to return uncertainties from some models with return_std or return_cov, uncertainties that are generated by the transformations in the pipeline are not propagated to the final estimator.

Returns:

y_pred – Result of calling predict on the final estimator.

Return type:

ndarray

predict_log_proba(X, **params)#

Transform the data, and apply predict_log_proba with the final estimator.

Call transform of each transformer in the pipeline. The transformed data are finally passed to the final estimator that calls predict_log_proba method. Only valid if the final estimator implements predict_log_proba.

Parameters:

X (iterable) – Data to predict on. Must fulfill input requirements of first step of the pipeline.
**params (dict of str -> object) –
- If enable_metadata_routing=False (default):
  
  Parameters to the predict_log_proba called at the end of all transformations in the pipeline.
- If enable_metadata_routing=True:
  
  Parameters requested and accepted by steps. Each step must have requested certain metadata for these parameters to be forwarded to them.
New in version 0.20.

Changed in version 1.4: Parameters are now passed to the transform method of the intermediate steps as well, if requested, and if enable_metadata_routing=True.

See Metadata Routing User Guide for more details.

Returns:

y_log_proba – Result of calling predict_log_proba on the final estimator.

Return type:

ndarray of shape (n_samples, n_classes)

predict_proba(X, **params)#

Transform the data, and apply predict_proba with the final estimator.

Call transform of each transformer in the pipeline. The transformed data are finally passed to the final estimator that calls predict_proba method. Only valid if the final estimator implements predict_proba.

Parameters:

X (iterable) – Data to predict on. Must fulfill input requirements of first step of the pipeline.
**params (dict of str -> object) –
- If enable_metadata_routing=False (default):
  
  Parameters to the predict_proba called at the end of all transformations in the pipeline.
- If enable_metadata_routing=True:
  
  Parameters requested and accepted by steps. Each step must have requested certain metadata for these parameters to be forwarded to them.
New in version 0.20.

Changed in version 1.4: Parameters are now passed to the transform method of the intermediate steps as well, if requested, and if enable_metadata_routing=True.

See Metadata Routing User Guide for more details.

Returns:

y_proba – Result of calling predict_proba on the final estimator.

Return type:

ndarray of shape (n_samples, n_classes)

score(X, y=None, sample_weight=None, **params)#

Transform the data, and apply score with the final estimator.

Call transform of each transformer in the pipeline. The transformed data are finally passed to the final estimator that calls score method. Only valid if the final estimator implements score.

Parameters:

X (iterable) – Data to predict on. Must fulfill input requirements of first step of the pipeline.
y (iterable, default=None) – Targets used for scoring. Must fulfill label requirements for all steps of the pipeline.
sample_weight (array-like, default=None) – If not None, this argument is passed as sample_weight keyword argument to the score method of the final estimator.
**params (dict of str -> object) –
Parameters requested and accepted by steps. Each step must have requested certain metadata for these parameters to be forwarded to them.

New in version 1.4: Only available if enable_metadata_routing=True. See Metadata Routing User Guide for more details.

Returns:

score – Result of calling score on the final estimator.

Return type:

float

score_samples(X)#

Transform the data, and apply score_samples with the final estimator.

Call transform of each transformer in the pipeline. The transformed data are finally passed to the final estimator that calls score_samples method. Only valid if the final estimator implements score_samples.

Parameters:: X (iterable) – Data to predict on. Must fulfill input requirements of first step of the pipeline.
Returns:: y_score – Result of calling score_samples on the final estimator.
Return type:: ndarray of shape (n_samples,)

set_output(*, transform=None)#

Set the output container when “transform” and “fit_transform” are called.

Calling set_output will set the output of all estimators in steps.

Parameters:

transform ({"default", "pandas", "polars"}, default=None) –

Configure output of transform and fit_transform.

"default": Default output format of a transformer
"pandas": DataFrame output
"polars": Polars output
None: Transform configuration is unchanged

New in version 1.4: “polars” option was added.

Returns:

self – Estimator instance.

Return type:

estimator instance

set_params(**kwargs)#

Set the parameters of this estimator.

Valid parameter keys can be listed with get_params(). Note that you can directly set the parameters of the estimators contained in steps.

Parameters:: **kwargs (dict) – Parameters of this estimator or parameters of estimators contained in steps. Parameters of the steps may be set using its name and the parameter name separated by a ‘__’.
Returns:: self – Pipeline class instance.
Return type:: object

set_score_request(*, sample_weight='$UNCHANGED$')#

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

transform(X, y=None)#

Apply transforms to the data, and the transform method of the final estimator.

Parameters:

X (pd.DataFrame) – Input data to transform.
y (pd.DataFrame (optional)) – Target values.

Returns:

Transformed feature data.

Return type:

Tuple[pd.DataFrame, pd.DataFrame]

PointsModel Module#

class remodels.pointsModels.PointModel.PointModel(pipeline, variables_per_hour={}, y_column='price_da')#

PointModel is a time-series prediction model designed to forecast electricity prices or similar data.

This model is equipped with a flexible data processing pipeline and the ability to handle different sets of predictor variables for different hours of the day. It offers functionality for model training, prediction with a rolling window approach, and calculation of various evaluation metrics.

Initialize the PointModel with a data processing pipeline, variables mapped to each hour, and the target column name.

Parameters:

pipeline (RePipeline) – Sequence of data transformation steps and a predictive model.
variables_per_hour (dict) – Mapping from hour ranges to the variables to be used in those hours.
y_column (str) – The name of the target column.

calculate_metrics(y_true, y_pred)#

Calculate regression metrics.

Parameters:

y_true (pd.DataFrame) – DataFrame containing the actual data.
y_pred (pd.DataFrame) – DataFrame containing the predicted data.

Returns:

dict of calculated regression metrics

Return type:

dict

fit(df, start, end)#

Fit the model with the training data.

Parameters:

df (pd.DataFrame) – DataFrame containing the training data.
start (str) – Start of the fitting period.
end (str) – End of the fitting period.

fit_transform_data(Xy, is_train=True)#

Fit the transformation pipeline to the data and transform it if is_train is True, otherwise, only transform the data.

Parameters:

Xy (pd.DataFrame) – DataFrame containing features and target to be transformed.
is_train (bool) – Flag to indicate whether to fit the transformer or not.

Returns:

Transformed features and optionally transformed target.

Return type:

tuple or pd.DataFrame

get_hour_variables(hour)#

Retrieve the variables associated with a specific hour based on defined hour ranges.

Parameters:: hour (int) – The hour for which variables are needed.
Returns:: List of variables associated with the given hour.
Return type:: list
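
The lookup described above can be illustrated with a short sketch. It is not the library's implementation: the key format (inclusive `(first_hour, last_hour)` tuples), the helper name `lookup_hour_variables`, and the column names are all assumptions made for this example.

```python
def lookup_hour_variables(variables_per_hour, hour):
    """Return the variables whose (hypothetical) inclusive hour range contains `hour`."""
    for (first_hour, last_hour), variables in variables_per_hour.items():
        if first_hour <= hour <= last_hour:
            return list(variables)
    # No range matched: fall back to an empty variable list.
    return []


# Hypothetical mapping: night hours use one predictor, day hours use two.
variables_per_hour = {
    (0, 7): ["price_lag_1"],
    (8, 23): ["price_lag_1", "load_forecast"],
}

print(lookup_hour_variables(variables_per_hour, 3))
print(lookup_hour_variables(variables_per_hour, 12))
```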

predict(calibration_window=728, inverse_predictions=True)#

Predict values over a given range, from start to end, using a rolling window, and store/update predictions in the model.

Parameters:

df (pd.DataFrame) – DataFrame containing the data to be used for prediction.
calibration_window (int) – Number of days to look back for training data.
inverse_predictions (bool) – Flag to determine whether to apply inverse transformation to predictions.

Returns:

DataFrame of predicted values.

Return type:

pd.DataFrame

separate_columns_by_dtype(df)#

Separate columns in a DataFrame by data type (float vs non-float).

Parameters:: df (pandas.DataFrame) – DataFrame to separate columns from.
Returns:: Lists of float columns and non-float columns.
Return type:: tuple

set_unique_hours(dates)#

Set the unique hours for the model based on the provided datetime data.

Parameters:: dates (pd.Series) – Datetime data to extract unique hours from.

summary()#

Generate a summary comparing stored predictions with actual values from the training data.

Returns:: DataFrame with summary metrics.
Return type:: pd.DataFrame

train_and_predict_hours(day, Xy_train, Xy_test, predictions_list, inverse_predictions)#

Train the model and make predictions for each hour in the unique_hours, and store the predictions in a list.

Parameters:

day (pd.Timestamp) – The day for which predictions are made.
Xy_train (pd.DataFrame) – Training data.
Xy_test (pd.DataFrame) – Testing data.
predictions_list (list) – List to store the predictions.
inverse_predictions (bool) – Flag to determine whether to apply inverse transformation to predictions.

QRA Models Module#

class remodels.qra.qra.QRA(quantile=0.5, fit_intercept=False)#

A class that represents the QRA model.

The QRA model is a simple quantile regression model. Fitting a quantile regression model involves solving a minimization problem:

\[\hat{\beta_k} = \underset{\beta \in \mathbb{R}^n}{\operatorname{argmin}} \left\{ \sum_{i=1}^{t} \rho_k (Y_i - X_i \beta) \right\}\]

where $\rho_k$ is given by:

\[\rho_k (e) = e \left( k - \mathbb{1}_{(e < 0)} \right)\]

and $k$ is the fixed quantile.
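
To make the loss concrete, here is a minimal pure-Python sketch of $\rho_k$ and of the summed objective. The helper names (`pinball_loss`, `total_loss`) and the toy data are assumptions for illustration; the grid search stands in for a proper solver and says nothing about how the package actually minimizes the objective.

```python
def pinball_loss(e, k):
    """Check (pinball) loss rho_k(e) = e * (k - 1{e < 0})."""
    indicator = 1.0 if e < 0 else 0.0
    return e * (k - indicator)


def total_loss(y, X, beta, k):
    """Sum of pinball losses of the residuals Y_i - X_i . beta."""
    total = 0.0
    for y_i, x_i in zip(y, X):
        fitted = sum(x_ij * b_j for x_ij, b_j in zip(x_i, beta))
        total += pinball_loss(y_i - fitted, k)
    return total


# Toy data with a single constant regressor, so beta[0] acts as a quantile estimate.
y = [1.0, 2.0, 3.0, 4.0, 10.0]
X = [[1.0] for _ in y]

# Coarse grid search over beta[0]; for k = 0.5 the minimizer is the sample median.
candidates = [c / 10 for c in range(0, 121)]
best = min(candidates, key=lambda c: total_loss(y, X, [c], 0.5))
print(best)  # 3.0
```

With $k = 0.5$ the loss is proportional to the absolute error, which is why the minimizer on this toy sample is the median rather than the mean.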

Initialize the QRA model.

Parameters:

quantile (float) – quantile
fit_intercept (bool, optional) – True if fit intercept in model, defaults to False

fit(X, y)#

Fit the model to the data.

Parameters:

X (np.array) – input matrix
y (np.array) – dependent variable

Returns:

fitted model

Return type:

QRA

predict(X)#

Predict the dependent variable.

Parameters:: X (np.array) – input matrix
Returns:: prediction
Return type:: np.array

class remodels.qra.qrm.QRM(quantile=0.5, fit_intercept=False)#

A class that represents the QRM model.

In the QRM model, the average of the input variables is first calculated. This average is used to fit the quantile regression model.
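
The averaging step can be sketched as follows, assuming that "the average of the input variables" means the mean across the columns of each observation (for example, across several point forecasts). The function name `average_inputs` is hypothetical.

```python
def average_inputs(X):
    """Collapse each row of X (one row per observation) to the mean of its columns."""
    return [[sum(row) / len(row)] for row in X]


# Two observations, each with three input variables.
X = [[40.0, 42.0, 44.0], [50.0, 55.0, 60.0]]
X_avg = average_inputs(X)
print(X_avg)  # [[42.0], [55.0]]
```

The resulting single-column matrix would then play the role of the regressor in the quantile regression.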

Initialize the QRM model.

Parameters:

quantile (float) – quantile
fit_intercept (bool, optional) – True if fit intercept in model, defaults to False

fit(X, y)#

Fit the model to the data.

Parameters:

X (np.array) – input matrix
y (np.array) – dependent variable

Returns:

fitted model

Return type:

QRM

class remodels.qra.lqra.LQRA(quantile=0.5, lambda_=0.0, fit_intercept=False)#

A class that represents the LQRA model.

The LQRA model is a quantile regression model with an L1 (LASSO) penalty on the coefficients added to the loss function:

\[\hat{\beta_k} = \underset{\beta \in \mathbb{R}^n}{\operatorname{argmin}} \left\{ \sum_{i=1}^{t} \rho_k (Y_i - X_i \beta) + \lambda \sum_{i=1}^{n} |\beta_i| \right\}\]

where $\lambda$ is a regularization parameter.
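
The penalized objective can be evaluated directly, as in this minimal sketch. The helper names and the toy numbers are assumptions for illustration, and the sketch only computes the objective value; it does not show how the package solves the minimization.

```python
def pinball_loss(e, k):
    """Check (pinball) loss rho_k(e) = e * (k - 1{e < 0})."""
    return e * (k - (1.0 if e < 0 else 0.0))


def lqra_objective(y, X, beta, k, lam):
    """Pinball loss of the residuals plus lam times the L1 norm of beta."""
    fit_term = 0.0
    for y_i, x_i in zip(y, X):
        fitted = sum(x_ij * b_j for x_ij, b_j in zip(x_i, beta))
        fit_term += pinball_loss(y_i - fitted, k)
    penalty = lam * sum(abs(b_j) for b_j in beta)
    return fit_term + penalty


y = [1.0, 2.0]
X = [[1.0, 0.0], [0.0, 1.0]]
beta = [1.0, -2.0]

print(lqra_objective(y, X, beta, k=0.5, lam=0.0))  # 2.0 (no penalty)
print(lqra_objective(y, X, beta, k=0.5, lam=0.5))  # 3.5 (penalty adds 0.5 * 3)
```

Setting `lam` to zero recovers the unpenalized QRA objective; larger values make non-zero coefficients more expensive.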

Initialize the LQRA model.

Parameters:

quantile (float) – quantile
fit_intercept (bool, optional) – True if fit intercept in model, defaults to False
lambda_ (float) – LASSO regularization parameter

fit(X, y)#

Fit the model to the data.

Parameters:

X (np.array) – input matrix
y (np.array) – dependent variable

Returns:

fitted model

Return type:

LQRA

class remodels.qra.fqra.FQRA(quantile=0.5, n_factors=None, fit_intercept=False)#

A class that represents the FQRA model.

In the FQRA model, factors summarizing the information in the input variables are estimated with the PCA method. The selected number of factors is then used to fit the quantile regression model.

Initialize the FQRA model.

Parameters:

quantile (float) – quantile
n_factors (int) – number of factors (principal components) to use; if None, number of factors is selected automatically using Bayesian information criterion
fit_intercept (bool, optional) – True if fit intercept in model, defaults to False

fit(X, y)#

Fit the model to the data.

Parameters:

X (np.array) – input matrix
y (np.array) – dependent variable

Returns:

fitted model

Return type:

FQRA

class remodels.qra.fqrm.FQRM(quantile=0.5, n_factors=None, fit_intercept=False)#

A class that represents the FQRM model.

In the FQRM model, factors summarizing the information in the input variables are estimated with the PCA method. The average of the calculated factors is then used to fit the quantile regression model.

Initialize the FQRM model.

Parameters:

quantile (float) – quantile
n_factors (int) – number of factors (principal components) to use; if None, number of factors is selected automatically using Bayesian information criterion
fit_intercept (bool, optional) – True if fit intercept in model, defaults to False

fit(X, y)#

Fit the model to the data.

Parameters:

X (np.array) – input matrix
y (np.array) – dependent variable

Returns:

fitted model

Return type:

FQRM

class remodels.qra.sfqra.sFQRA(quantile=None, n_factors=None, fit_intercept=False)#

A class that represents the sFQRA model.

The sFQRA model is an FQRA model where the input variables are standardized by subtracting the mean and dividing by the standard deviation (calculated across rows).
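
One possible reading of "calculated across rows" is that each input variable (column) is standardized with the mean and standard deviation taken over all observations (rows). The sketch below follows that reading; both this interpretation and the use of the population standard deviation are assumptions, and `standardize_columns` is a hypothetical helper.

```python
import statistics


def standardize_columns(X):
    """Subtract each column's mean and divide by its standard deviation."""
    n_cols = len(X[0])
    columns = [[row[j] for row in X] for j in range(n_cols)]
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]  # population std (assumption)
    return [[(row[j] - means[j]) / stds[j] for j in range(n_cols)] for row in X]


X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
Z = standardize_columns(X)
for row in Z:
    print([round(value, 4) for value in row])
```

After this step both columns are on the same scale, so the factors extracted by PCA are not dominated by the variable with the largest magnitude.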

Initialize the sFQRA model.

Parameters:

quantile (float) – quantile
n_factors (int) – number of factors (principal components) to use; if None, number of factors is selected automatically using Bayesian information criterion
fit_intercept (bool, optional) – True if fit intercept in model, defaults to False

fit(X, y)#

Fit the model to the data.

Parameters:

X (np.array) – input matrix
y (np.array) – dependent variable

Returns:

fitted model

Return type:

sFQRA

class remodels.qra.sfqrm.sFQRM(quantile=None, n_factors=None, fit_intercept=False)#

A class that represents the sFQRM model.

The sFQRM model is an FQRM model where the input variables are standardized by subtracting the mean and dividing by the standard deviation (calculated across rows).

Initialize the sFQRM model.

Parameters:

quantile (float) – quantile
n_factors (int) – number of factors (principal components) to use; if None, number of factors is selected automatically using Bayesian information criterion
fit_intercept (bool, optional) – True if fit intercept in model, defaults to False

fit(X, y)#

Fit the model to the data.

Parameters:

X (np.array) – input matrix
y (np.array) – dependent variable

Returns:

fitted model

Return type:

sFQRM

class remodels.qra.sqra.SQRA(quantile=0.5, H=None, fit_intercept=False)#

A class that represents the SQRA model.

The SQRA model is a QRA model with a loss function smoothed by a kernel density estimator:

\[\hat{\beta_k} = \underset{\beta \in \mathbb{R}^n}{\operatorname{argmin}} \left\{ \sum_{i=1}^{t} \left( H \cdot \phi \left( \frac{Y_i - X_i \beta}{H} \right) + \left( k - \Phi \left( - \frac{Y_i - X_i \beta}{H}\right) \right) \left( Y_i - X_i \beta \right) \right) \right\}\]

where $H$ is a bandwidth parameter.
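
The smoothed loss for a single residual can be written out with the standard library, taking $\phi$ and $\Phi$ to be the standard normal density and distribution function (an assumption consistent with the usual notation, but not stated explicitly above). The function names are hypothetical.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def smoothed_loss(e, k, H):
    """H * phi(e / H) + (k - Phi(-e / H)) * e for a residual e = Y_i - X_i . beta."""
    return H * norm_pdf(e / H) + (k - norm_cdf(-e / H)) * e


def pinball_loss(e, k):
    """Unsmoothed check loss, for comparison."""
    return e * (k - (1.0 if e < 0 else 0.0))


# As H shrinks, the smoothed loss approaches the pinball loss.
for H in (1.0, 0.1, 0.001):
    print(H, smoothed_loss(2.0, 0.9, H), pinball_loss(2.0, 0.9))
```

Unlike the pinball loss, the smoothed version is differentiable at zero, which is the practical motivation for the kernel smoothing.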

Initialize the SQRA model.

Parameters:

quantile (float) – quantile
fit_intercept (bool, optional) – True if fit intercept in model, defaults to False
H (float) – smoothing parameter called the bandwidth; must be a positive real number; if None, it is automatically estimated using Scott’s (or Silverman’s) rule-of-thumb

fit(X, y)#

Fit the model to the data.

Parameters:

X (np.array) – input matrix
y (np.array) – dependent variable

Returns:

fitted model

Return type:

SQRA

class remodels.qra.sqrm.SQRM(quantile=0.5, H=None, fit_intercept=False)#

A class that represents the SQRM model.

In the SQRM model, the average of the input variables is first calculated. This average is used to fit the SQRA model.

Initialize the SQRM model.

Parameters:

quantile (float) – quantile
fit_intercept (bool, optional) – True if fit intercept in model, defaults to False
H (float) – smoothing parameter called the bandwidth

fit(X, y)#

Fit the model to the data.

Parameters:

X (np.array) – input matrix
y (np.array) – dependent variable

Returns:

fitted model

Return type:

SQRM

QR Tester SubModule#

class remodels.qra.tester.qr_tester.QR_Tester(calibration_window=168, prediction_window=24, multivariate=True, qr_model=<remodels.qra.qra.QRA object>, max_workers=None, progress=True)#

QR Tester is a class for testing QR models.

The QR Tester class is designed to obtain probabilistic predictions from a given QR model.

The QR model is fitted to the data portion specified by the calibration window. Then, the QR model predictions for all percentiles are calculated. The process is repeated for subsequent portions of data.
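
The rolling scheme described above can be sketched as an index generator. This is only an illustration: it assumes the window advances by `prediction_window` observations at each step, which the description does not state explicitly, and `rolling_windows` is a hypothetical helper rather than part of the package.

```python
def rolling_windows(n_samples, calibration_window=168, prediction_window=24):
    """Yield (calibration_indices, prediction_indices) pairs for a rolling scheme."""
    start = 0
    while start + calibration_window + prediction_window <= n_samples:
        split = start + calibration_window
        calibration = range(start, split)
        prediction = range(split, split + prediction_window)
        yield calibration, prediction
        # Assumption: slide forward by one prediction window.
        start += prediction_window


# 240 hourly observations give three calibration/prediction pairs.
windows = list(rolling_windows(240))
for calibration, prediction in windows:
    print(calibration, prediction)
```

In each step a QR model would be fitted on the calibration indices and used to predict every percentile for the prediction indices.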

Initialize the QR Tester.

Parameters:

calibration_window (int, optional) – length of calibration window, defaults to 7 * 24
prediction_window (int, optional) – length of prediction window, defaults to 24
qr_model (QRA, optional) – QR model, defaults to QRA(fit_intercept=True)
max_workers (int, optional) – process pool executor max workers, defaults to None
multivariate (bool) –
progress (bool) –

fit_predict(X, y)#

Run QR Tester to obtain probabilistic predictions wrapped in special results class.

Parameters:

X (np.array) – data matrix
y (np.array) – endogenous variable

Returns:

QR_TestResults object

Return type:

QR_TestResults

class remodels.qra.tester.qr_tester.QR_TestResults(Y_pred, y_test, prediction_window)#

A class that wraps probabilistic predictions.

The QR Test Results class allows you to calculate metric values for these predictions.

Initialize the QR Test Results class.

Parameters:

Y_pred (np.array) – matrix of predictions
y_test (np.array) – endogenous variable
prediction_window (int) – length of prediction window

aec(alpha)#

Average empirical coverage.

Parameters:: alpha (int) – length of prediction interval
Returns:: average empirical coverage value
Return type:: float
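
Empirical coverage can be sketched as the share of observations that fall inside the prediction interval. In this sketch, reading `alpha` as a nominal coverage in percent, with a centred interval spanning the `(100 - alpha) / 2` and `(100 + alpha) / 2` percentiles, is an assumption; the helper names are hypothetical and not taken from the package.

```python
def interval_percentiles(alpha):
    """Lower and upper percentile of a centred interval with nominal coverage alpha (%)."""
    return (100 - alpha) / 2, (100 + alpha) / 2


def empirical_coverage(y_true, lower, upper):
    """Fraction of observations lying inside [lower, upper]."""
    hits = [lo <= y <= up for y, lo, up in zip(y_true, lower, upper)]
    return sum(hits) / len(hits)


print(interval_percentiles(90))  # (5.0, 95.0)

y_true = [1.0, 2.0, 3.0, 4.0]
lower = [0.0, 0.0, 4.0, 3.0]
upper = [2.0, 1.0, 5.0, 5.0]
print(empirical_coverage(y_true, lower, upper))  # 0.5
```

A well-calibrated interval has an empirical coverage close to its nominal coverage.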

aps()#

Aggregate pinball score.

Returns:: aggregate pinball score value
Return type:: float
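
A pinball score aggregated over quantiles and observations can be sketched as below. The layout of the prediction matrix (one row per observation, one column per quantile) and the use of a plain average are assumptions for illustration; the function names are hypothetical.

```python
def pinball_loss(e, k):
    """Check (pinball) loss rho_k(e) = e * (k - 1{e < 0})."""
    return e * (k - (1.0 if e < 0 else 0.0))


def aggregate_pinball_score(y_true, Y_pred, quantiles):
    """Average pinball loss over all observations and all quantile levels."""
    total = 0.0
    for y, preds in zip(y_true, Y_pred):
        for k, pred in zip(quantiles, preds):
            total += pinball_loss(y - pred, k)
    return total / (len(y_true) * len(quantiles))


# One observation with predictions for the 10%, 50% and 90% quantiles.
y_true = [10.0]
Y_pred = [[8.0, 10.0, 13.0]]
quantiles = [0.1, 0.5, 0.9]
score = aggregate_pinball_score(y_true, Y_pred, quantiles)
print(round(score, 4))  # 0.1667
```

Lower scores indicate sharper and better calibrated quantile predictions.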

aps_extreme_quantiles(n_quantiles)#

Aggregate pinball score for n extreme quantiles.

Parameters:: n_quantiles (int) – number of leftmost and rightmost quantiles
Returns:: aggregate pinball score computed for extreme quantiles
Return type:: float

christoffersen_test(alpha, significance_level=0.05)#

Christoffersen test. Count the number of times the null hypothesis is not rejected.

Parameters:

alpha (int) – length of prediction interval
significance_level (float, optional) – test significance level, defaults to 0.05

Returns:

number of hours for which the null hypothesis is not rejected

Return type:

int

ec_h(alpha)#

Empirical coverage per ‘hour’.

Parameters:: alpha (int) – length of prediction interval
Returns:: empirical coverage per hour values
Return type:: np.array

ec_mad(alpha)#

Empirical coverage per ‘hour’ - mean absolute deviation.

Parameters:: alpha (int) – length of prediction interval
Returns:: mean absolute deviation of empirical coverage per hour values
Return type:: float

kupiec_test(alpha, significance_level=0.05)#

Kupiec test. Count the number of times the null hypothesis is not rejected.

Parameters:

alpha (int) – length of prediction interval
significance_level (float, optional) – test significance level, defaults to 0.05

Returns:

number of hours for which the null hypothesis is not rejected

Return type:

int
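
For orientation, the textbook Kupiec unconditional coverage test can be sketched with the standard library alone. This is not the package's code: the helper names are hypothetical, and the sketch handles a single series of interval hits, whereas the method above applies the test per hour and counts the non-rejections. The statistic is a likelihood ratio that is asymptotically chi-square distributed with one degree of freedom.

```python
import math


def _xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def kupiec_test(hits, nominal_coverage):
    """Return (LR statistic, p-value) for H0: hit probability == nominal_coverage."""
    n = len(hits)
    x = sum(hits)  # number of observations inside the interval
    p_hat = x / n
    log_lik_null = _xlogy(x, nominal_coverage) + _xlogy(n - x, 1 - nominal_coverage)
    log_lik_alt = _xlogy(x, p_hat) + _xlogy(n - x, 1 - p_hat)
    lr = max(-2.0 * (log_lik_null - log_lik_alt), 0.0)
    # Survival function of a chi-square variable with one degree of freedom.
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value


good_hits = [True] * 90 + [False] * 10  # 90% coverage for a 90% interval
bad_hits = [True] * 50 + [False] * 50   # 50% coverage for a 90% interval

for hits in (good_hits, bad_hits):
    lr, p_value = kupiec_test(hits, 0.9)
    print(lr, p_value, "not rejected" if p_value >= 0.05 else "rejected")
```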

class remodels.qra.tester.qr_results_summary.QR_ResultsSummary(results_dict)#

A class to summarize QR Test Results.

Initialize the QR Results Summary class.

Parameters:: results_dict (Dict[str, Dict[str, QR_TestResults]]) – dictionary of results by dataset and QR variant

aec(alpha_list)#

Average empirical coverage.

Parameters:: alpha_list (List[int]) – list of prediction interval lengths
Returns:: aec summary
Return type:: pd.DataFrame

aps()#

Aggregate pinball score.

Returns:: aggregate pinball score summary
Return type:: pd.DataFrame

aps_extreme_quantiles(n_quantiles)#

Aggregate pinball score for n extreme quantiles.

Parameters:: n_quantiles (int) – number of leftmost and rightmost quantiles
Returns:: aggregate pinball score computed for extreme quantiles summary
Return type:: pd.DataFrame

kupiec_test(alpha_list)#

Kupiec test.

Parameters:: alpha_list (List[int]) – list of prediction interval lengths
Returns:: Kupiec test summary
Return type:: pd.DataFrame
