Codes
sysidentpy base
Base classes for NARX estimator.
-
class
sysidentpy.base.
GenerateRegressors
[source]¶ Polynomial NARMAX model
Provides the main functions to generate the regressor dictionary and regressor codes for polynomial basis.
-
regressor_space
(non_degree, xlag, ylag, n_inputs)[source]¶ Create the code representation of the regressors.
This function generates a codification of all possible regressors given the maximum lags of the input and output. The codes are used to write the final terms of the model in a readable form, e.g. [1001] -> y(k-1). This code format is based on a dissertation from UFMG. See the reference below.
- Parameters
non_degree (int) – The desired maximum nonlinearity degree.
ylag (int) – The maximum lag of output regressors.
xlag (int) – The maximum lag of input regressors.
- Returns
max_lag (int) – This value can be used by other functions.
regressor_code (ndarray of int) – Matrix codification of all possible regressors.
Examples
The codification is defined as:
>>> 100n = y(k-n)
>>> 200n = u(k-n)
>>> [100n 100n] = y(k-n)y(k-n)
>>> [200n 200n] = u(k-n)u(k-n)
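The codification above can be illustrated with a small decoder. This is only a sketch: `decode_code` and `decode_term` are hypothetical helpers, not part of sysidentpy, and treating a zero entry as padding for lower-degree terms is an assumption.

```python
def decode_code(code):
    """Translate one integer code, e.g. 1001 -> 'y(k-1)'."""
    kind, lag = divmod(code, 1000)  # leading digit: variable, remainder: lag
    if kind == 1:
        return f"y(k-{lag})"
    if kind == 2:
        return f"u(k-{lag})"
    raise ValueError(f"unsupported regressor code: {code}")


def decode_term(codes):
    """Join the factors of one model term, skipping zeros (assumed padding)."""
    return "".join(decode_code(c) for c in codes if c != 0)


print(decode_term([1001]))        # y(k-1)
print(decode_term([2002, 1001]))  # u(k-2)y(k-1)
```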
References
- [1] Master's thesis: Barbosa, Alípio Monteiro.
Técnicas de otimização bi-objetivo para a determinação da estrutura de modelos NARX (2010).
-
-
class
sysidentpy.base.
InformationMatrix
[source]¶ Class for methods regarding preprocessing of columns
-
shift_column
(col_to_shift, lag)[source]¶ Shift values based on a lag.
- Parameters
col_to_shift (array-like of shape = n_samples) – The samples of the input or output.
lag (int) – The respective lag of the regressor.
- Returns
tmp_column – The shifted array of the input or output.
- Return type
array-like of shape = n_samples
Examples
>>> y = [1, 2, 3, 4, 5]
>>> shift_column(y, 1)
[0, 1, 2, 3, 4]
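The behaviour shown in the example (delay the samples by `lag` and fill the start with zeros) can be reproduced with NumPy. The function below is a hypothetical stand-in written for illustration, not the library's implementation.

```python
import numpy as np


def shift_column_sketch(col_to_shift, lag):
    """Delay a 1-D signal by `lag` samples, padding the start with zeros."""
    col = np.asarray(col_to_shift, dtype=float).ravel()
    shifted = np.zeros_like(col)
    if lag < len(col):
        shifted[lag:] = col[:len(col) - lag]
    return shifted


shifted = shift_column_sketch([1, 2, 3, 4, 5], 1)
print(shifted)  # [0. 1. 2. 3. 4.]
```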
-
initial_lagged_matrix
(X, y, xlag, ylag)[source]¶ Build a lagged matrix concerning each lag for each column.
- Parameters
model (ndarray of int) – The model code representation.
y (array-like) – Target data used on training phase.
X (array-like) – Input data used on training phase.
ylag (int) – The maximum lag of output regressors.
xlag (int) – The maximum lag of input regressors.
- Returns
lagged_data – The lagged matrix built with respect to each lag and column.
- Return type
ndarray of floats
Examples
Let X and y be the input and output values of shape Nx1. If the chosen lags are 2 for both input and output the initial lagged matrix will be formed by Y[k-1], Y[k-2], X[k-1], and X[k-2].
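A minimal sketch of that layout, assuming zero padding for the first samples and the column order Y[k-1], Y[k-2], X[k-1], X[k-2]. The helper names are hypothetical, and whether the library drops the first rows is not stated here.

```python
import numpy as np


def shift(col, lag):
    """Delay a 1-D signal by `lag` samples, padding with zeros."""
    col = np.asarray(col, dtype=float).ravel()
    out = np.zeros_like(col)
    if lag < len(col):
        out[lag:] = col[:len(col) - lag]
    return out


def lagged_matrix_sketch(X, y, xlag, ylag):
    """Stack the output lags first, then the input lags, as columns."""
    y_cols = [shift(y, k) for k in range(1, ylag + 1)]
    x_cols = [shift(X, k) for k in range(1, xlag + 1)]
    return np.column_stack(y_cols + x_cols)


lagged = lagged_matrix_sketch(X=[10, 20, 30, 40], y=[1, 2, 3, 4], xlag=2, ylag=2)
print(lagged[3])  # [ 3.  2. 30. 20.]
```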
-
build_information_matrix
(X, y, xlag, ylag, non_degree)[source]¶ Build the information matrix.
Each column of the information matrix represents a candidate regressor. The set of candidate regressors is based on the xlag, ylag, and non_degree values entered by the user.
- Parameters
model (ndarray of int) – The model code representation.
y (array-like) – Target data used on training phase.
X (array-like) – Input data used on training phase.
ylag (int) – The maximum lag of output regressors.
xlag (int) – The maximum lag of input regressors.
non_degree (int) – The desired maximum nonlinearity degree.
- Returns
lagged_data – The lagged matrix built with respect to each lag and column.
- Return type
ndarray of floats
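One way to picture the candidate regressors is as all monomials of the lagged columns up to the chosen nonlinearity degree. The sketch below assumes a leading constant column and a particular column order; both are illustrative choices, and `polynomial_candidates` is a hypothetical name.

```python
from itertools import combinations_with_replacement

import numpy as np


def polynomial_candidates(lagged, degree):
    """Build every product of lagged columns up to `degree` (constant first)."""
    lagged = np.asarray(lagged, dtype=float)
    n_samples, n_cols = lagged.shape
    columns = [np.ones(n_samples)]  # constant term (assumed)
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(range(n_cols), d):
            columns.append(np.prod(lagged[:, list(combo)], axis=1))
    return np.column_stack(columns)


# Two lagged columns a, b -> candidates: 1, a, b, a*a, a*b, b*b
psi = polynomial_candidates([[1, 2], [3, 4]], degree=2)
print(psi[1])  # [ 1.  3.  4.  9. 12. 16.]
```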
-
sysidentpy narmax
Build Polynomial NARMAX Models
-
class
sysidentpy.polynomial_basis.narmax.
PolynomialNarmax
(non_degree=2, ylag=2, xlag=2, order_selection=False, info_criteria='aic', n_terms=None, n_inputs=1, n_info_values=10, estimator='recursive_least_squares', extended_least_squares=True, aux_lag=1, lam=0.98, delta=0.01, offset_covariance=0.2, mu=0.01, eps=2.220446049250313e-16, gama=0.2, weight=0.02)[source]¶ Polynomial NARMAX model
- Parameters
non_degree (int, default=2) – The nonlinearity degree of the polynomial function.
ylag (int, default=2) – The maximum lag of the output.
xlag (int, default=2) – The maximum lag of the input.
order_selection (bool, default=False) – Whether to use information criteria for order selection.
info_criteria (str, default="aic") – The information criteria method to be used.
n_terms (int, default=None) – The number of model terms to be selected. Note that n_terms overrides the information criteria values.
n_inputs (int, default=1) – The number of inputs of the system.
n_info_values (int, default=10) – The number of iterations of the information criteria method.
estimator (str, default="recursive_least_squares") – The parameter estimation method.
extended_least_squares (bool, default=True) – Whether to use the extended least squares method for parameter estimation. Note that we define a specific set of noise regressors.
aux_lag (int, default=1) – Temporary lag value used only for parameter estimation. This value is overwritten by the max_lag value and will be removed in v0.1.4.
lam (float, default=0.98) – Forgetting factor of the Recursive Least Squares method.
delta (float, default=0.01) – Normalization factor of the P matrix.
offset_covariance (float, default=0.2) – The offset covariance factor of the affine least mean squares filter.
mu (float, default=0.01) – The convergence coefficient (learning rate) of the filter.
eps (float, default=2.220446049250313e-16) – Normalization factor of the normalized filters.
gama (float, default=0.2) – The leakage factor of the Leaky LMS method.
weight (float, default=0.02) – Weight factor to control the proportions of the error norms and offers an extra degree of freedom within the adaptation of the LMS mixed norm method.
Examples
>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from sysidentpy.polynomial_basis import PolynomialNarmax
>>> from sysidentpy.metrics import root_relative_squared_error
>>> from sysidentpy.utils.generate_data import get_miso_data, get_siso_data
>>> x_train, x_valid, y_train, y_valid = get_siso_data(n=1000,
...                                                    colored_noise=True,
...                                                    sigma=0.2,
...                                                    train_percentage=90)
>>> model = PolynomialNarmax(non_degree=2,
...                          order_selection=True,
...                          n_info_values=10,
...                          extended_least_squares=False,
...                          ylag=2, xlag=2,
...                          info_criteria='aic',
...                          estimator='least_squares',
...                          )
>>> model.fit(x_train, y_train)
>>> yhat = model.predict(x_valid, y_valid)
>>> rrse = root_relative_squared_error(y_valid, yhat)
>>> print(rrse)
0.001993603325328823
>>> results = pd.DataFrame(model.results(err_precision=8,
...                                      dtype='dec'),
...                        columns=['Regressors', 'Parameters', 'ERR'])
>>> print(results)
      Regressors Parameters         ERR
0        x1(k-2)     0.9000  0.95556574
1         y(k-1)     0.1999  0.04107943
2  x1(k-1)y(k-1)     0.1000  0.00335113
References
- [1] Manuscript: Orthogonal least squares methods and their application
to non-linear system identification https://eprints.soton.ac.uk/251147/1/778742007_content.pdf
- [2] Manuscript (Portuguese): Identificação de Sistemas não Lineares
Utilizando Modelos NARMAX Polinomiais–Uma Revisão e Novos Resultados
-
error_reduction_ratio
(psi, y, process_term_number)[source]¶ Perform the Error Reduction Ratio algorithm.
- Parameters
y (array-like of shape = n_samples) – The target data used in the identification process.
psi (ndarray of floats) – The information matrix of the model.
process_term_number (int) – Number of Process Terms defined by the user.
- Returns
err (array-like of shape = number_of_model_elements) – The respective ERR calculated for each regressor.
piv (array-like of shape = number_of_model_elements) – Contains the index to put the regressors in the correct order based on err values.
psi_orthogonal (ndarray of floats) – The updated and orthogonal information matrix.
References
- [1] Manuscript: Orthogonal least squares methods and their application
to non-linear system identification https://eprints.soton.ac.uk/251147/1/778742007_content.pdf
- [2] Manuscript (Portuguese): Identificação de Sistemas não Lineares
Utilizando Modelos NARMAX Polinomiais–Uma Revisão e Novos Resultados
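The idea behind the Error Reduction Ratio can be sketched with classical Gram-Schmidt forward selection: at each step, pick the candidate whose orthogonalized column explains the largest share of the output energy. This is an illustrative variant under that assumption, not necessarily the orthogonalization used by the library; `err_forward_selection` and `n_terms` are hypothetical names.

```python
import numpy as np


def err_forward_selection(psi, y, n_terms):
    """Select `n_terms` columns of psi by ERR using classical Gram-Schmidt."""
    work = np.asarray(psi, dtype=float).copy()
    y = np.asarray(y, dtype=float).ravel()
    yy = y @ y
    remaining = list(range(work.shape[1]))
    err, piv = [], []
    for _ in range(n_terms):
        best, best_err = None, -1.0
        for j in remaining:
            w = work[:, j]
            ww = w @ w
            if ww < 1e-12:  # column already explained by earlier picks
                continue
            e = (w @ y) ** 2 / (ww * yy)
            if e > best_err:
                best, best_err = j, e
        if best is None:
            break
        err.append(best_err)
        piv.append(best)
        remaining.remove(best)
        w = work[:, best].copy()
        ww = w @ w
        for j in remaining:  # remove the chosen direction from the rest
            work[:, j] -= (w @ work[:, j]) / ww * w
    return np.array(err), np.array(piv)


rng = np.random.default_rng(0)
psi = rng.normal(size=(100, 3))
y = 3.0 * psi[:, 1]  # the output depends only on the second column
err, piv = err_forward_selection(psi, y, n_terms=2)
```

In this toy case the second column is chosen first with an ERR close to 1, and the next pick contributes almost nothing.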
-
fit
(X, y)[source]¶ Fit polynomial NARMAX model.
This is an ‘alpha’ version of the ‘fit’ function, intended to be easy to use. Given two arguments, X and y, it fits the model to the training data.
- Parameters
X (ndarray of floats) – The input data to be used in the training process.
y (ndarray of floats) – The output data to be used in the training process.
- Returns
model (ndarray of ints) – The model code representation.
piv (array-like of shape = number_of_model_elements) – Contains the index to put the regressors in the correct order based on err values.
theta (array-like of shape = number_of_model_elements) – The estimated parameters of the model.
err (array-like of shape = number_of_model_elements) – The respective ERR calculated for each regressor.
info_values (array-like of shape = n_regressor) – Vector with values of Akaike’s information criterion for models with N terms (where N is the vector position + 1).
-
predict
(X, y, steps_ahead=None)[source]¶ Return the predicted values given an input.
The predict function allows a friendly usage by the user. Given a previously trained model, predict values given a new set of data.
This method accepts y values mainly for n-steps-ahead prediction (to be implemented in the future).
- Parameters
X (ndarray of floats) – The input data to be used in the prediction process.
y (ndarray of floats) – The output data to be used in the prediction process.
steps_ahead (int, default=None) – The forecast horizon.
- Returns
yhat – The predicted values of the model.
- Return type
ndarray of floats
-
information_criterion
(X, y)[source]¶ Determine the model order.
This function uses an information criterion to determine the model size. The available criteria are:
‘Akaike’ - Akaike’s Information Criterion with critical value 2 (AIC) (default).
‘Bayes’ - Bayes Information Criterion (BIC).
‘FPE’ - Final Prediction Error (FPE).
‘LILC’ - Khundrin’s law of iterated logarithm criterion (LILC).
- Parameters
y (array-like of shape = n_samples) – Target values of the system.
X (array-like of shape = n_samples) – Input system values measured by the user.
- Returns
output_vector – Vector with values of Akaike’s information criterion for models with N terms (where N is the vector position + 1).
- Return type
array-like of shape = n_regressor
-
results
(theta_precision=4, err_precision=8, dtype='dec')[source]¶ Write the model regressors, parameters and ERR values.
This function returns the model regressors, their respective parameters, and ERR values as a string matrix.
- Parameters
theta_precision (int (default: 4)) – Precision of shown parameters values.
err_precision (int (default: 8)) – Precision of shown ERR values.
dtype (string (default: 'dec')) – Type of representation: sci - Scientific notation; dec - Decimal notation.
- Returns
output_matrix –
- Where:
First column represents each regressor element; Second column represents associated parameter; Third column represents the error reduction ratio associated to each regressor.
- Return type
string
-
compute_info_value
(n_theta, n_samples, e_var)[source]¶ Compute the information criteria value.
This function returns the information criterion value for each number of regressors. The information criterion can be AIC, BIC, LILC, or FPE.
- Parameters
n_theta (int) – Number of parameters of the model.
n_samples (int) – Number of samples given the maximum lag.
e_var (float) – Variance of the residues
- Returns
info_criteria_value – The computed value given the information criteria selected by the user.
- Return type
float
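For orientation, the four criteria are often written as a fit term plus a complexity penalty. The formulas below are the forms commonly used in the NARMAX literature and are assumed here rather than taken from the library source; `info_value` is a hypothetical helper.

```python
import math


def info_value(n_theta, n_samples, e_var, criterion="aic"):
    """Fit term n*log(e_var) plus a penalty that grows with n_theta."""
    fit_term = n_samples * math.log(e_var)
    if criterion == "aic":
        penalty = 2 * n_theta
    elif criterion == "bic":
        penalty = n_theta * math.log(n_samples)
    elif criterion == "fpe":
        penalty = n_samples * math.log(
            (n_samples + n_theta) / (n_samples - n_theta)
        )
    elif criterion == "lilc":
        penalty = 2 * n_theta * math.log(math.log(n_samples))
    else:
        raise ValueError(f"unknown criterion: {criterion}")
    return fit_term + penalty


aic = info_value(n_theta=2, n_samples=100, e_var=1.0, criterion="aic")
bic = info_value(n_theta=2, n_samples=100, e_var=1.0, criterion="bic")
```

With a residual variance of 1.0 the fit term vanishes, so the two values show the penalties alone: BIC penalizes extra terms more heavily than AIC once the sample count exceeds about 7.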
sysidentpy simulation
-
class
sysidentpy.polynomial_basis.simulation.
SimulatePolynomialNarmax
(n_inputs=1, estimator='recursive_least_squares', extended_least_squares=True, lam=0.98, delta=0.01, offset_covariance=0.2, mu=0.01, eps=2.220446049250313e-16, gama=0.2, weight=0.02, estimate_parameter=False)[source]¶ -
simulate
(X_train=None, y_train=None, X_test=None, y_test=None, model_code=None, steps_ahead=None, theta=None, plot=True)[source]¶ Simulate a model defined by the user.
- Parameters
X_train (ndarray of floats) – The input data to be used in the training process.
y_train (ndarray of floats) – The output data to be used in the training process.
X_test (ndarray of floats) – The input data to be used in the prediction process.
y_test (ndarray of floats) – The output data (initial conditions) to be used in the prediction process.
model_code (ndarray of int) – Flattened list of input or output regressors.
steps_ahead (int, default=None) – The forecast horizon.
theta (array-like of shape = number_of_model_elements) – The parameters of the model.
plot (bool, default=True) – Indicate if the user wants to plot or not.
- Returns
yhat (ndarray of floats) – The predicted values of the model.
results (string) –
- Where:
First column represents each regressor element; Second column represents associated parameter; Third column represents the error reduction ratio associated to each regressor.
-
sysidentpy narx_neural_network
sysidentpy general_estimators
Build NARX Models Using general estimators
-
class
sysidentpy.general_estimators.narx.
NARX
(non_degree=1, ylag=2, xlag=2, n_inputs=1, base_estimator=None, fit_params={})[source]¶ NARX model built on top of general estimators
Currently it is possible to use any estimator that has fit/predict methods as an autoregressive model. We use our GenerateRegressors and InformationMatrix classes to handle the creation of the lagged features, so a simple fit and predict call is enough to run infinity-steps-ahead prediction.
- Parameters
non_degree (int, default=1) – The nonlinearity degree of the polynomial function.
ylag (int, default=2) – The maximum lag of the output.
xlag (int, default=2) – The maximum lag of the input.
n_inputs (int, default=1) – The number of inputs of the system.
fit_params (dict, default={}) – Optional parameters passed to the fit function of the base estimator.
base_estimator (default=None) – The base estimator, e.g. one from sklearn.
verbose (bool, default=False) – Print messages
Examples
>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from sysidentpy.metrics import mean_squared_error
>>> from sysidentpy.utils.generate_data import get_siso_data
>>> from sysidentpy.general_estimators import NARX
>>> from sklearn.linear_model import BayesianRidge # to use as base estimator
>>> x_train, x_valid, y_train, y_valid = get_siso_data(n=1000,
...                                                    colored_noise=False,
...                                                    sigma=0.01,
...                                                    train_percentage=80)
>>> BayesianRidge_narx = NARX(base_estimator=BayesianRidge(),
...                           xlag=2,
...                           ylag=2
...                           )
>>> BayesianRidge_narx.fit(x_train, y_train)
>>> yhat = BayesianRidge_narx.predict(x_valid, y_valid)
>>> print(mean_squared_error(y_valid, yhat))
0.000131
-
data_preparation
(X, y)[source]¶ Return the lagged matrix and the y values given the maximum lags.
- Parameters
X (ndarray of floats) – The input data.
y (ndarray of floats) – The output data.
- Returns
y (ndarray of floats) – The y values considering the lags.
reg_matrix (ndarray of floats) – The information matrix of the model.
-
fit
(X, y)[source]¶ Train a NARX Neural Network model.
This is a training pipeline intended to be easy to use. All the lagged features are built using the SysIdentPy classes, and the fit method of the sklearn base estimator is used to fit the model.
- Parameters
X (ndarray of floats) – The input data to be used in the training process.
y (ndarray of floats) – The output data to be used in the training process.
- Returns
base_estimator – The model fitted.
- Return type
sklearn estimator
-
predict
(X, y_initial)[source]¶ Return the predicted values given an input and initial values.
The predict function is intended to be easy to use. Given a trained model, it predicts values for a new set of data.
This method accepts y values mainly for n-steps-ahead prediction (to be implemented in the future).
Currently we only support infinity-steps-ahead prediction, but running 1-step-ahead prediction manually is straightforward.
- Parameters
X (ndarray of floats) – The input data to be used in the prediction process.
y_initial (ndarray of floats) – The output data (initial conditions) to be used in the prediction process.
- Returns
yhat – The predicted values of the model.
- Return type
ndarray of floats
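Infinity-steps-ahead (free-run) prediction feeds the model's own past predictions back as output regressors, using measured outputs only as initial conditions. The sketch below illustrates the loop with a plain callable in place of a fitted estimator; `free_run`, `predict_one`, and the feature order [y(k-1), y(k-2), x(k-1), x(k-2)] are assumptions made for the example.

```python
import numpy as np


def free_run(predict_one, x, y_initial, ylag=2, xlag=2):
    """Simulate the model recursively, reusing its own past predictions."""
    x = np.asarray(x, dtype=float).ravel()
    y_initial = np.asarray(y_initial, dtype=float).ravel()
    max_lag = max(ylag, xlag)
    yhat = np.zeros(len(x))
    yhat[:max_lag] = y_initial[:max_lag]  # measured initial conditions
    for k in range(max_lag, len(x)):
        past_y = yhat[k - ylag:k][::-1]  # y(k-1), y(k-2), ...
        past_x = x[k - xlag:k][::-1]     # x(k-1), x(k-2), ...
        yhat[k] = predict_one(np.concatenate([past_y, past_x]))
    return yhat


# Toy system: y(k) = 0.5*y(k-1) + x(k-1)
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 50)
y = np.zeros(50)
for k in range(1, 50):
    y[k] = 0.5 * y[k - 1] + x[k - 1]

yhat = free_run(lambda f: 0.5 * f[0] + f[2], x, y)
```

A 1-step-ahead prediction would use the measured `y` instead of `yhat` when building `past_y`.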
sysidentpy residues
-
class
sysidentpy.residues.residues_correlation.
ResiduesAnalysis
[source]¶ Bases:
object
Residues analysis for Polynomial NARX model.
-
residuals
(X, y, yhat)[source]¶ Performs the residual analysis of output to validate model.
- Parameters
y (array-like of shape = n_samples) – The target data used in the identification process.
yhat (array-like of shape = n_samples) – The prediction values of the identification process.
X (ndarray of floats) – The input data.
- Returns
output_autocorr (ndarray of floats:) – 1st column - Residuals normalized autocorrelation. 2nd/3rd columns - Superior and inferior limits of a 95% confidence interval.
output_crosscorr (ndarray of floats:) – 1st column - Correlation between residuals and input. 2nd/3rd columns - Superior and inferior limits of a 95% confidence interval.
Examples
>>> y = [3, -0.5, 2, 7]
>>> autocorr(y)
[62.25 11.5 2.5 21. ]
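The numbers in the example match a raw (unnormalized) autocorrelation for non-negative lags. A sketch with NumPy follows; the 95% band of ±1.96/sqrt(N) is the usual large-sample approximation and is assumed here, and the helper names are hypothetical.

```python
import numpy as np


def autocorr(e):
    """Raw autocorrelation of e for lags 0, 1, ..., len(e) - 1."""
    e = np.asarray(e, dtype=float)
    full = np.correlate(e, e, mode="full")
    return full[len(e) - 1:]


def normalized_acf_with_bounds(e):
    """Autocorrelation scaled so lag 0 equals 1, plus an approximate 95% band."""
    acf = autocorr(e)
    bound = 1.96 / np.sqrt(len(e))
    return acf / acf[0], bound, -bound


acf = autocorr([3, -0.5, 2, 7])
normalized, upper, lower = normalized_acf_with_bounds([3, -0.5, 2, 7])
print(acf)  # [62.25 11.5   2.5  21.  ]
```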
-
plot_result
(y, yhat, e_acf, xe_ccf, figsize=(10, 8), n=100)[source]¶ Plot the free run simulation and residues analysis.
- Parameters
y (array-like of shape = n_samples) – The target data used in the identification process.
yhat (array-like of shape = n_samples) – The prediction values of the identification process.
e_acf (ndarray of floats:) – 1st column - Residuals normalized autocorrelation. 2nd/3rd columns - Superior and inferior limits of a 95% confidence interval.
xe_ccf (ndarray of floats:) – 1st column - Correlation between residuals and input. 2nd/3rd columns - Superior and inferior limits of a 95% confidence interval.
-
sysidentpy metrics
Common metrics to assess performance on NARX models.
-
sysidentpy.metrics._regression.
forecast_error
(y, y_predicted)[source]¶ Calculate the forecast error in a regression model.
- Parameters
y (array-like of shape = number_of_outputs) – Represent the target values.
y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.
- Returns
loss – The difference between the true target values and the predicted or forecast value in regression or any other phenomenon.
- Return type
ndarray of floats
References
- [1] Wikipedia entry on the Forecast error
Examples
>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> forecast_error(y, y_predicted)
[0.5, -0.5, 0, -1]
-
sysidentpy.metrics._regression.
mean_forecast_error
(y, y_predicted)[source]¶ Calculate the mean of forecast error of a regression model.
- Parameters
y (array-like of shape = number_of_outputs) – Represent the target values.
y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.
- Returns
loss – The mean value of the difference between the true target values and the predicted or forecast value in regression or any other phenomenon.
- Return type
float
References
- [1] Wikipedia entry on the Forecast error
Examples
>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> mean_forecast_error(y, y_predicted)
-0.25
-
sysidentpy.metrics._regression.
mean_squared_error
(y, y_predicted)[source]¶ Calculate the Mean Squared Error.
- Parameters
y (array-like of shape = number_of_outputs) – Represent the target values.
y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.
- Returns
loss – MSE is non-negative. A value of 0.0 means the model outputs exactly match the true target values.
- Return type
float
References
- [1] Wikipedia entry on the Mean Squared Error
Examples
>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> mean_squared_error(y, y_predicted)
0.375
-
sysidentpy.metrics._regression.
root_mean_squared_error
(y, y_predicted)[source]¶ Calculate the Root Mean Squared Error.
- Parameters
y (array-like of shape = number_of_outputs) – Represent the target values.
y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.
- Returns
loss – RMSE is non-negative. A value of 0.0 means the model outputs exactly match the true target values.
- Return type
float
References
- [1] Wikipedia entry on the Root Mean Squared Error
Examples
>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> root_mean_squared_error(y, y_predicted)
0.612
-
sysidentpy.metrics._regression.
normalized_root_mean_squared_error
(y, y_predicted)[source]¶ Calculate the normalized Root Mean Squared Error.
- Parameters
y (array-like of shape = number_of_outputs) – Represent the target values.
y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.
- Returns
loss – nRMSE is non-negative. A value of 0.0 means the model outputs exactly match the true target values.
- Return type
float
References
- [1] Wikipedia entry on the normalized Root Mean Squared Error
Examples
>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> normalized_root_mean_squared_error(y, y_predicted)
0.081
-
sysidentpy.metrics._regression.
root_relative_squared_error
(y, y_predicted)[source]¶ Calculate the Root Relative Squared Error.
- Parameters
y (array-like of shape = number_of_outputs) – Represent the target values.
y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.
- Returns
loss – RRSE is non-negative. A value of 0.0 means the model outputs exactly match the true target values.
- Return type
float
Examples
>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> root_relative_squared_error(y, y_predicted)
0.206
-
sysidentpy.metrics._regression.
mean_absolute_error
(y, y_predicted)[source]¶ Calculate the Mean absolute error.
- Parameters
y (array-like of shape = number_of_outputs) – Represent the target values.
y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.
- Returns
loss – MAE is non-negative. A value of 0.0 means the model outputs exactly match the true target values.
- Return type
float or ndarray of floats
References
- [1] Wikipedia entry on the Mean absolute error
Examples
>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> mean_absolute_error(y, y_predicted)
0.5
-
sysidentpy.metrics._regression.
mean_squared_log_error
(y, y_predicted)[source]¶ Calculate the Mean Squared Logarithmic Error.
- Parameters
y (array-like of shape = number_of_outputs) – Represent the target values.
y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.
- Returns
loss – MSLE is non-negative. A value of 0.0 means the model outputs exactly match the true target values.
- Return type
float
Examples
>>> y = [3, 5, 2.5, 7]
>>> y_predicted = [2.5, 5, 4, 8]
>>> mean_squared_log_error(y, y_predicted)
0.039
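The value in the example is consistent with the usual definition of MSLE, the mean of squared differences between log(1 + y) and log(1 + y_predicted). A sketch assuming that definition (`msle_sketch` is a hypothetical name):

```python
import numpy as np


def msle_sketch(y, y_predicted):
    """Mean squared difference of log(1 + value); inputs must be > -1."""
    y = np.asarray(y, dtype=float)
    y_predicted = np.asarray(y_predicted, dtype=float)
    return np.mean((np.log1p(y) - np.log1p(y_predicted)) ** 2)


msle = msle_sketch([3, 5, 2.5, 7], [2.5, 5, 4, 8])  # about 0.0397
```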
-
sysidentpy.metrics._regression.
median_absolute_error
(y, y_predicted)[source]¶ Calculate the Median Absolute Error.
- Parameters
y (array-like of shape = number_of_outputs) – Represent the target values.
y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.
- Returns
loss – MdAE is non-negative. A value of 0.0 means the model outputs exactly match the true target values.
- Return type
float
References
- [1] Wikipedia entry on the Median absolute deviation
Examples
>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> median_absolute_error(y, y_predicted)
0.5
-
sysidentpy.metrics._regression.
explained_variance_score
(y, y_predicted)[source]¶ Calculate the Explained Variance Score.
- Parameters
y (array-like of shape = number_of_outputs) – Represent the target values.
y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.
- Returns
loss – EVS is at most 1.0 and can be negative for poor models. A value of 1.0 means the model outputs exactly match the true target values. Lower values mean worse results.
- Return type
float
References
- [1] Wikipedia entry on the Explained Variance
Examples
>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> explained_variance_score(y, y_predicted)
0.957
-
sysidentpy.metrics._regression.
r2_score
(y, y_predicted)[source]¶ Calculate the R2 score.
- Parameters
y (array-like of shape = number_of_outputs) – Represent the target values.
y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.
- Returns
loss – R2 can be positive or negative. A value of 1.0 means the model outputs exactly match the true target values. Lower values mean worse results.
- Return type
float
Notes
This is not a symmetric function.
References
- [1] Wikipedia entry on the Coefficient of determination
Examples
>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> r2_score(y, y_predicted)
0.948
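R2 and the explained variance score are easy to confuse: R2 uses the raw squared errors, while explained variance first removes the mean of the errors. The sketch below uses the textbook definitions (assumed, with hypothetical function names) and shows that the two differ on the example data.

```python
import numpy as np


def r2_sketch(y, y_predicted):
    """1 - (residual sum of squares) / (total sum of squares)."""
    y = np.asarray(y, dtype=float)
    y_predicted = np.asarray(y_predicted, dtype=float)
    ss_res = np.sum((y - y_predicted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def explained_variance_sketch(y, y_predicted):
    """1 - Var(errors) / Var(y); a constant error offset is not penalized."""
    y = np.asarray(y, dtype=float)
    y_predicted = np.asarray(y_predicted, dtype=float)
    return 1.0 - np.var(y - y_predicted) / np.var(y)


y = [3, -0.5, 2, 7]
y_predicted = [2.5, 0.0, 2, 8]
r2 = r2_sketch(y, y_predicted)                   # about 0.9486
evs = explained_variance_sketch(y, y_predicted)  # about 0.9572
```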
-
sysidentpy.metrics._regression.
symmetric_mean_absolute_percentage_error
(y, y_predicted)[source]¶ Calculate the SMAPE score.
- Parameters
y (array-like of shape = number_of_outputs) – Represent the target values.
y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.
- Returns
loss – SMAPE output is a non-negative value. The results are percentage values.
- Return type
float
Notes
One supposed problem with SMAPE is that it is not symmetric since over-forecasts and under-forecasts are not treated equally.
References
- [1] Wikipedia entry on the Symmetric mean absolute percentage error
https://en.wikipedia.org/wiki/Symmetric_mean_absolute_percentage_error
Examples
>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> symmetric_mean_absolute_percentage_error(y, y_predicted)
57.87
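The example value agrees with the SMAPE variant that divides each absolute error by the mean of |y| and |y_predicted| and reports a percentage. A sketch assuming that variant (`smape_sketch` is a hypothetical name):

```python
import numpy as np


def smape_sketch(y, y_predicted):
    """100/n * sum(|error| / ((|y| + |y_predicted|) / 2))."""
    y = np.asarray(y, dtype=float)
    y_predicted = np.asarray(y_predicted, dtype=float)
    scale = (np.abs(y) + np.abs(y_predicted)) / 2.0
    return 100.0 * np.mean(np.abs(y_predicted - y) / scale)


smape = smape_sketch([3, -0.5, 2, 7], [2.5, 0.0, 2, 8])  # about 57.88
```

Note that the second sample (-0.5 versus 0.0) alone contributes the maximum of 200% to the average, which shows how sensitive the metric is to targets near zero.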
sysidentpy estimators
Least Squares Methods for parameter estimation
-
class
sysidentpy.parameter_estimation.estimators.
Estimators
(aux_lag=1, lam=0.98, delta=0.01, offset_covariance=0.2, mu=0.01, eps=2.220446049250313e-16, gama=0.2, weight=0.02)[source]¶ Ordinary least squares for linear parameter estimation
-
least_squares
(psi, y)[source]¶ Estimate the model parameters using Least Squares method.
- Parameters
psi (ndarray of floats) – The information matrix of the model.
y (array-like of shape = y_training) – The data used to train the model.
- Returns
theta – The estimated parameters of the model.
- Return type
array-like of shape = number_of_model_elements
References
- [1] Manuscript: Sorenson, H. W. (1970). Least-squares estimation:
from Gauss to Kalman. IEEE spectrum, 7(7), 63-68. http://pzs.dstu.dp.ua/DataMining/mls/bibl/Gauss2Kalman.pdf
- [2] Book (Portuguese): Aguirre, L. A. (2007). Introdução à identificação
de sistemas: técnicas lineares e não-lineares aplicadas a sistemas reais. Editora da UFMG. 3a edição.
- [3] Manuscript: Markovsky, I., & Van Huffel, S. (2007).
Overview of total least-squares methods. Signal processing, 87(10), 2283-2302. https://eprints.soton.ac.uk/263855/1/tls_overview.pdf
- [4] Wikipedia entry on Least Squares
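For a model that is linear in its parameters, y = psi @ theta, the least squares estimate can be obtained with NumPy's solver. This is a generic sketch of the technique, not the library's code; the data and names are made up for illustration.

```python
import numpy as np


def least_squares_sketch(psi, y):
    """Solve min ||psi @ theta - y||^2 with a numerically stable solver."""
    theta, *_ = np.linalg.lstsq(psi, y, rcond=None)
    return theta


rng = np.random.default_rng(0)
psi = rng.normal(size=(200, 2))
theta_true = np.array([2.0, -1.0])
y = psi @ theta_true + 0.01 * rng.normal(size=200)  # small measurement noise

theta_hat = least_squares_sketch(psi, y)
```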
-
total_least_squares
(psi, y)[source]¶ Estimate the model parameters using Total Least Squares method.
- Parameters
psi (ndarray of floats) – The information matrix of the model.
y (array-like of shape = y_training) – The data used to train the model.
- Returns
theta – The estimated parameters of the model.
- Return type
array-like of shape = number_of_model_elements
References
- [1] Manuscript: Golub, G. H., & Van Loan, C. F. (1980).
An analysis of the total least squares problem. SIAM journal on numerical analysis, 17(6), 883-893.
- [2] Manuscript: Markovsky, I., & Van Huffel, S. (2007).
Overview of total least-squares methods. Signal processing, 87(10), 2283-2302. https://eprints.soton.ac.uk/263855/1/tls_overview.pdf
- [3] Wikipedia entry on Total Least Squares
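Total least squares allows for errors in both psi and y. The classical solution takes the right singular vector associated with the smallest singular value of the augmented matrix [psi, y]. The sketch below follows that textbook construction with made-up data; it is not taken from the library.

```python
import numpy as np


def total_least_squares_sketch(psi, y):
    """Classical TLS via the SVD of the augmented matrix [psi, y]."""
    psi = np.asarray(psi, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    n = psi.shape[1]
    _, _, vh = np.linalg.svd(np.hstack([psi, y]), full_matrices=False)
    v = vh.T  # columns are right singular vectors
    return -v[:n, n] / v[n, n]


rng = np.random.default_rng(0)
psi = rng.normal(size=(50, 2))
theta_true = np.array([2.0, -1.0])
y = psi @ theta_true  # noise-free, so TLS recovers theta exactly

theta_hat = total_least_squares_sketch(psi, y)
```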
-
recursive_least_squares
(psi, y)[source]¶ Estimate the model parameters using the Recursive Least Squares method.
The implementation considers the forgetting factor.
- Parameters
psi (ndarray of floats) – The information matrix of the model.
y (array-like of shape = y_training) – The data used to train the model.
- Returns
theta – The estimated parameters of the model.
- Return type
array-like of shape = number_of_model_elements
Notes
A more in-depth documentation of all methods for parameters estimation will be available soon. For now, please refer to the mentioned references.
References
- [1] Book (Portuguese): Aguirre, L. A. (2007). Introdução à identificação
de sistemas: técnicas lineares e não-lineares aplicadas a sistemas reais. Editora da UFMG. 3a edição.
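A textbook form of recursive least squares with a forgetting factor is sketched below. It assumes that `lam` is the forgetting factor and that the covariance matrix is initialized as P = I / delta, which matches the parameter descriptions above but is not confirmed against the library source.

```python
import numpy as np


def rls_sketch(psi, y, lam=0.98, delta=0.01):
    """Update theta one sample at a time, discounting old data by lam."""
    n_samples, n_theta = psi.shape
    theta = np.zeros(n_theta)
    p = np.eye(n_theta) / delta  # large initial covariance (assumed form)
    for k in range(n_samples):
        phi = psi[k]
        gain = p @ phi / (lam + phi @ p @ phi)
        theta = theta + gain * (y[k] - phi @ theta)
        p = (p - np.outer(gain, phi) @ p) / lam
    return theta


rng = np.random.default_rng(0)
psi = rng.normal(size=(200, 2))
theta_true = np.array([2.0, -1.0])
y = psi @ theta_true

theta_hat = rls_sketch(psi, y)
```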
-
affine_least_mean_squares
(psi, y)[source]¶ Estimate the model parameters using the Affine Least Mean Squares.
- Parameters
psi (ndarray of floats) – The information matrix of the model.
y (array-like of shape = y_training) – The data used to train the model.
- Returns
theta – The estimated parameters of the model.
- Return type
array-like of shape = number_of_model_elements
Notes
A more in-depth documentation of all methods for parameters estimation will be available soon. For now, please refer to the mentioned references.
References
- [1] Book: Poularikas, A. D. (2017). Adaptive filtering: Fundamentals
of least mean squares with MATLAB®. CRC Press.
-
least_mean_squares
(psi, y)[source]¶ Estimate the model parameters using the Least Mean Squares filter.
- Parameters
psi (ndarray of floats) – The information matrix of the model.
y (array-like of shape = y_training) – The data used to train the model.
- Returns
theta – The estimated parameters of the model.
- Return type
array-like of shape = number_of_model_elements
Notes
A more in-depth documentation of all methods for parameters estimation will be available soon. For now, please refer to the mentioned references.
References
- [1] Book: Haykin, S., & Widrow, B. (Eds.). (2003). Least-mean-square
adaptive filters (Vol. 31). John Wiley & Sons.
- [2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,
análise estatística e novas estratégias de algoritmos LMS de passo variável.
- [3] Wikipedia entry on Least Mean Squares
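The basic LMS filter moves the parameters along the current regressor vector, scaled by the prediction error and the learning rate `mu`. A minimal sketch of that standard update on made-up, noise-free data (not the library's implementation):

```python
import numpy as np


def lms_sketch(psi, y, mu=0.05):
    """theta <- theta + mu * error * regressor, one sample at a time."""
    theta = np.zeros(psi.shape[1])
    for phi, target in zip(psi, y):
        error = target - phi @ theta
        theta = theta + mu * error * phi
    return theta


rng = np.random.default_rng(0)
psi = rng.normal(size=(2000, 2))
theta_true = np.array([2.0, -1.0])
y = psi @ theta_true

theta_hat = lms_sketch(psi, y)
```

Smaller values of `mu` converge more slowly but are less sensitive to noise; values that are too large make the recursion diverge.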
-
least_mean_squares_sign_error
(psi, y)[source]¶ Parameter estimation using the Sign-Error Least Mean Squares filter.
The sign-error LMS algorithm uses the sign of the error vector to change the filter coefficients.
- Parameters
psi (ndarray of floats) – The information matrix of the model.
y (array-like of shape = y_training) – The data used to train the model.
- Returns
theta – The estimated parameters of the model.
- Return type
array-like of shape = number_of_model_elements
Notes
A more in-depth documentation of all methods for parameters estimation will be available soon. For now, please refer to the mentioned references.
References
- [1] Book: Hayes, M. H. (2009). Statistical digital signal processing
and modeling. John Wiley & Sons.
- [2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,
análise estatística e novas estratégias de algoritmos LMS de passo variável.
- [3] Wikipedia entry on Least Mean Squares
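In the sign-error variant only the sign of the error enters the update, so every step has the same size regardless of how large the error is. The sketch below shows the standard form of this update on made-up data; the step size and sample count are illustrative choices.

```python
import numpy as np


def sign_error_lms_sketch(psi, y, mu=0.005):
    """theta <- theta + mu * sign(error) * regressor."""
    theta = np.zeros(psi.shape[1])
    for phi, target in zip(psi, y):
        error = target - phi @ theta
        theta = theta + mu * np.sign(error) * phi
    return theta


rng = np.random.default_rng(0)
psi = rng.normal(size=(5000, 2))
theta_true = np.array([2.0, -1.0])
y = psi @ theta_true

theta_hat = sign_error_lms_sketch(psi, y)
remaining_error = np.linalg.norm(theta_hat - theta_true)
```

Because the step never shrinks, the estimate keeps jittering in a small neighbourhood of the solution whose size is proportional to `mu`.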
-
normalized_least_mean_squares
(psi, y)[source]¶ Parameter estimation using the Normalized Least Mean Squares filter.
The normalization is used to avoid numerical instability when updating the estimated parameters.
- Parameters
psi (ndarray of floats) – The information matrix of the model.
y (array-like of shape = y_training) – The data used to train the model.
- Returns
theta – The estimated parameters of the model.
- Return type
array-like of shape = number_of_model_elements
Notes
More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.
References
- [1] Book: Hayes, M. H. (2009). Statistical digital signal processing
and modeling. John Wiley & Sons.
- [2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,
análise estatística e novas estratégias de algoritmos LMS de passo variável.
- [3] Wikipedia entry on Least Mean Squares
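The normalization can be sketched as dividing each update by the squared norm of the current regressor row, so the effective step no longer depends on the scale of psi. The example below is a minimal textbook version, not the library's code; the names `normalized_lms`, `mu`, and `eps` (a small constant guarding against division by zero) are assumptions.

```python
import numpy as np


def normalized_lms(psi, y, mu=0.5, eps=1e-8):
    """Textbook normalized LMS sketch (hypothetical helper, not the library code)."""
    psi = np.asarray(psi, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    n_samples, n_terms = psi.shape
    theta = np.zeros(n_terms)
    for k in range(n_samples):
        x = psi[k]
        error = y[k] - x @ theta  # a priori prediction error
        # Scale the step by the regressor energy; eps avoids division by zero.
        theta = theta + mu * error * x / (eps + x @ x)
    return theta


# Toy, noise-free data with badly scaled regressors (assumed for illustration).
rng = np.random.default_rng(0)
psi = 100.0 * rng.normal(size=(500, 2))
true_theta = np.array([0.5, -0.3])
y = psi @ true_theta
theta_hat = normalized_lms(psi, y)
```

The regressors here are deliberately scaled by 100; an unnormalized update with the same `mu` would diverge on this data, while the normalized one is unaffected by the scale.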
-
least_mean_squares_normalized_sign_error
(psi, y)[source]¶ Parameter estimation using the Normalized Sign-Error LMS filter.
The normalization is used to avoid numerical instability when updating the estimated parameters, and the sign of the error vector is used to change the filter coefficients.
- Parameters
psi (ndarray of floats) – The information matrix of the model.
y_train (array-like of shape = y_training) – The data used to train the model.
- Returns
theta – The estimated parameters of the model.
- Return type
array-like of shape = number_of_model_elements
Notes
More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.
References
- [1] Book: Hayes, M. H. (2009). Statistical digital signal processing
and modeling. John Wiley & Sons.
- [2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,
análise estatística e novas estratégias de algoritmos LMS de passo variável.
- [3] Wikipedia entry on Least Mean Squares
-
least_mean_squares_sign_regressor
(psi, y)[source]¶ Parameter estimation using the Sign-Regressor LMS filter.
The sign-regressor LMS algorithm uses the sign of the information matrix to change the filter coefficients.
- Parameters
psi (ndarray of floats) – The information matrix of the model.
y_train (array-like of shape = y_training) – The data used to train the model.
- Returns
theta – The estimated parameters of the model.
- Return type
array-like of shape = number_of_model_elements
Notes
More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.
References
- [1] Book: Hayes, M. H. (2009). Statistical digital signal processing
and modeling. John Wiley & Sons.
- [2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,
análise estatística e novas estratégias de algoritmos LMS de passo variável.
- [3] Wikipedia entry on Least Mean Squares
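For comparison with the sign-error variant, a sign-regressor update keeps the full error but replaces each regressor value with its sign: theta <- theta + mu * e * sign(psi_k). The sketch below is a textbook version under that assumption; `sign_regressor_lms` and `mu` are names chosen for this example, not the library's API.

```python
import numpy as np


def sign_regressor_lms(psi, y, mu=0.01):
    """Textbook sign-regressor LMS sketch (hypothetical helper, not the library code)."""
    psi = np.asarray(psi, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    n_samples, n_terms = psi.shape
    theta = np.zeros(n_terms)
    for k in range(n_samples):
        error = y[k] - psi[k] @ theta  # the error still uses the real regressors
        # Only the update direction is quantized to the sign of the regressors.
        theta = theta + mu * error * np.sign(psi[k])
    return theta


# Toy, noise-free data (assumed for illustration only).
rng = np.random.default_rng(0)
psi = rng.normal(size=(5000, 2))
true_theta = np.array([0.5, -0.3])
y = psi @ true_theta
theta_hat = sign_regressor_lms(psi, y)
```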
-
least_mean_squares_normalized_sign_regressor
(psi, y)[source]¶ Parameter estimation using the Normalized Sign-Regressor LMS filter.
The normalization is used to avoid numerical instability when updating the estimated parameters and the sign of the information matrix is used to change the filter coefficients.
- Parameters
psi (ndarray of floats) – The information matrix of the model.
y_train (array-like of shape = y_training) – The data used to train the model.
- Returns
theta – The estimated parameters of the model.
- Return type
array-like of shape = number_of_model_elements
Notes
More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.
References
- [1] Book: Hayes, M. H. (2009). Statistical digital signal processing
and modeling. John Wiley & Sons.
- [2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,
análise estatística e novas estratégias de algoritmos LMS de passo variável.
- [3] Wikipedia entry on Least Mean Squares
-
least_mean_squares_sign_sign
(psi, y)[source]¶ Parameter estimation using the Sign-Sign LMS filter.
The sign-sign LMS algorithm uses both the sign of the information matrix and the sign of the error vector to change the filter coefficients.
- Parameters
psi (ndarray of floats) – The information matrix of the model.
y_train (array-like of shape = y_training) – The data used to train the model.
- Returns
theta – The estimated parameters of the model.
- Return type
array-like of shape = number_of_model_elements
Notes
More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.
References
- [1] Book: Hayes, M. H. (2009). Statistical digital signal processing
and modeling. John Wiley & Sons.
- [2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,
análise estatística e novas estratégias de algoritmos LMS de passo variável.
- [3] Wikipedia entry on Least Mean Squares
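When both signs are used, every coefficient moves by exactly +mu or -mu at each sample. The following sketch shows that textbook rule, theta <- theta + mu * sign(e) * sign(psi_k); the function name `sign_sign_lms`, the step size `mu`, and the toy data are assumptions for illustration and do not reproduce the library's code.

```python
import numpy as np


def sign_sign_lms(psi, y, mu=0.002):
    """Textbook sign-sign LMS sketch (hypothetical helper, not the library code)."""
    psi = np.asarray(psi, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    n_samples, n_terms = psi.shape
    theta = np.zeros(n_terms)
    for k in range(n_samples):
        error = y[k] - psi[k] @ theta  # a priori prediction error
        # Both factors are reduced to their signs, so each step is +/- mu.
        theta = theta + mu * np.sign(error) * np.sign(psi[k])
    return theta


# Toy, noise-free data (assumed for illustration only).
rng = np.random.default_rng(0)
psi = rng.normal(size=(5000, 2))
true_theta = np.array([0.5, -0.3])
y = psi @ true_theta
theta_hat = sign_sign_lms(psi, y)
```

Since the step size is fixed, the estimate settles into a small neighbourhood of the true parameters whose width is on the order of `mu`.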
-
least_mean_squares_normalized_sign_sign
(psi, y)[source]¶ Parameter estimation using the Normalized Sign-Sign LMS filter.
The normalization is used to avoid numerical instability when updating the estimated parameters and both the sign of the information matrix and the sign of the error vector are used to change the filter coefficients.
- Parameters
psi (ndarray of floats) – The information matrix of the model.
y_train (array-like of shape = y_training) – The data used to train the model.
- Returns
theta – The estimated parameters of the model.
- Return type
array-like of shape = number_of_model_elements
Notes
More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.
References
- [1] Book: Hayes, M. H. (2009). Statistical digital signal processing
and modeling. John Wiley & Sons.
- [2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,
análise estatística e novas estratégias de algoritmos LMS de passo variável.
- [3] Wikipedia entry on Least Mean Squares
-
least_mean_squares_normalized_leaky
(psi, y)[source]¶ Parameter estimation using the Normalized Leaky LMS filter.
When the leakage factor, gama, is set to 0, there is no leakage in the estimation process.
- Parameters
psi (ndarray of floats) – The information matrix of the model.
y_train (array-like of shape = y_training) – The data used to train the model.
- Returns
theta – The estimated parameters of the model.
- Return type
array-like of shape = number_of_model_elements
Notes
More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.
References
- [1] Book: Hayes, M. H. (2009). Statistical digital signal processing
and modeling. John Wiley & Sons.
- [2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,
análise estatística e novas estratégias de algoritmos LMS de passo variável.
- [3] Wikipedia entry on Least Mean Squares
-
least_mean_squares_leaky
(psi, y)[source]¶ Parameter estimation using the Leaky LMS filter.
When the leakage factor, gama, is set to 0, there is no leakage in the estimation process.
- Parameters
psi (ndarray of floats) – The information matrix of the model.
y_train (array-like of shape = y_training) – The data used to train the model.
- Returns
theta – The estimated parameters of the model.
- Return type
array-like of shape = number_of_model_elements
Notes
More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.
References
- [1] Book: Hayes, M. H. (2009). Statistical digital signal processing
and modeling. John Wiley & Sons.
- [2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,
análise estatística e novas estratégias de algoritmos LMS de passo variável.
- [3] Wikipedia entry on Least Mean Squares
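The role of the leakage factor can be sketched with the common textbook form theta <- (1 - mu * gama) * theta + mu * e * psi_k, which shrinks the parameters slightly at every step. This is an assumed form for illustration, not a statement about the library's internals; `leaky_lms` and `mu` are hypothetical names, while `gama` follows the spelling used in the description above.

```python
import numpy as np


def leaky_lms(psi, y, mu=0.01, gama=0.0):
    """Textbook leaky LMS sketch (hypothetical helper, not the library code)."""
    psi = np.asarray(psi, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    n_samples, n_terms = psi.shape
    theta = np.zeros(n_terms)
    for k in range(n_samples):
        error = y[k] - psi[k] @ theta  # a priori prediction error
        # The (1 - mu * gama) factor pulls theta toward zero; gama=0 is plain LMS.
        theta = (1.0 - mu * gama) * theta + mu * error * psi[k]
    return theta


# Toy, noise-free data (assumed for illustration only).
rng = np.random.default_rng(0)
psi = rng.normal(size=(5000, 2))
true_theta = np.array([0.5, -0.3])
y = psi @ true_theta

theta_no_leak = leaky_lms(psi, y, gama=0.0)  # reduces to ordinary LMS
theta_leaky = leaky_lms(psi, y, gama=0.1)    # biased toward zero
```

With `gama=0` the sketch recovers the true parameters on this data, while a positive `gama` trades a small bias toward zero for bounded parameter growth.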
-
least_mean_squares_fourth
(psi, y)[source]¶ Parameter estimation using the LMS Fourth filter.
The LMS-fourth (least mean fourth) algorithm uses the cube of the error, rather than the error itself, to change the filter coefficients.
- Parameters
psi (ndarray of floats) – The information matrix of the model.
y_train (array-like of shape = y_training) – The data used to train the model.
- Returns
theta – The estimated parameters of the model.
- Return type
array-like of shape = number_of_model_elements
Notes
More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.
References
- [1] Book: Hayes, M. H. (2009). Statistical digital signal processing
and modeling. John Wiley & Sons.
- [2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,
análise estatística e novas estratégias de algoritmos LMS de passo variável.
- [3] Manuscript: Gui, G., Mehbodniya, A., & Adachi, F. (2013).
Least mean square/fourth algorithm with application to sparse channel estimation. arXiv preprint arXiv:1304.3911. https://arxiv.org/pdf/1304.3911.pdf
- [4] Manuscript: Nascimento, V. H., & Bermudez, J. C. M. (2005, March).
When is the least-mean fourth algorithm mean-square stable? In Proceedings.(ICASSP’05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005. (Vol. 4, pp. iv-341). IEEE. http://www.lps.usp.br/vitor/artigos/icassp05.pdf
- [5] Wikipedia entry on Least Mean Squares
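A least-mean-fourth update, as described in the references above, follows the gradient of the fourth power of the error, which gives theta <- theta + mu * e**3 * psi_k. The sketch below illustrates that rule under stated assumptions: the name `lms_fourth`, the step size `mu`, and the bounded toy regressors are choices made for this example, not the library's implementation.

```python
import numpy as np


def lms_fourth(psi, y, mu=0.1):
    """Textbook least-mean-fourth sketch (hypothetical helper, not the library code)."""
    psi = np.asarray(psi, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    n_samples, n_terms = psi.shape
    theta = np.zeros(n_terms)
    for k in range(n_samples):
        error = y[k] - psi[k] @ theta  # a priori prediction error
        # The cubed error comes from differentiating error**4.
        theta = theta + mu * error**3 * psi[k]
    return theta


# Bounded regressors keep the cubic term well behaved in this toy example.
rng = np.random.default_rng(0)
psi = rng.uniform(-1.0, 1.0, size=(5000, 2))
true_theta = np.array([0.5, -0.3])
y = psi @ true_theta
theta_hat = lms_fourth(psi, y)
```

Because the update scales with the cube of the error, large errors produce very large steps and small errors very small ones, so the choice of `mu` and the scale of the data matter more here than for ordinary LMS.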
-
least_mean_squares_mixed_norm
(psi, y)[source]¶ Parameter estimation using the Mixed-norm LMS filter.
The weight factor controls the proportions of the error norms and offers an extra degree of freedom within the adaptation.
- Parameters
psi (ndarray of floats) – The information matrix of the model.
y_train (array-like of shape = y_training) – The data used to train the model.
- Returns
theta – The estimated parameters of the model.
- Return type
array-like of shape = number_of_model_elements
Notes
More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.
References
- [1] Chambers, J. A., Tanrikulu, O., & Constantinides, A. G. (1994).
Least mean mixed-norm adaptive filtering. Electronics letters, 30(19), 1574-1575. https://ieeexplore.ieee.org/document/326382
- [2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,
análise estatística e novas estratégias de algoritmos LMS de passo variável.
- [3] Wikipedia entry on Least Mean Squares
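One way to picture the weight factor is as a blend between the ordinary LMS update and the least-mean-fourth update: theta <- theta + mu * e * (weight + (1 - weight) * e**2) * psi_k. The sketch below follows that mixed-norm form from the adaptive filtering literature; `mixed_norm_lms`, `mu`, and `weight` are assumed names for this example and may not match the library's attributes.

```python
import numpy as np


def mixed_norm_lms(psi, y, mu=0.05, weight=0.5):
    """Textbook mixed-norm LMS sketch (hypothetical helper, not the library code)."""
    psi = np.asarray(psi, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    n_samples, n_terms = psi.shape
    theta = np.zeros(n_terms)
    for k in range(n_samples):
        error = y[k] - psi[k] @ theta  # a priori prediction error
        # weight=1 gives plain LMS; weight=0 gives the least-mean-fourth update.
        blend = weight + (1.0 - weight) * error**2
        theta = theta + mu * error * blend * psi[k]
    return theta


# Toy, noise-free data with bounded regressors (assumed for illustration only).
rng = np.random.default_rng(0)
psi = rng.uniform(-1.0, 1.0, size=(5000, 2))
true_theta = np.array([0.5, -0.3])
y = psi @ true_theta
theta_hat = mixed_norm_lms(psi, y)
```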
-
sysidentpy utils¶
Utilities for data validation
-
sysidentpy.utils._check_arrays.
check_infinity
(X, y)[source]¶ Check that X and y have no NaN or Inf samples.
If any sample is NaN or Inf, a ValueError is raised.
- Parameters
X (ndarray of floats) – The input data.
y (ndarray of floats) – The output data.
-
sysidentpy.utils._check_arrays.
check_nan
(X, y)[source]¶ Check that X and y have no NaN or Inf samples.
If any sample is NaN or Inf, a ValueError is raised.
- Parameters
X (ndarray of floats) – The input data.
y (ndarray of floats) – The output data.
-
sysidentpy.utils._check_arrays.
check_length
(X, y)[source]¶ Check that X and y have the same number of samples.
If the lengths of X and y differ, a ValueError is raised.
- Parameters
X (ndarray of floats) – The input data.
y (ndarray of floats) – The output data.
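The three checks above can be pictured with plain NumPy calls. The sketch below is a stand-alone illustration rather than the library's code: `validate_xy` and `raises_value_error` are hypothetical helpers, and the error messages are invented for the example.

```python
import numpy as np


def validate_xy(X, y):
    """Hypothetical helper combining NaN, Inf, and length checks."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if np.isnan(X).any() or np.isnan(y).any():
        raise ValueError("Input contains NaN samples.")
    if np.isinf(X).any() or np.isinf(y).any():
        raise ValueError("Input contains Inf samples.")
    if X.shape[0] != y.shape[0]:
        raise ValueError(f"X has {X.shape[0]} samples but y has {y.shape[0]}.")


def raises_value_error(X, y):
    """Return True when validate_xy rejects the given arrays."""
    try:
        validate_xy(X, y)
    except ValueError:
        return True
    return False


good = np.array([[1.0], [2.0], [3.0]])
ok = raises_value_error(good, good)                                  # valid data
bad_nan = raises_value_error(np.array([[1.0], [np.nan], [3.0]]), good)
bad_inf = raises_value_error(good, np.array([[1.0], [np.inf], [3.0]]))
bad_len = raises_value_error(good, good[:2])                         # 3 vs 2 samples
```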
sysidentpy generate data¶
Utilities for data generation
-
sysidentpy.utils.generate_data.
get_siso_data
(n=5000, colored_noise=False, sigma=0.05, train_percentage=90)[source]¶ Generate simulated single-input single-output (SISO) data for identification and validation.
- Parameters
n (int) – The number of samples.
colored_noise (bool) – Select white noise or colored noise (autoregressive noise).
sigma (float) – The standard deviation of the distribution used to generate the noise.
train_percentage (int) – The percentage of the data to be used as training data.
- Returns
x_train, x_valid (array-like) – The input data to be used in identification and validation, respectively.
y_train, y_valid (array-like) – The output data to be used in identification and validation, respectively.
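To show how the parameters above interact, here is a minimal stand-alone generator with the same argument names. Everything inside it is an assumption for illustration: the name `make_siso_data`, the `seed` argument, the simulated system equation, and the first-order autoregressive noise model are hypothetical and are not taken from the library.

```python
import numpy as np


def make_siso_data(n=5000, colored_noise=False, sigma=0.05,
                   train_percentage=90, seed=0):
    """Hypothetical SISO data generator (the system below is made up)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n)      # input signal
    nu = rng.normal(0.0, sigma, size=n)     # white noise with std sigma
    e = nu.copy()
    if colored_noise:
        # Assumed AR(1) colouring: e(k) = 0.8 e(k-1) + nu(k)
        for k in range(1, n):
            e[k] = 0.8 * e[k - 1] + nu[k]

    y = np.zeros(n)
    for k in range(2, n):
        # Illustrative nonlinear system, not the one used by the library.
        y[k] = 0.5 * y[k - 1] + 0.3 * x[k - 1] - 0.1 * x[k - 2] ** 2 + e[k]

    split = int(n * train_percentage / 100)  # index separating train and validation
    x_train, x_valid = x[:split].reshape(-1, 1), x[split:].reshape(-1, 1)
    y_train, y_valid = y[:split].reshape(-1, 1), y[split:].reshape(-1, 1)
    return x_train, x_valid, y_train, y_valid


x_train, x_valid, y_train, y_valid = make_siso_data(n=1000, colored_noise=True)
```

With `n=1000` and `train_percentage=90`, the first 900 samples go to the training arrays and the remaining 100 to the validation arrays, preserving time order.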
-
sysidentpy.utils.generate_data.
get_miso_data
(n=5000, colored_noise=False, sigma=0.05, train_percentage=90)[source]¶ Generate simulated multiple-input single-output (MISO) data for identification and validation.
- Parameters
n (int) – The number of samples.
colored_noise (bool) – Select white noise or colored noise (autoregressive noise).
sigma (float) – The standard deviation of the distribution used to generate the noise.
train_percentage (int) – The percentage of the data to be used as training data.
- Returns
x_train, x_valid (array-like) – The input data to be used in identification and validation, respectively.
y_train, y_valid (array-like) – The output data to be used in identification and validation, respectively.