Version 0.4.2

Changelog

Bug fixes

Version 0.4

October, 2018

Warning

Version 0.4 is the last version of imbalanced-learn to support Python 2.7 and Python 3.4. Imbalanced-learn 0.5 will require Python 3.5 or higher.

Highlights

This release brings a set of new features as well as some API changes that strengthen the foundation of imbalanced-learn.

As a new feature, two modules, imblearn.keras and imblearn.tensorflow, have been added; they allow imbalanced-learn samplers to be used to generate balanced mini-batches.

The module imblearn.ensemble has been consolidated with new classifiers: imblearn.ensemble.BalancedRandomForestClassifier, imblearn.ensemble.EasyEnsembleClassifier, and imblearn.ensemble.RUSBoostClassifier.

Support for string data has been added to imblearn.over_sampling.RandomOverSampler and imblearn.under_sampling.RandomUnderSampler. In addition, a new class, imblearn.over_sampling.SMOTENC, generates samples for data sets containing both continuous and categorical features.

imblearn.over_sampling.SMOTE has been simplified and broken down into two additional classes: imblearn.over_sampling.SVMSMOTE and imblearn.over_sampling.BorderlineSMOTE.

There are also some changes to the API: the parameter sampling_strategy has been introduced to replace the ratio parameter. In addition, the return_indices argument has been deprecated, and all samplers will expose a sample_indices_ attribute whenever this is possible.

Changelog

API

  • Replace the parameter ratio with sampling_strategy. #411 by Guillaume Lemaitre.
  • Allow a float to be used for sampling_strategy in binary classification. #411 by Guillaume Lemaitre.
  • Allow a list to be used with the cleaning methods to specify the classes to sample. #411 by Guillaume Lemaitre.
  • Replace fit_sample with fit_resample. An alias is still available for backward compatibility. In addition, sample has been removed to avoid resampling on a different set of data. #462 by Guillaume Lemaitre.

New features

Enhancement

Bug fixes

  • Fix bug in metrics.classification_report_imbalanced in which y_pred and y_true were swapped. #394 by Ole Silvig (@klizter).
  • Fix bug in ADASYN to consider only samples from the current class when generating new samples. #354 by Guillaume Lemaitre.
  • Fix bug so that the sampling_strategy dictionary is sorted, which gives deterministic results when using the same random state. #447 by Guillaume Lemaitre.
  • Force cloning of scikit-learn estimators passed as attributes to samplers. #446 by Guillaume Lemaitre.
  • Fix bug in which the dtype of X and y was not preserved when generating samples. #450 by Guillaume Lemaitre.
  • Add the option to pass a Memory object to make_pipeline, as in the pipeline.Pipeline class. #458 by Christos Aridas.

Maintenance

Documentation

Deprecation