classification Package
AdvancedSampler Module
-
class WORC.classification.AdvancedSampler.AdvancedSampler(param_distributions, n_iter, random_state=None, method='Halton')[source]
Bases: object
Generator on parameters sampled from given distributions using numerical sequences. Based on the sklearn ParameterSampler.
Non-deterministic iterable over random candidate combinations for hyper-parameter search. If all parameters are presented as a list, sampling without replacement is performed. If at least one parameter is given as a distribution, sampling with replacement is used. It is highly recommended to use continuous distributions for continuous parameters.
Note that before SciPy 0.16, the scipy.stats.distributions do not accept a custom RNG instance and always use the singleton RNG from numpy.random. Hence, setting random_state will not guarantee a deterministic iteration whenever scipy.stats distributions are used to define the parameter search space. Deterministic behavior is, however, guaranteed from SciPy 0.16 onwards.
Read more in the User Guide.
- Parameters:
- param_distributions : dict
Dictionary where the keys are parameters and values are distributions from which a parameter is to be sampled. Distributions either have to provide a rvs function to sample from them, or can be given as a list of values, where a uniform distribution is assumed.
- n_iter : integer
Number of parameter settings that are produced.
- random_state : int or RandomState
Pseudo random number generator state used for random uniform sampling from lists of possible values instead of scipy.stats distributions.
- Returns:
- params : dict of string to any
Yields dictionaries mapping each estimator parameter to a sampled value.
>>> from WORC.classification.AdvancedSampler import HaltonSampler
>>> from scipy.stats.distributions import expon
>>> import numpy as np
>>> np.random.seed(0)
>>> param_grid = {'a':[1, 2], 'b': expon()}
>>> param_list = list(HaltonSampler(param_grid, n_iter=4))
>>> rounded_list = [dict((k, round(v, 6)) for (k, v) in d.items())
...                 for d in param_list]
>>> rounded_list == [{'b': 0.89856, 'a': 1},
...                  {'b': 0.923223, 'a': 1},
...                  {'b': 1.878964, 'a': 2},
...                  {'b': 1.038159, 'a': 2}]
True
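The class description mentions sampling "using numerical sequences", and the default method is 'Halton'. The following is a minimal sketch of that idea, not the WORC implementation: it builds a Halton sequence in pure Python and maps each point to a parameter setting. The helper names (radical_inverse, halton_points, sample_parameters) and the choice of bases 2 and 3 are assumptions made for illustration; the exponential distribution is sampled through its inverse CDF instead of a scipy.stats object.

```python
import math


def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_points(n_points, bases=(2, 3)):
    """First n_points of a Halton sequence, one coordinate per base (index starts at 1)."""
    return [[radical_inverse(i, b) for b in bases]
            for i in range(1, n_points + 1)]


def sample_parameters(n_iter):
    """Map each Halton point in [0, 1)^2 to one parameter setting (hypothetical helper)."""
    values_a = [1, 2]   # parameter given as a list: pick an entry uniformly
    rate = 1.0          # parameter given as a distribution: exponential with rate 1
    samples = []
    for u_a, u_b in halton_points(n_iter):
        a = values_a[min(int(u_a * len(values_a)), len(values_a) - 1)]
        b = -math.log(1.0 - u_b) / rate  # inverse CDF of the exponential distribution
        samples.append({'a': a, 'b': b})
    return samples


for params in sample_parameters(4):
    print(params)
```

Because the sequence is deterministic, repeated calls give the same settings; a real sampler would typically randomize or offset the sequence.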
-
__init__(param_distributions, n_iter, random_state=None, method='Halton')[source]
Initialize self. See help(type(self)) for accurate signature.
-
class WORC.classification.AdvancedSampler.discrete_uniform(loc=-1, scale=0)[source]
Bases: object
-
-
class WORC.classification.AdvancedSampler.exp_uniform(loc=-1, scale=0, base=2.718281828459045)[source]
Bases: object
-
__init__(loc=-1, scale=0, base=2.718281828459045)[source]
Initialize self. See help(type(self)) for accurate signature.
-
-
class WORC.classification.AdvancedSampler.log_uniform(loc=-1, scale=0, base=10)[source]
Bases: object
-
__init__(loc=-1, scale=0, base=10)[source]
Initialize self. See help(type(self)) for accurate signature.
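The distribution classes above are undocumented, but the sampler requires that any distribution "provide a rvs function". The sketch below shows one way such an object could look. It is an assumption, not the WORC source: the class name LogUniformSketch, the seed argument, and the interpretation of loc, scale and base (a value base ** u with u uniform on [loc, loc + scale]) are all hypothetical, and the rvs signature is simplified compared with scipy.stats distributions.

```python
import random


class LogUniformSketch:
    """Hypothetical distribution returning base ** u, with u uniform on [loc, loc + scale]."""

    def __init__(self, loc=-1, scale=0, base=10, seed=None):
        self.loc = loc
        self.scale = scale
        self.base = base
        self._rng = random.Random(seed)  # private generator so draws are reproducible

    def rvs(self, size=None):
        """Draw one value (size=None) or a list of `size` values."""
        def draw():
            exponent = self.loc + self.scale * self._rng.random()
            return self.base ** exponent

        if size is None:
            return draw()
        return [draw() for _ in range(size)]


# Values spread evenly over orders of magnitude between 1e-2 and 1e1.
dist = LogUniformSketch(loc=-2, scale=3, base=10, seed=0)
print(dist.rvs(size=3))
```

Sampling the exponent uniformly is a common choice for parameters such as regularization strengths, where the order of magnitude matters more than the absolute value.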
-
RankedSVM Module
-
WORC.classification.RankedSVM.RankSVM_test(test_data, num_class, Weights, Bias, SVs, svm='Poly', gamma=0.05, coefficient=0.05, degree=3)[source]
-
WORC.classification.RankedSVM.RankSVM_test_original(test_data, test_target, Weights, Bias, SVs, svm='Poly', gamma=0.05, coefficient=0.05, degree=3)[source]
-
WORC.classification.RankedSVM.RankSVM_train(train_data, train_target, cost=1, lambda_tol=1e-06, norm_tol=0.0001, max_iter=500, svm='Poly', gamma=0.05, coefficient=0.05, degree=3)[source]
-
WORC.classification.RankedSVM.RankSVM_train_old(train_data, train_target, cost=1, lambda_tol=1e-06, norm_tol=0.0001, max_iter=500, svm='Poly', gamma=0.05, coefficient=0.05, degree=3)[source]
Weights, Bias, SVs = RankSVM_train(train_data, train_target, cost, lambda_tol, norm_tol, max_iter, svm, gamma, coefficient, degree)
Description
- RankSVM_train takes:
train_data - An MxN array; the ith training instance is stored in train_data[i,:].
train_target - A QxM array; if the ith training instance belongs to the jth class, then train_target[j,i] equals +1, otherwise train_target[j,i] equals -1.
svm - The type of SVM used in training, which can take the value ‘RBF’, ‘Poly’ or ‘Linear’. The corresponding parameters are:
if svm is ‘RBF’, then gamma gives the value of gamma, where the kernel is exp(-gamma*|x[i]-x[j]|^2);
if svm is ‘Poly’, then three values are used: gamma, coefficient, and degree, where the kernel is (gamma*<x[i],x[j]>+coefficient)^degree;
if svm is ‘Linear’, then no kernel parameters are used.
cost - The value of ‘C’ used in the SVM; default is 1.
lambda_tol - The tolerance value for lambda described in the appendix of [1]; default is 1e-6.
norm_tol - The tolerance value for the difference between alpha(p+1) and alpha(p) described in the appendix of [1]; default is 1e-4.
max_iter - The maximum number of iterations for RankSVM; default is 500.
- and returns:
Weights - The value for beta[ki] as described in the appendix of [1] is stored in Weights[k,i].
Bias - The value for b[i] as described in the appendix of [1] is stored in Bias[1,i].
SVs - The ith support vector is stored in SVs[:,i].
For more details, please refer to [1] and [2].
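The kernel formulas and the train_target layout described above can be sketched in a few lines of plain Python. This is an illustration only, not the code used by RankSVM_train: the function names kernel and encode_targets are hypothetical, and the ‘Linear’ case is assumed to be the plain inner product.

```python
import math


def dot(x, y):
    """Inner product <x, y> of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def kernel(x, y, svm='Poly', gamma=0.05, coefficient=0.05, degree=3):
    """Evaluate one of the three kernels named in the description."""
    if svm == 'RBF':
        squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
        return math.exp(-gamma * squared_distance)
    if svm == 'Poly':
        return (gamma * dot(x, y) + coefficient) ** degree
    if svm == 'Linear':
        return dot(x, y)  # assumed: linear kernel without extra parameters
    raise ValueError("svm must be 'RBF', 'Poly' or 'Linear'")


def encode_targets(label_sets, num_class):
    """Build the QxM target array: entry [j][i] is +1 if instance i has class j, else -1."""
    return [[1 if j in labels else -1 for labels in label_sets]
            for j in range(num_class)]


x_i, x_j = [1.0, 2.0], [3.0, 4.0]
print(kernel(x_i, x_j, svm='RBF', gamma=0.05))
print(kernel(x_i, x_j, svm='Poly'))
print(encode_targets([{0}, {1, 2}], num_class=3))
```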
SearchCV Module
construct_classifier Module
-
WORC.classification.construct_classifier.construct_SVM(config, regression=False)[source]
Constructs an SVM classifier.
- Args:
config (dict): Dictionary of the required config settings.
features (pandas dataframe): A pandas dataframe containing the features to be used for classification.
- Returns:
SVM/SVR classifier, parameter grid
-
WORC.classification.construct_classifier.construct_classifier(config)[source]
Interface to create a classifier.
Different classifiers can be created using this common interface.
- config: dict, mandatory
Contains the required config settings. See the GitHub Wiki for all available fields.
- Returns:
Constructed classifier
crossval Module
estimators Module
-
class WORC.classification.estimators.RankedSVM(cost=1, lambda_tol=1e-06, norm_tol=0.0001, max_iter=500, svm='Poly', gamma=0.05, coefficient=0.05, degree=3)[source]
Bases: sklearn.base.BaseEstimator, sklearn.base.ClassifierMixin
An example classifier which implements a 1-NN algorithm.
- demo_param : str, optional
A parameter used for demonstration of how to pass and store parameters.
- X_ : array, shape = [n_samples, n_features]
The input passed during fit()
- y_ : array, shape = [n_samples]
The labels passed during fit()
-
__init__(cost=1, lambda_tol=1e-06, norm_tol=0.0001, max_iter=500, svm='Poly', gamma=0.05, coefficient=0.05, degree=3)[source]
Initialize self. See help(type(self)) for accurate signature.
-
fit(X, y)[source]
A reference implementation of a fitting function for a classifier.
- X : array-like, shape = [n_samples, n_features]
The training input samples.
- y : array-like, shape = [n_samples]
The target values. An array of int.
- self : object
Returns self.
fitandscore Module
metrics Module
-
WORC.classification.metrics.ICC(M, ICCtype='inter')[source]
- Input:
M: matrix of observations. Rows: patients, columns: observers.
ICCtype: ICC type, currently “inter” or “intra”.
-
WORC.classification.metrics.ICC_anova(Y, ICCtype='inter', more=False)[source]
Adopted from Nipype with a slight alteration to distinguish inter and intra. The data Y are entered as a ‘table’, i.e. subjects are in rows and repeated measures in columns.
One Sample Repeated measure ANOVA: Y = XB + E with X = [Factor / Subjects]
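As a rough illustration of an ANOVA-based ICC on a subjects-by-observers table, the sketch below computes the two standard Shrout and Fleiss single-measure coefficients, ICC(3,1) (consistency) and ICC(2,1) (absolute agreement), from a two-way ANOVA without replication. It is not the WORC or Nipype code, and how the ICCtype values “inter” and “intra” map onto these coefficients is not stated in this documentation, so no such mapping is assumed here. All function names are hypothetical.

```python
def two_way_anova(table):
    """Mean squares for a table with subjects in rows and repeated measures in columns."""
    n = len(table)          # number of subjects
    k = len(table[0])       # number of observers / repeated measures
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(table[i][j] for i in range(n)) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return n, k, ms_rows, ms_cols, ms_error


def icc_consistency(table):
    """ICC(3,1): ignores systematic offsets between observers."""
    n, k, msr, msc, mse = two_way_anova(table)
    return (msr - mse) / (msr + (k - 1) * mse)


def icc_agreement(table):
    """ICC(2,1): penalizes systematic offsets between observers."""
    n, k, msr, msc, mse = two_way_anova(table)
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Two observers who rank subjects identically, but the second scores one point higher.
scores = [[1, 2], [2, 3], [3, 4]]
print(icc_consistency(scores))
print(icc_agreement(scores))
```

In this example the consistency coefficient is perfect while the agreement coefficient is lower, because the constant offset between the observers counts as disagreement only in the second definition.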
-
WORC.classification.metrics.check_scoring(estimator, scoring=None, allow_none=False)[source]
Surrogate for sklearn’s check_scoring to enable use of some other scoring metrics.
-
WORC.classification.metrics.performance_multilabel(y_truth, y_prediction, y_score=None, beta=1)[source]
Multiclass performance metrics.
y_truth and y_prediction should both be lists with the multiclass label of each object, e.g.
y_truth = [0, 0, 0, 0, 0, 0, 2, 2, 1, 1, 2] ### Ground truth
y_prediction = [0, 0, 0, 0, 0, 0, 1, 2, 1, 2, 2] ### Predicted labels
Calculation of accuracy according to the formula suggested in the CAD Dementia Grand Challenge: http://caddementia.grand-challenge.org
Calculation of multiclass AUC according to classpy: https://bitbucket.org/bigr_erasmusmc/classpy/src/master/classpy/multi_class_auc.py
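For the example lists above, a multiclass accuracy can be worked out from the confusion matrix. The sketch below assumes the accuracy is the number of correctly classified objects (the diagonal of the confusion matrix) divided by the total number of objects; whether performance_multilabel uses exactly this formula is not confirmed here. The function names are hypothetical.

```python
def confusion_matrix(y_truth, y_prediction):
    """Count matrix with true labels in rows and predicted labels in columns."""
    labels = sorted(set(y_truth) | set(y_prediction))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, prediction in zip(y_truth, y_prediction):
        matrix[index[truth]][index[prediction]] += 1
    return labels, matrix


def multiclass_accuracy(y_truth, y_prediction):
    """Fraction of objects on the diagonal of the confusion matrix."""
    _, matrix = confusion_matrix(y_truth, y_prediction)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / len(y_truth)


y_truth = [0, 0, 0, 0, 0, 0, 2, 2, 1, 1, 2]
y_prediction = [0, 0, 0, 0, 0, 0, 1, 2, 1, 2, 2]

labels, matrix = confusion_matrix(y_truth, y_prediction)
for label, row in zip(labels, matrix):
    print(label, row)
print(multiclass_accuracy(y_truth, y_prediction))  # 9 of 11 objects are correct
```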