funcnet.coupling_analysis
Provides classes for analyzing spatially embedded complex networks, handling multivariate data. Written by Jakob Runge.
-
class pyunicorn.funcnet.coupling_analysis.CouplingAnalysis(data, silence_level=0)
Bases: object
Contains methods to calculate coupling matrices from large arrays of scalar time series. Comprises linear and information-theoretic measures, lagged and directed couplings.
-
__init__(data, silence_level=0)
Initialize an instance of CouplingAnalysis from a data array.
Parameters: - data (multidimensional numpy array) – The time series array with time in first dimension.
- silence_level (int >= 0) – The higher, the less progress info is output.
-
__weakref__
List of weak references to the object (if defined).
-
static _get_nearest_neighbors(array, xyz, k, standardize=True)
Returns nearest neighbors for the conditional mutual information estimator.
Reference: [Kraskov2004]
Parameters: - array (array (float)) – data array.
- xyz (array [int(0|1|2)]) – identifier of X, Y, Z in CMI
- k (int [int>=1]) – nearest-neighbor MI estimation parameter.
- standardize (bool) – standardize array before estimation. (default: True)
Return type: tuple of arrays
Returns: nearest neighbors for each sample point.
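The nearest-neighbor estimator of [Kraskov2004] (version 1) counts, for every sample, how many other samples lie closer in each marginal space than the k-th nearest neighbor lies in the joint space, using the maximum norm. The following is a minimal brute-force sketch of that counting step for two scalar series only. The function name `knn_neighbor_counts_sketch` is hypothetical, the conditioning set Z and the `xyz` identifier are omitted, and the library's own implementation may differ.

```python
import numpy as np

def knn_neighbor_counts_sketch(x, y, k, standardize=True):
    """Brute-force neighbor counts for a k-NN MI estimate of two 1-D series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if standardize:
        x = (x - x.mean()) / x.std()
        y = (y - y.mean()) / y.std()
    # Pairwise distances in each marginal space and, via the maximum norm,
    # in the joint (x, y) space.
    dx = np.abs(x[:, None] - x[None, :])
    dy = np.abs(y[:, None] - y[None, :])
    dxy = np.maximum(dx, dy)
    np.fill_diagonal(dxy, np.inf)  # a point is not its own neighbor
    # Distance from each point to its k-th nearest neighbor in the joint space.
    eps = np.sort(dxy, axis=1)[:, k - 1]
    # Number of other points strictly closer than eps in each marginal space.
    n_x = (dx < eps[:, None]).sum(axis=1) - 1
    n_y = (dy < eps[:, None]).sum(axis=1) - 1
    return n_x, n_y

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = x + rng.normal(size=200)
n_x, n_y = knn_neighbor_counts_sketch(x, y, k=5)
```

The brute-force distance matrices need O(T^2) memory, so this sketch is only suitable for short series.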
-
static _par_corr_to_cmi(par_corr)
Transformation of partial correlation to conditional mutual information scale using the (multivariate) Gaussian assumption.
Parameters: par_corr (float or array) – partial correlation
Return type: float
Returns: transformed partial correlation.
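For jointly Gaussian variables, a partial correlation rho corresponds to a conditional mutual information of -0.5 * ln(1 - rho^2). A minimal sketch of this mapping follows. The name `par_corr_to_cmi_sketch` is hypothetical and the natural logarithm (result in nats) is an assumption, so the library's implementation may differ in detail.

```python
import numpy as np

def par_corr_to_cmi_sketch(par_corr):
    """Map partial correlation(s) to the CMI scale, assuming Gaussian variables.

    Uses I = -0.5 * ln(1 - rho**2), so the result is in nats.
    """
    par_corr = np.asarray(par_corr, dtype=float)
    return -0.5 * np.log(1.0 - par_corr ** 2)

cmi_values = par_corr_to_cmi_sketch([0.0, 0.5, -0.5])
print(cmi_values)
```

The transformation is symmetric in the sign of the partial correlation and grows without bound as |rho| approaches 1.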
-
static _quantile_bin_array(array, bins=6)
Returns symbolized array with equi-quantile binning.
This partition results in a uniform distribution of the marginals.
Parameters: - array (array) – data
- bins (int) – number of bins
Return type: array
Returns: converted data
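Equi-quantile binning can be illustrated by ranking the samples of each series and mapping the ranks onto equally populated bins. This is a minimal sketch assuming a (time, node) array without tied values. The name `quantile_bin_sketch` is hypothetical and the library may compute its bin edges differently.

```python
import numpy as np

def quantile_bin_sketch(array, bins=6):
    """Symbolize each column of a (T, N) array into `bins` equally populated bins."""
    array = np.asarray(array)
    T = array.shape[0]
    # Rank of every sample within its own column (0 = smallest value).
    ranks = np.argsort(np.argsort(array, axis=0), axis=0)
    # Map ranks 0..T-1 onto symbols 0..bins-1.
    return (ranks * bins) // T

rng = np.random.default_rng(0)
data = rng.normal(size=(600, 2))
symbols = quantile_bin_sketch(data, bins=6)
counts = np.bincount(symbols[:, 0])
print(counts)
```

Because every symbol occurs equally often, the marginal distribution of the symbolized series is uniform, as stated above.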
-
static bincount_hist(symb_array)
Computes histogram from symbolic array.
Parameters: symb_array (array of integers) – symbolic data
Return type: array
Returns: (unnormalized) histogram
-
static create_plogp(T)
Precalculation of p*log(p) needed for entropies.
Parameters: T (int) – sample length
Return type: array
Returns: p*log(p) array from p=1 to p=T
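A precomputed table of n*log(n) lets entropies be evaluated directly from an unnormalized histogram of counts. The sketch below shows one way this could work. The name `create_plogp_sketch`, the extra entry at index 0, and the natural logarithm are assumptions made for illustration, not the library's confirmed layout.

```python
import numpy as np

def create_plogp_sketch(T):
    """Lookup table with table[n] = n * ln(n) for n = 0..T (table[0] = 0)."""
    n = np.arange(T + 1, dtype=float)
    table = np.zeros(T + 1)
    table[1:] = n[1:] * np.log(n[1:])
    return table

symb = np.array([0, 0, 1, 1, 2, 2, 2, 2])
T = symb.size
hist = np.bincount(symb)  # unnormalized counts: [2, 2, 4]
plogp = create_plogp_sketch(T)
# Shannon entropy in nats: H = ln(T) - (1/T) * sum_i n_i * ln(n_i)
entropy = np.log(T) - plogp[hist].sum() / T
print(hist, entropy)
```

Indexing the table with the histogram replaces one logarithm per bin by an array lookup, which pays off when many entropies are computed for series of the same length.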
-
cross_correlation(tau_max=0, lag_mode='max')
Return cross correlation between all pairs of nodes.
Two lag-modes are available (default: lag_mode=‘max’):
lag_mode = ‘all’: Return 3-dimensional array of lagged cross correlations between all pairs of nodes. An entry \((i, j, \tau)\) corresponds to \(\rho(X^i_{t-\tau}, X^j_t)\) for positive lags tau, i.e., the direction i --> j for \(\tau > 0\).
lag_mode = ‘max’: Return matrices of absolute maxima and corresponding lags of the lagged cross correlation (CC) between all pairs of nodes. In each of the two matrices, an entry \((i, j)\) corresponds to the (positive or negative) value and lag, respectively, at the absolute maximum of \(\rho(X^i_{t-\tau}, X^j_t)\) for positive lags tau, i.e., the direction i --> j for \(\tau > 0\). The matrices are therefore usually asymmetric. The function symmetrize_by_absmax() can be used to obtain a symmetric matrix.
Example:
>>> coup_ana = CouplingAnalysis(CouplingAnalysis.test_data())
>>> similarity_matrix, lag_matrix = coup_ana.cross_correlation(
...     tau_max=5, lag_mode='max')
>>> r((similarity_matrix, lag_matrix))
(array([[ 1.    ,  0.757 ,  0.779 ,  0.7536],
        [ 0.4847,  1.    ,  0.4502,  0.5197],
        [ 0.6219,  0.5844,  1.    ,  0.5992],
        [ 0.4827,  0.5509,  0.4996,  1.    ]]),
 array([[0, 4, 1, 2],
        [0, 0, 0, 0],
        [0, 3, 0, 1],
        [0, 2, 0, 0]]))
Parameters: - tau_max (int [int>=0]) – maximum lag of cross correlation lag function.
- lag_mode (str [(‘max’|’all’)]) – lag-mode of cross correlations to return.
Return type: 3D-array or tuple of matrices
Returns: all-lag array or matrices of value and lag at the absolute maximum.
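The relation between the two lag-modes can be sketched with plain NumPy: first build the full lag array, then pick the entry with the largest absolute value along the lag axis. The function names below are hypothetical and the code is an illustration under the index convention described above, not the library's implementation.

```python
import numpy as np

def lagged_cross_correlation_sketch(data, tau_max=0):
    """Return cc[i, j, tau] = corr(X^i_{t - tau}, X^j_t) for tau = 0..tau_max."""
    data = np.asarray(data, dtype=float)
    T, N = data.shape
    cc = np.empty((N, N, tau_max + 1))
    for tau in range(tau_max + 1):
        past = data[: T - tau]   # samples X_{t - tau}
        present = data[tau:]     # samples X_t
        # Standardize both windows, then average the products of z-scores.
        zp = (past - past.mean(axis=0)) / past.std(axis=0)
        zc = (present - present.mean(axis=0)) / present.std(axis=0)
        cc[:, :, tau] = zp.T @ zc / (T - tau)
    return cc

def max_over_lags(cc):
    """Value and lag at the absolute maximum along the lag axis."""
    lag = np.abs(cc).argmax(axis=2)
    value = np.take_along_axis(cc, lag[:, :, None], axis=2)[:, :, 0]
    return value, lag

rng = np.random.default_rng(0)
x0 = rng.normal(size=500)
x1 = np.roll(x0, 2) + 0.1 * rng.normal(size=500)  # x1 follows x0 by 2 steps
cc = lagged_cross_correlation_sketch(np.column_stack([x0, x1]), tau_max=5)
value, lag = max_over_lags(cc)
print(value[0, 1], lag[0, 1])
```

In this synthetic example the entry (0, 1) peaks at lag 2, reflecting the direction 0 --> 1.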
-
information_transfer(tau_max=0, estimator='knn', knn=10, past=1, cond_mode='ity', lag_mode='max')
Return bivariate information transfer between all pairs of nodes.
Two condition modes of information transfer are available as described in [Runge2012b].
- Information transfer to Y (ITY):
- \[I(X^i_{t-\tau}, X^j_t | X^j_{t-1}, \ldots, X^j_{t-\mathrm{past}})\]
- Momentary information transfer (MIT):
- \[I(X^i_{t-\tau}, X^j_t | X^j_{t-1}, \ldots, X^j_{t-\mathrm{past}}, X^i_{t-\tau-1}, \ldots, X^i_{t-\tau-\mathrm{past}})\]
Two estimators are available:
estimator = ‘knn’ (Recommended): Based on k-nearest-neighbors [Kraskov2004], version 1 in their paper. A larger k gives smaller variance but larger (typically negative) bias, and vice versa.
estimator = ‘gauss’: Captures only linear part of association. Essentially estimates a transformed partial correlation.
Two lag-modes are available (default: lag_mode=‘max’):
lag_mode = ‘all’: Return 3-dimensional array of lag-functions between all pairs of nodes. An entry \((i, j, \tau)\) corresponds to \(I(X^i_{t-\tau}, X^j_t | ...)\) for positive lags tau, i.e., the direction i --> j for \(\tau > 0\).
lag_mode = ‘max’: Return matrices of absolute maxima and corresponding lags of the lag-functions between all pairs of nodes. In each of the two matrices, an entry \((i, j)\) corresponds to the value and lag, respectively, at the absolute maximum of \(I(X^i_{t-\tau}, X^j_t | ...)\) for positive lags tau, i.e., the direction i --> j for \(\tau > 0\). The matrices are therefore usually asymmetric. The function symmetrize_by_absmax() can be used to obtain a symmetric matrix.
Example:
>>> coup_ana = CouplingAnalysis(CouplingAnalysis.test_data())
>>> similarity_matrix, lag_matrix = coup_ana.information_transfer(
...     tau_max=5, estimator='knn', knn=10)
>>> r((similarity_matrix, lag_matrix))
(array([[ 0.    ,  0.1544,  0.3261,  0.3047],
        [ 0.0218,  0.    ,  0.0394,  0.0976],
        [ 0.0134,  0.0663,  0.    ,  0.1502],
        [ 0.0066,  0.0694,  0.0401,  0.    ]]),
 array([[0, 2, 1, 2],
        [5, 0, 0, 0],
        [5, 1, 0, 1],
        [5, 0, 0, 0]]))
Parameters: - tau_max (int [int>=0]) – maximum lag of ITY lag function.
- past (int [int>=1]) – maximum lag of past history.
- knn (int [int>=1]) – nearest-neighbor ITY estimation parameter. (default: 10)
- bins (int [int>=2]) – binning ITY estimation parameter. (default: 6)
- estimator (str [(‘knn’|’gauss’)]) – ITY estimator. (default: ‘knn’)
- cond_mode (str [(‘ity’|’mit’)]) – condition mode. (default: ‘ity’)
- lag_mode (str [(‘max’|’all’)]) – lag-mode of ITY to return.
Return type: 3D-array or tuple of matrices
Returns: all-lag array or matrices of value and lag at the absolute maximum.
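With the ‘gauss’ estimator, ITY reduces to a transformed partial correlation between the lagged source and the target, conditional on the target's own past. The following sketch illustrates that idea for a single pair of series using least-squares residuals. The name `gauss_ity_sketch`, the residual-based partial correlation, and the natural logarithm are assumptions for illustration and not the library's implementation.

```python
import numpy as np

def gauss_ity_sketch(x, y, tau, past=1):
    """Gaussian estimate of I(X_{t-tau}, Y_t | Y_{t-1}, ..., Y_{t-past}) in nats."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    start = max(tau, past)  # first t for which all lagged terms exist
    t = np.arange(start, len(y))
    source = x[t - tau]
    target = y[t]
    # Conditioning set: intercept plus the target's own past.
    cond = np.column_stack([np.ones(len(t))] + [y[t - p] for p in range(1, past + 1)])
    # Residuals after regressing the conditions out of source and target.
    res_s = source - cond @ np.linalg.lstsq(cond, source, rcond=None)[0]
    res_t = target - cond @ np.linalg.lstsq(cond, target, rcond=None)[0]
    par_corr = np.corrcoef(res_s, res_t)[0, 1]
    return -0.5 * np.log(1.0 - par_corr ** 2)

# Synthetic pair in which x drives y with a lag of one step.
rng = np.random.default_rng(1)
T = 2000
x = rng.normal(size=T)
noise = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + noise[t]

ity_xy = gauss_ity_sketch(x, y, tau=1)
ity_yx = gauss_ity_sketch(y, x, tau=1)
print(ity_xy, ity_yx)
```

In this example the estimate for the direction x --> y is clearly larger than for y --> x, because conditioning on the target's past removes the autocorrelation of y but not the driving by x.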
-
mutual_information(tau_max=0, estimator='knn', knn=10, bins=6, lag_mode='max')
Return mutual information (MI) between all pairs of nodes.
Three estimators are available:
estimator = ‘knn’ (Recommended): Based on k-nearest-neighbors [Kraskov2004], version 1 in their paper. A larger k gives smaller variance but larger (typically negative) bias, and vice versa.
estimator = ‘binning’: Binning estimator based on equal-quantile binning.
estimator = ‘gauss’: Captures only linear part of association. Essentially estimates a transformed partial correlation.
Two lag-modes are available (default: lag_mode=‘max’):
lag_mode = ‘all’: Return 3-dimensional array of lagged MI between all pairs of nodes. An entry \((i, j, \tau)\) corresponds to \(I(X^i_{t-\tau}, X^j_t)\) for positive lags tau, i.e., the direction i --> j for \(\tau > 0\).
lag_mode = ‘max’: Return matrices of absolute maxima and corresponding lags of the lagged MI between all pairs of nodes. In each of the two matrices, an entry \((i, j)\) corresponds to the value and lag, respectively, at the absolute maximum of \(I(X^i_{t-\tau}, X^j_t)\) for positive lags tau, i.e., the direction i --> j for \(\tau > 0\). The matrices are therefore usually asymmetric. The function symmetrize_by_absmax() can be used to obtain a symmetric matrix.
Reference: [Kraskov2004]
Example:
>>> coup_ana = CouplingAnalysis(CouplingAnalysis.test_data())
>>> similarity_matrix, lag_matrix = coup_ana.mutual_information(
...     tau_max=5, knn=10, estimator='knn')
>>> similarity_matrix, lag_matrix
(array([[ 4.65048742,  0.43874303,  0.46520019,  0.41257444],
        [ 0.14704162,  4.65048742,  0.10645443,  0.16393046],
        [ 0.24829103,  0.2125767 ,  4.65048742,  0.22044939],
        [ 0.12093173,  0.19902836,  0.14530452,  4.65048742]], dtype=float32),
 array([[0, 4, 1, 2],
        [0, 0, 0, 0],
        [0, 2, 0, 1],
        [0, 2, 0, 0]], dtype=int8))
Parameters: - tau_max (int [int>=0]) – maximum lag of MI lag function.
- knn (int [int>=1]) – nearest-neighbor MI estimation parameter. (default: 10)
- bins (int [int>=2]) – binning MI estimation parameter. (default: 6)
- estimator (str [(‘knn’|’binning’|’gauss’)]) – MI estimator. (default: ‘knn’)
- lag_mode (str [(‘max’|’all’)]) – lag-mode of MI to return.
Return type: 3D-array or tuple of matrices
Returns: all-lag array or matrices of value and lag at the absolute maximum.
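The idea behind the ‘binning’ estimator can be illustrated with a plug-in estimate on equal-quantile symbols, using MI = H(X) + H(Y) - H(X, Y). This is a minimal sketch for two series at lag zero. The name `binned_mi_sketch`, the rank-based symbolization, and the natural logarithm are assumptions for illustration, and the library's estimator may differ.

```python
import numpy as np

def binned_mi_sketch(x, y, bins=6):
    """Plug-in MI estimate (nats) between two 1-D series using equal-quantile bins."""
    def symbolize(v):
        ranks = np.argsort(np.argsort(v))
        return (ranks * bins) // len(v)

    def entropy(symbols, n_states):
        counts = np.bincount(symbols, minlength=n_states)
        p = counts[counts > 0] / len(symbols)
        return -np.sum(p * np.log(p))

    sx, sy = symbolize(np.asarray(x)), symbolize(np.asarray(y))
    joint = sx * bins + sy  # unique symbol for every (sx, sy) pair
    return entropy(sx, bins) + entropy(sy, bins) - entropy(joint, bins * bins)

rng = np.random.default_rng(0)
x = rng.normal(size=6000)
y_dep = x + 0.5 * rng.normal(size=6000)
y_ind = rng.normal(size=6000)
mi_dep = binned_mi_sketch(x, y_dep)
mi_ind = binned_mi_sketch(x, y_ind)
print(mi_dep, mi_ind)
```

With equally populated bins, the MI of a series with itself equals log(bins), which bounds the values this estimator can return.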
-
silence_level = None
(int >= 0) The higher, the less progress info is output.
-
symmetrize_by_absmax(similarity_matrix, lag_matrix)
Returns symmetrized similarity matrix.
For each pair of entries (i,j) and (j,i), keeps the value with the larger absolute value and returns the matrices of measures and lags, which are changed in place. A negative lag for an entry (i,j) in the lag_matrix then indicates a ‘direction’ j --> i regarding the peak of the lag function, and vice versa for a positive lag.
Example:
>>> coup_ana = CouplingAnalysis(CouplingAnalysis.test_data())
>>> similarity_matrix, lag_matrix = coup_ana.cross_correlation(
...     tau_max=2)
>>> r((similarity_matrix, lag_matrix))
(array([[ 1.    ,  0.698 ,  0.7788,  0.7535],
        [ 0.4848,  1.    ,  0.4507,  0.52  ],
        [ 0.6219,  0.5704,  1.    ,  0.5996],
        [ 0.4833,  0.5503,  0.5002,  1.    ]]),
 array([[0, 2, 1, 2],
        [0, 0, 0, 0],
        [0, 2, 0, 1],
        [0, 2, 0, 0]]))
>>> r(coup_ana.symmetrize_by_absmax(similarity_matrix, lag_matrix))
(array([[ 1.    ,  0.698 ,  0.7788,  0.7535],
        [ 0.698 ,  1.    ,  0.5704,  0.5503],
        [ 0.7788,  0.5704,  1.    ,  0.5996],
        [ 0.7535,  0.5503,  0.5996,  1.    ]]),
 array([[ 0,  2,  1,  2],
        [-2,  0, -2, -2],
        [-1,  2,  0,  1],
        [-2,  2, -1,  0]]))
Parameters: - similarity_matrix (array-like [float]) – array-like [node, node] matrix of similarity estimates
- lag_matrix (array-like [int>=0]) – array-like [node, node] matrix of lags
Return type: tuple of arrays
Returns: the value at the absolute maximum and the (pos or neg) lag.
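The symmetrization rule can be sketched in a few lines of NumPy: wherever the transposed entry has the larger absolute value, take that value and the negated transposed lag. The function name `symmetrize_by_absmax_sketch` is hypothetical, and unlike the method documented above this sketch returns new arrays instead of changing its inputs in place.

```python
import numpy as np

def symmetrize_by_absmax_sketch(similarity_matrix, lag_matrix):
    """Symmetrize by keeping, per pair (i, j)/(j, i), the larger absolute value."""
    sim = np.array(similarity_matrix, dtype=float)
    lag = np.array(lag_matrix, dtype=int)
    # True where the transposed entry (j, i) has the larger absolute value.
    flip = np.abs(sim.T) > np.abs(sim)
    sym_sim = np.where(flip, sim.T, sim)
    # A lag taken from the opposite direction gets a negative sign.
    sym_lag = np.where(flip, -lag.T, lag)
    return sym_sim, sym_lag

# Input values copied from the cross_correlation(tau_max=2) example above.
sim = [[1.0, 0.698, 0.7788, 0.7535],
       [0.4848, 1.0, 0.4507, 0.52],
       [0.6219, 0.5704, 1.0, 0.5996],
       [0.4833, 0.5503, 0.5002, 1.0]]
lag = [[0, 2, 1, 2],
       [0, 0, 0, 0],
       [0, 2, 0, 1],
       [0, 2, 0, 0]]
sym_sim, sym_lag = symmetrize_by_absmax_sketch(sim, lag)
print(sym_sim)
print(sym_lag)
```

Applied to the matrices of the documented example, the sketch yields the symmetrized values and signed lags listed in that example's output.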
-