neuroharmony.ks_test_grid
- neuroharmony.ks_test_grid(df, features, sampling_variable='scanner')
Calculate the Kolmogorov-Smirnov p-value of each feature for all pairs of scanners (or of the groups defined by sampling_variable).
- Parameters
- df: NDFrame of shape [n_subjects, n_features]
DataFrame with the subjects data.
- features: list
List of the features to be considered in the Kolmogorov-Smirnov test.
- sampling_variable: str, default='scanner'
Column of df by which subjects are grouped.
- Returns
- KS_by_variable: dict of NDFrames
Kolmogorov-Smirnov p-values for all pairs of instances in the sampling_variable column. The keys in the dictionary are the variables in 'features'. The value of each entry is a square NDFrame of shape [n_vars, n_vars], where n_vars is the number of distinct instances in the sampling_variable column.
- Raises
- ValueError:
If the list of variables contains any variable that is not present in df.
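The behaviour described above can be approximated with pandas and SciPy. The following is a minimal sketch, not neuroharmony's actual implementation: the function name `ks_grid_sketch`, the toy data, and the choice to fill only the lower triangle (as in the example output below) are assumptions made for illustration.

```python
from itertools import combinations

import numpy as np
import pandas as pd
from scipy.stats import ks_2samp


def ks_grid_sketch(df, features, sampling_variable="scanner"):
    """Hypothetical stand-in for ks_test_grid, for illustration only."""
    # Mirror the documented ValueError for features that are not in df.
    missing = [f for f in features if f not in df.columns]
    if missing:
        raise ValueError(f"Features not found in df: {missing}")
    groups = sorted(df[sampling_variable].unique())
    grids = {}
    for feature in features:
        # One square grid per feature, initialised to NaN.
        grid = pd.DataFrame(np.nan, index=groups, columns=groups)
        for a, b in combinations(groups, 2):
            sample_a = df.loc[df[sampling_variable] == a, feature]
            sample_b = df.loc[df[sampling_variable] == b, feature]
            # Fill only the lower triangle (row b, column a).
            grid.loc[b, a] = ks_2samp(sample_a, sample_b).pvalue
        grids[feature] = grid
    return grids


# Toy data: scanners A and B share a distribution, C is shifted.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "scanner": ["A"] * 50 + ["B"] * 50 + ["C"] * 50,
    "volume": np.concatenate([
        rng.normal(0, 1, 50), rng.normal(0, 1, 50), rng.normal(3, 1, 50),
    ]),
})
grids = ks_grid_sketch(toy, ["volume"])
```

A small p-value in a cell suggests that the two groups' distributions of that feature differ, which is the kind of scanner effect the grid is meant to expose.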
Examples
>>> ixi = DataSet('data/raw/IXI').data
>>> features = ['Left-Lateral-Ventricle', 'Left-Inf-Lat-Vent']
>>> KS = ks_test_grid(ixi, features, 'scanner')
>>> KS[features[0]]
+---------------------+---------------------+---------------------+---------------------+
|                     | SCANNER01-SCANNER01 | SCANNER02-SCANNER01 | SCANNER03-SCANNER01 |
+=====================+=====================+=====================+=====================+
| SCANNER01-SCANNER01 | NaN                 | NaN                 | NaN                 |
+---------------------+---------------------+---------------------+---------------------+
| SCANNER02-SCANNER01 | 0.000759473         | NaN                 | NaN                 |
+---------------------+---------------------+---------------------+---------------------+
| SCANNER03-SCANNER01 | 0.0539998           | 0.625887            | NaN                 |
+---------------------+---------------------+---------------------+---------------------+
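One way to read such a grid is to list the scanner pairs whose p-value falls below a chosen threshold. The sketch below rebuilds the table above by hand so that it runs without the IXI data; the threshold `alpha = 0.05` is an assumption, not something the function prescribes.

```python
import numpy as np
import pandas as pd

# The grid from the example above, rebuilt by hand.
labels = ["SCANNER01-SCANNER01", "SCANNER02-SCANNER01", "SCANNER03-SCANNER01"]
grid = pd.DataFrame(
    [
        [np.nan, np.nan, np.nan],
        [0.000759473, np.nan, np.nan],
        [0.0539998, 0.625887, np.nan],
    ],
    index=labels,
    columns=labels,
)

alpha = 0.05  # assumed significance threshold
# Flatten to one p-value per (row, column) pair and discard the empty cells.
pvalues = grid.stack().dropna()
significant = pvalues[pvalues < alpha]
print(significant)
```

With these values only the SCANNER02-SCANNER01 versus SCANNER01-SCANNER01 pair falls below the threshold; the 0.0539998 entry narrowly misses it.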