neuroharmony.Neuroharmony

class neuroharmony.Neuroharmony(features, regression_features, covariates, eliminate_variance, estimator=RandomForestRegressor(), scaler=StandardScaler(), decomposition=PCA(), model_strategy='single', param_distributions={'RandomForestRegressor__criterion': ['mse', 'mae'], 'RandomForestRegressor__n_estimators': [100, 200, 500], 'RandomForestRegressor__warm_start': [False, True]}, estimator_args={'criterion': 'mae', 'n_jobs': 1, 'random_state': 42, 'verbose': False}, scaler_args={}, randomized_search_args={}, pipeline_args={})

Harmonization tool to mitigate scanner bias.

Parameters
features: list

Target features to be harmonized, for example, ROIs.

regression_features: list

Features used to derive harmonization rules, for example, IQMs.

covariates: list

Variables for which we want to eliminate the bias, for example, age, sex, and scanner.

estimator: sklearn estimator, default=RandomForestRegressor()

Model used for the harmonization regression.

scaler: sklearn scaler, default=StandardScaler()

Scaler used as the first step of the harmonization regression.

model_strategy: {"single", "full"}, default="single"

If "single", one model is trained for each feature in features. If "full", a single model regresses all the features at once.

param_distributions: dict, default=dict(RandomForestRegressor__n_estimators=[100, 200, 500], RandomForestRegressor__warm_start=[False, True])

Distribution of parameters to be tested in the RandomizedSearchCV.

estimator_args: dict

Parameters for the estimator.

scaler_args: dict

Parameters for the scaler.

randomized_search_args: dict

Parameters for the RandomizedSearchCV. See https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RandomizedSearchCV.html

pipeline_args: dict

Parameters for the sklearn Pipeline. See https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html
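The parameters above can be put together roughly as in the following sketch. It is an illustration under stated assumptions, not a reference configuration: the column names are hypothetical, and because eliminate_variance appears in the signature but is not described on this page, passing it a list of covariate names is a guess. The keys of param_distributions use scikit-learn's "<step>__<parameter>" naming for pipeline steps, so the sketch also lists the candidate combinations a randomized search could draw from.

```python
from itertools import product

# Hypothetical column names; replace them with the columns of your own dataframe.
features = ["roi_hippocampus", "roi_amygdala"]
regression_features = ["iqm_cnr", "iqm_snr"]
covariates = ["age", "sex", "scanner"]
# Assumption: eliminate_variance is not described on this page;
# a list of covariate names is a guess.
eliminate_variance = ["scanner"]

# Keys follow scikit-learn's "<step>__<parameter>" convention for pipeline steps.
param_distributions = {
    "RandomForestRegressor__n_estimators": [100, 200, 500],
    "RandomForestRegressor__warm_start": [False, True],
}

# Every combination the randomized search could sample from.
candidates = [
    dict(zip(param_distributions, values))
    for values in product(*param_distributions.values())
]

# Keyword arguments named exactly as in the class signature.
init_kwargs = dict(
    features=features,
    regression_features=regression_features,
    covariates=covariates,
    eliminate_variance=eliminate_variance,
    model_strategy="single",
    param_distributions=param_distributions,
    estimator_args={"n_jobs": 1, "random_state": 42},
)


def build_model():
    """Create the harmonization model (requires the neuroharmony package)."""
    from neuroharmony import Neuroharmony  # imported lazily on purpose

    return Neuroharmony(**init_kwargs)


print(len(candidates), "candidate parameter combinations")
```

The import sits inside build_model() so that the configuration can be inspected even where neuroharmony is not installed.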

Attributes
X_harmonized_: NDFrame of shape [n_subjects, n_features]

The harmonized input data.

leaveonegroupout_

Leave One Group Out cross-validator.

models_by_feature_

Estimators by features.
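For readers unfamiliar with leave-one-group-out cross-validation, the idea behind leaveonegroupout_ can be sketched in plain Python: each fold holds out every sample belonging to one group and trains on the rest. scikit-learn provides this as sklearn.model_selection.LeaveOneGroupOut. Which column Neuroharmony uses as the group is not stated on this page, so the scanner labels below are an assumption made only for illustration.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for each distinct group."""
    for held_out in sorted(set(groups)):
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical group label per subject (here: the scanner that acquired the image).
scanners = ["A", "A", "B", "C", "C"]
folds = list(leave_one_group_out(scanners))

for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

There is one fold per distinct group, so a group with a single subject still produces a fold of its own.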

fit(df)

Fit the model.

Fit all the transforms one after the other and transform the data, then fit the transformed data using the final estimator.

Parameters
df: NDFrame of shape [n_subjects, n_features]

Training data. Must fulfil input requirements of the first step of the pipeline.

Returns
self: Neuroharmony

This estimator.

refit(df)

Fit a trained model with a new dataset.

Parameters
df: NDFrame of shape [n_samples, n_features]

Pandas dataframe with features, regression_features and covariates.

fit_transform(df)

Fit to data, then transform it.

Fits the transformer to df and returns a transformed version of df.

Parameters
df: NDFrame of shape [n_subjects, n_features]

Training set.

Returns
harmonized_: NDFrame of shape [n_samples, n_features_new]

Data harmonized with ComBat.

transform(df)

Predict regression target for df.

The predicted regression target of an input sample is computed as the mean predicted regression targets of the trees in the forest.

Parameters
df: NDFrame of shape [n_samples, n_features]

Pandas dataframe with features, regression_features and covariates.

Returns
y: NDFrame of shape [n_samples, n_features]

Data harmonized with Neuroharmony.