`neuroharmony`.Neuroharmony

class neuroharmony.Neuroharmony(features, regression_features, covariates, eliminate_variance, estimator=RandomForestRegressor(), scaler=StandardScaler(), decomposition=PCA(), model_strategy='single', param_distributions={'RandomForestRegressor__criterion': ['mse', 'mae'], 'RandomForestRegressor__n_estimators': [100, 200, 500], 'RandomForestRegressor__warm_start': [False, True]}, estimator_args={'criterion': 'mae', 'n_jobs': 1, 'random_state': 42, 'verbose': False}, scaler_args={}, randomized_search_args={}, pipeline_args={})

Harmonization tool to mitigate scanner bias.

Parameters

features (list): Target features to be harmonized, for example, ROIs.
regression_features (list): Features used to derive harmonization rules, for example, IQMs.
covariates (list): Variables for which we want to eliminate the bias, for example, age, sex, and scanner.
estimator (sklearn estimator, default=RandomForestRegressor()): Model used for the harmonization regression.
scaler (sklearn scaler, default=StandardScaler()): Scaler used as the first step of the harmonization regression.
model_strategy ({"single", "full"}, default="single"): If "single", one model is trained for each feature in features. If "full", a single model regresses all the features at once.
param_distributions (dict, default=dict(RandomForestRegressor__n_estimators=[100, 200, 500], RandomForestRegressor__warm_start=[False, True])): Distribution of parameters to be tested in the RandomizedSearchCV.
estimator_args (dict): Parameters for the estimator.
scaler_args (dict): Parameters for the scaler.
randomized_search_args (dict): Parameters for the RandomizedSearchCV. See https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RandomizedSearchCV.html
pipeline_args (dict): Parameters for the sklearn Pipeline. See https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html
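
The keys of param_distributions follow scikit-learn's `<step name>__<parameter>` convention for pipelines. The sketch below is an illustration with plain scikit-learn objects, not Neuroharmony's internal code: it assumes the pipeline steps are named after their classes (which is what the `RandomForestRegressor__...` keys suggest), uses synthetic data, and uses smaller n_estimators values than the defaults so that it runs quickly.

```python
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, for illustration only.
X, y = make_regression(n_samples=60, n_features=5, noise=0.1, random_state=42)

# Assumption: step names match the class names, as the
# "RandomForestRegressor__..." keys in param_distributions suggest.
pipeline = Pipeline(
    steps=[
        ("StandardScaler", StandardScaler()),
        ("PCA", PCA()),
        ("RandomForestRegressor", RandomForestRegressor(n_jobs=1, random_state=42)),
    ]
)

# Each key is "<step name>__<parameter of that step>".
param_distributions = {
    "RandomForestRegressor__n_estimators": [10, 20, 50],
    "RandomForestRegressor__warm_start": [False, True],
}

# Sample 3 of the 6 possible combinations and score each with 3-fold CV.
search = RandomizedSearchCV(
    pipeline,
    param_distributions=param_distributions,
    n_iter=3,
    cv=3,
    random_state=42,
)
search.fit(X, y)
best_params = search.best_params_
print(best_params)
```

The criterion values 'mse' and 'mae' shown in the class signature were renamed to 'squared_error' and 'absolute_error' in recent scikit-learn releases, so the sketch leaves criterion at its default.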

Attributes

X_harmonized_ (NDFrame of shape [n_subjects, n_features]): Input data harmonized.
leaveonegroupout_: Leave One Group Out cross-validator.
models_by_feature_: Estimators by features.
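
As a rough illustration of the leave-one-group-out scheme named by leaveonegroupout_, the pure-Python sketch below holds out one group at a time and trains on the rest. The helper `leave_one_group_out` and the scanner labels are hypothetical, and grouping by scanner is an assumption made for the example; scikit-learn ships the same idea as `sklearn.model_selection.LeaveOneGroupOut`.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices), one split per group."""
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test


# Hypothetical scanner label for each of six subjects.
scanners = ["A", "A", "B", "B", "B", "C"]

splits = list(leave_one_group_out(scanners))
for held_out, train, test in splits:
    print(held_out, train, test)
# A [2, 3, 4, 5] [0, 1]
# B [0, 1, 5] [2, 3, 4]
# C [0, 1, 2, 3, 4] [5]
```

Every subject is used for testing exactly once, and a model is never evaluated on a group it saw during training.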

fit(df)

Fit the model.

Fit all the transforms one after the other and transform the data, then fit the transformed data using the final estimator.

Parameters

df (NDFrame of shape [n_subjects, n_features]): Training data. Must fulfil input requirements of the first step of the pipeline.

Returns

self (Neuroharmony): This estimator.
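
The fitting order described for fit (fit and apply each transform in turn, then fit the final estimator on the transformed data) is how a scikit-learn Pipeline behaves. The sketch below shows that behaviour with plain scikit-learn objects and synthetic data; the step names and settings are illustrative assumptions, and it does not reproduce Neuroharmony's own fit.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, for illustration only.
X, y = make_regression(n_samples=50, n_features=4, noise=0.1, random_state=0)

# Fitting through a Pipeline...
pipeline = Pipeline(
    [
        ("scaler", StandardScaler()),
        ("pca", PCA(n_components=2)),
        ("forest", RandomForestRegressor(n_estimators=20, random_state=42)),
    ]
)
pipeline.fit(X, y)

# ...is equivalent to chaining the steps by hand.
scaler = StandardScaler()
pca = PCA(n_components=2)
forest = RandomForestRegressor(n_estimators=20, random_state=42)

X_scaled = scaler.fit_transform(X)         # fit the first transform, then apply it
X_reduced = pca.fit_transform(X_scaled)    # fit the second transform on its output
forest.fit(X_reduced, y)                   # fit the final estimator last

manual_pred = forest.predict(pca.transform(scaler.transform(X)))
same = np.allclose(pipeline.predict(X), manual_pred)
print(same)
```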

refit(df)

Refit an already trained model on a new dataset.

Parameters

df (NDFrame of shape [n_samples, n_features]): Pandas dataframe with features, regression_features and covariates.

fit_transform(df)

Fit to data, then transform it.

Fits the transformer to df and returns a transformed version of df.

Parameters

df (NDFrame of shape [n_subjects, n_features]): Training set.

Returns

harmonized_ (NDFrame of shape [n_samples, n_features_new]): Data harmonized with ComBat.

transform(df)

Predict regression target for df.

With the default RandomForestRegressor, the predicted regression target of an input sample is computed as the mean of the regression targets predicted by the trees in the forest.

Parameters

df (NDFrame of shape [n_samples, n_features]): Pandas dataframe with features, regression_features and covariates.

Returns

y (NDFrame of shape [n_samples, n_features]): Data harmonized with Neuroharmony.
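
The statement in the transform description that a forest's prediction is the mean of its trees' predictions can be checked directly on a scikit-learn RandomForestRegressor. This is a minimal sketch on synthetic data; it illustrates the estimator's behaviour only and says nothing about how Neuroharmony combines the predictions with the input features.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data, for illustration only.
X, y = make_regression(n_samples=40, n_features=3, noise=0.1, random_state=0)

forest = RandomForestRegressor(n_estimators=10, random_state=42).fit(X, y)

# One row of predictions per fitted tree, then average over the trees.
per_tree = np.stack([tree.predict(X) for tree in forest.estimators_])
mean_of_trees = per_tree.mean(axis=0)

matches = np.allclose(forest.predict(X), mean_of_trees)
print(matches)
```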
