Table 1 Common feature reduction approaches for supervised machine learning

From: Creating sparser prediction models of treatment outcome in depression: a proof-of-concept study using simultaneous feature selection and hyperparameter tuning

| Method | Description | Examples | Evaluation |
| --- | --- | --- | --- |
| **Feature selection** | | | |
| Intrinsic/embedded methods | Feature selection is built into the learning algorithm and performed during training | Regularized regression models; decision trees | Computationally efficient; interconnected with the learning algorithm; no guarantee of optimal sparsity |
| Filter methods | Feature selection based on associations with the target variable | Associations are calculated using, e.g., correlations or ANOVA; the top N features (or N%) are retained for training | Computationally efficient; relations between features are ignored; independent of the learning algorithm |
| Wrapper methods | Selection of the best-performing subset of features | Recursive feature elimination; sequential forward selection | Extensive search over the input feature space; interconnected with the learning algorithm; considers relations between features; computationally expensive |
| **Feature transformation** | | | |
| Projection into a lower-dimensional feature space | Data are transformed and new features are created | Principal component analysis; multidimensional scaling; matrix factorization | |
| **Further methods of dimensionality reduction** | Alternative approaches to feature selection | | |

ANOVA, analysis of variance
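The sketches below illustrate each method family from the table. All datasets, parameter values, and variable names are illustrative assumptions, not details from the paper. First, embedded selection: an L1-regularized (lasso) regression performs selection during training by driving uninformative coefficients to exactly zero.

```python
# Minimal sketch of embedded feature selection via lasso regression.
# The synthetic dataset and alpha=1.0 are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=200, n_features=50, n_informative=5, random_state=0)

# The L1 penalty zeroes out weak coefficients during fitting, so feature
# selection happens inside the learning algorithm itself.
model = Lasso(alpha=1.0).fit(X, y)
selected = np.flatnonzero(model.coef_)
print(f"{selected.size} of {X.shape[1]} features retained:", selected)
```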
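A filter method, by contrast, scores each feature against the target independently of any learning algorithm. A minimal sketch, assuming an ANOVA F-test score and an arbitrary choice of N = 10:

```python
# Minimal sketch of filter-based selection: keep the top N features by ANOVA F-score.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

X, y = make_classification(n_samples=200, n_features=50, n_informative=5, random_state=0)

# Each feature is scored against the target on its own, so relations between
# features are ignored and no learning algorithm is involved in the selection.
selector = SelectKBest(score_func=f_classif, k=10).fit(X, y)
X_reduced = selector.transform(X)
print("Reduced shape:", X_reduced.shape)  # (200, 10)
print("Kept feature indices:", selector.get_support(indices=True))
```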
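Wrapper methods search over feature subsets by repeatedly refitting a learning algorithm. A sketch using recursive feature elimination; the logistic regression estimator and the target of 10 features are illustrative assumptions:

```python
# Minimal sketch of wrapper-based selection via recursive feature elimination (RFE).
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=50, n_informative=5, random_state=0)

# RFE refits the estimator and drops the weakest feature(s) at each step, so the
# search is tied to the learning algorithm and accounts for feature interplay,
# at a correspondingly higher computational cost.
rfe = RFE(estimator=LogisticRegression(max_iter=1000), n_features_to_select=10)
rfe.fit(X, y)
print("Kept feature indices:", rfe.get_support(indices=True))
```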
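Finally, feature transformation creates new features rather than selecting existing ones. A sketch using principal component analysis, with an assumed choice of 10 components:

```python
# Minimal sketch of feature transformation via principal component analysis (PCA).
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA

X, _ = make_classification(n_samples=200, n_features=50, random_state=0)

# Unlike feature selection, PCA constructs new features: each principal component
# is a linear combination of all original inputs, projecting the data into a
# lower-dimensional space that preserves maximal variance.
pca = PCA(n_components=10).fit(X)
X_projected = pca.transform(X)
print("Projected shape:", X_projected.shape)  # (200, 10)
print("Variance explained:", pca.explained_variance_ratio_.sum().round(3))
```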