nkululeko.optim

Hyperparameter Optimization Module

The Nkululeko optimization module provides automated hyperparameter tuning for machine learning models. It helps you find the best parameter combinations to improve your model’s performance without manual trial and error.

Quick Start

Basic Usage

# Set PYTHONPATH if running from source
export PYTHONPATH=/path/to/nkululeko

# Run optimization
python3 -m nkululeko.optim --config your_optimization_config.ini

Minimal Example

Create a configuration file simple_optim.ini:

[EXP]
root = ./results/
name = simple_optimization
runs = 1
epochs = 1

[DATA]
databases = ['train', 'test']
train = ./data/your_data_train.csv
train.type = csv
train.split_strategy = train
test = ./data/your_data_test.csv
test.type = csv
test.split_strategy = test
target = emotion
labels = ['happy', 'sad', 'neutral']

[FEATS]
type = ['os']

[MODEL]
type = xgb

[OPTIM]
model = xgb
metric = uar
n_estimators = [50, 100]
max_depth = [3, 6]
learning_rate = [0.1, 0.2]

Run with:

python3 -m nkululeko.optim --config simple_optim.ini

Optimization Approaches

The Nkululeko optimization module supports two main approaches to hyperparameter optimization:

1. Conventional Optimization

Grid Search Exhaustive Exploration

Set search_strategy = grid, or omit the parameter (grid is the default)
Tests every combination of the listed parameter values
Best for: Small parameter spaces (< 100 combinations)
Guarantees: Finds the best-scoring combination among the values you list
Trade-off: Cost grows multiplicatively with each added parameter or value

When to Use Conventional:

You have specific parameter values you want to test
Parameter space is manageable (≤ 50-100 combinations)
You want guaranteed comprehensive coverage
Computational resources are not a limiting factor
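
The size of a grid is the product of the number of values per parameter, so it is worth counting before committing to an exhaustive search. The sketch below is not part of Nkululeko: count_combinations and iter_combinations are hypothetical helper names, and the grid is copied from the Minimal Example.

```python
import itertools
import math

# Grid from the Minimal Example's [OPTIM] section.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, 6],
    "learning_rate": [0.1, 0.2],
}


def count_combinations(grid):
    """Number of candidates an exhaustive grid search has to evaluate."""
    return math.prod(len(values) for values in grid.values())


def iter_combinations(grid):
    """Yield every parameter combination as a dict."""
    names = list(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


print(count_combinations(param_grid))  # 2 * 2 * 2 = 8
for combo in iter_combinations(param_grid):
    print(combo)
```

Adding one more parameter with three values would triple the count to 24, which is why grids become expensive quickly.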

2. Intelligent Optimization

Smart Search Strategies

Uses the random, halving_grid, or halving_random strategy
More efficient for large parameter spaces
Samples the space or drops weak candidates early instead of evaluating every combination in full
Best for: Large parameter spaces (> 100 combinations)
Trade-off: May not evaluate every combination fully, but usually finds good settings much sooner

Available Intelligent Strategies:

random: Random sampling from parameter distributions
halving_grid: Successive halving with grid search (recommended)
halving_random: Successive halving with random search

When to Use Intelligent:

Large parameter spaces (> 100 combinations)
Limited computational time or resources
High-dimensional optimization problems
You want to balance efficiency with effectiveness
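
Successive halving evaluates all candidates on a small budget, keeps the best fraction, and repeats with a larger budget. The following is a simplified sketch of the candidate schedule only. It assumes a reduction factor of 3, which is scikit-learn's default for its halving searches; whether Nkululeko uses the same value is an assumption, and halving_schedule is a hypothetical helper.

```python
import math


def halving_schedule(n_candidates, factor=3):
    """Return the number of candidates kept in each round.

    Each round keeps roughly the best 1/factor of the previous round
    until a single candidate remains.
    """
    if n_candidates < 1 or factor <= 1:
        raise ValueError("need n_candidates >= 1 and factor > 1")
    schedule = [n_candidates]
    while schedule[-1] > 1:
        schedule.append(math.ceil(schedule[-1] / factor))
    return schedule


print(halving_schedule(81))   # [81, 27, 9, 3, 1]
print(halving_schedule(192))  # [192, 64, 22, 8, 3, 1]
```

Real implementations also cap the number of rounds by the available resources (for example, the number of training samples), so the actual schedule can be shorter.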

Real-World Examples with Polish Emotional Speech Dataset

The following examples demonstrate both conventional and intelligent optimization approaches using a real Polish emotional speech recognition dataset:

Example 1: Conventional XGBoost Optimization

Grid Search with Intelligent Enhancement

This example starts from a conventional grid of values and searches it with the halving_grid strategy for better efficiency:

[EXP]
root = ./examples/results/
name = exp_polish_optim_xgb
runs = 1
epochs = 1
random_seed = 42

[DATA]
databases = ['train', 'dev', 'test']
train = ./data/polish/polish_train.csv
train.type = csv
train.absolute_path = False
train.split_strategy = train
dev = ./data/polish/polish_dev.csv
dev.type = csv
dev.absolute_path = False
dev.split_strategy = train
test = ./data/polish/polish_test.csv
test.type = csv
test.absolute_path = False
test.split_strategy = test
target = emotion
labels = ['anger', 'neutral', 'fear']

[FEATS]
type = ['os']
scale = standard
balancing = smoteenn

[MODEL]
type = xgb
n_estimators = 50
max_depth = 9
learning_rate = 0.1
subsample = 0.9

[OPTIM]
model = xgb
search_strategy = halving_grid
n_iter = 15
cv_folds = 3
random_state = 42
n_estimators = [50, 100, 200]
max_depth = [3, 6, 9]
learning_rate = [0.01, 0.1, 0.2]
subsample = [0.8, 0.9, 1.0]
metric = uar

Key features:

Strategy: halving_grid for intelligent search with grid coverage
Parameter space: 3×3×3×3 = 81 combinations, narrowed by successive halving instead of being evaluated in full
Features: Standard scaling + SMOTEENN balancing for class imbalance
Metric: UAR (Unweighted Average Recall) for imbalanced emotion data
Reproducibility: Fixed random seed for consistent results
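
To see why UAR suits imbalanced data, the sketch below computes it by hand as the mean of the per-class recalls and compares it with accuracy. The labels are made up for illustration and unweighted_average_recall is a hypothetical helper, not Nkululeko's implementation.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of the per-class recalls; every class counts equally."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    recalls = [hits[label] / totals[label] for label in totals]
    return sum(recalls) / len(recalls)


# Toy, made-up labels: a classifier that always predicts "anger".
y_true = ["anger"] * 4 + ["neutral", "fear"]
y_pred = ["anger"] * 6

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(round(accuracy, 3))                                   # 0.667
print(round(unweighted_average_recall(y_true, y_pred), 3))  # 0.333
```

Accuracy rewards the majority-class guess, while UAR drops to chance level for three classes.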

Example 2: Conventional SVM Grid Search

Pure Grid Search Approach

This example shows pure conventional grid search without intelligent enhancements:

[EXP]
root = ./examples/results/
name = exp_polish_optim_svm
runs = 1
epochs = 1

[DATA]
databases = ['train', 'dev', 'test']
train = ./data/polish/polish_train.csv
train.type = csv
train.absolute_path = False
train.split_strategy = train
dev = ./data/polish/polish_dev.csv
dev.type = csv
dev.absolute_path = False
dev.split_strategy = train
test = ./data/polish/polish_test.csv
test.type = csv
test.absolute_path = False
test.split_strategy = test
target = emotion
labels = ['anger', 'neutral', 'fear']

[FEATS]
type = ['os']
scale = robust
balancing = smoteenn

[MODEL]
type = svm
C_val = 0.1
kernel = linear

[OPTIM]
model = svm
C_val = [0.1, 1.0, 10.0, 100.0] 
kernel = ["linear", "rbf", "poly"]
gamma = ["scale", "auto", 0.001, 0.01, 0.1, 1.0]
metric = uar

Key features:

Strategy: Default grid search (no search_strategy specified)
Parameter space: 4×3×6 = 72 combinations tested exhaustively
Features: Robust scaling + SMOTEENN balancing
Coverage: Tests all parameter combinations systematically
Best for: Medium-sized parameter spaces with guaranteed coverage

Example 3: Intelligent SVM Optimization

Advanced Halving Random Search

This example demonstrates intelligent optimization for larger parameter spaces:

[EXP]
root = ./examples/results/
name = exp_polish_optim_svm_intelligent
runs = 1
epochs = 1

[DATA]
databases = ['train', 'dev', 'test']
train = ./data/polish/polish_train.csv
train.type = csv
train.absolute_path = False
train.split_strategy = train
dev = ./data/polish/polish_dev.csv
dev.type = csv
dev.absolute_path = False
dev.split_strategy = train
test = ./data/polish/polish_test.csv
test.type = csv
test.absolute_path = False
test.split_strategy = test
target = emotion
labels = ['anger', 'neutral', 'fear']

[FEATS]
type = ['os']

[MODEL]
type = svm

[OPTIM]
model = svm
search_strategy = halving_random
n_iter = 15
cv_folds = 3
C_val = [0.1, 1.0, 10.0, 100.0, 1000.0]
kernel = ["linear", "rbf", "poly"]
gamma = ["scale", "auto", 0.001, 0.01, 0.1, 1.0]
metric = uar

Key features:

Strategy: halving_random for maximum efficiency
Parameter space: 5×3×6 = 90 combinations, intelligently sampled
Efficiency: Only tests most promising combinations using successive halving
Iterations: Limited to 15 iterations for time efficiency
Best for: Large parameter spaces with time constraints

Example 4: Neural Network (MLP) Conventional Optimization

Grid Search for Deep Learning

This example shows conventional grid search for neural network hyperparameters:

[EXP]
root = ./examples/results/
name = exp_polish_optim_mlp
runs = 1
epochs = 5

[DATA]
databases = ['train', 'dev', 'test']
train = ./data/polish/polish_train.csv
train.type = csv
train.absolute_path = False
train.split_strategy = train
dev = ./data/polish/polish_dev.csv
dev.type = csv
dev.absolute_path = False
dev.split_strategy = train
test = ./data/polish/polish_test.csv
test.type = csv
test.absolute_path = False
test.split_strategy = test
target = emotion
labels = ['anger', 'neutral', 'fear']

[FEATS]
type = ['os']

[OPTIM]
model = mlp
nlayers = [1, 2]
nnodes = [16, 32]
lr = [0.0001, 0.001]
bs = [4, 8, 16, 32]
loss = ["cross", "f1"]
do = [0.1, 0.3, 0.5]
metric = uar

Key features:

Strategy: Default grid search for systematic exploration
Parameter space: 2×2×2×4×2×3 = 192 combinations
Architecture: Tests both network structure (layers, nodes) and training parameters
Training: Includes learning rate, batch size, dropout, and loss function
Comprehensive: Covers all major neural network hyperparameters

Choosing the Right Approach

When to Use Conventional Optimization

Ideal scenarios:

Parameter space ≤ 100 combinations
You have specific parameter values to test
Computational resources are abundant
You need guaranteed coverage of all combinations
You’re fine-tuning around known good parameters

Example parameter spaces for conventional:

# Small XGBoost optimization (3×3×2 = 18 combinations)
n_estimators = [50, 100, 200]
max_depth = [3, 6, 9]
learning_rate = [0.1, 0.2]

# Small SVM optimization (3×2×3 = 18 combinations)
C_val = [1.0, 10.0, 100.0]
kernel = ["linear", "rbf"]
gamma = [0.001, 0.01, 0.1]

When to Use Intelligent Optimization

Ideal scenarios:

Parameter space > 100 combinations
Limited computational time or resources
High-dimensional parameter spaces
Exploring wide parameter ranges
Initial parameter exploration

Example parameter spaces for intelligent:

# Large XGBoost optimization (4×4×4×3 = 192 combinations)
n_estimators = [50, 100, 200, 500]
max_depth = [3, 6, 9, 12]
learning_rate = [0.01, 0.05, 0.1, 0.2]
subsample = [0.7, 0.8, 0.9]

# With halving_grid, only tests ~50-60 combinations intelligently

Strategy Comparison

Strategy	Parameter Space Size	Time Efficiency	Coverage Guarantee	Best Use Case
`grid`	< 100 combinations	Low	100%	Small, specific searches
`random`	Any size	High	Statistical	Large spaces, quick exploration
`halving_grid`	> 50 combinations	Medium-High	High	Balanced efficiency and coverage
`halving_random`	> 100 combinations	Highest	Medium	Very large spaces, time-limited

Configuration Parameters

Core Settings

Parameter	Description	Default	Options
`model`	Model type to optimize	`mlp`	`mlp`, `xgb`, `svm`, `knn`, `tree`, etc.
`search_strategy`	Search method	`grid`	`grid`, `random`, `halving_grid`, `halving_random`
`metric`	Optimization metric	`accuracy`	`uar`, `accuracy`, `f1`, `precision`, `recall`
`n_iter`	Number of iterations (for random search)	`50`	Any integer
`cv_folds`	Cross-validation folds	`3`	Any integer ≥ 2
`random_state`	Random seed for reproducibility	`42`	Any integer

Search Strategies

Grid Search (`grid`)

Best for: Small parameter spaces (< 100 combinations)
Pros: Exhaustive search, guaranteed to find the best-scoring combination among the listed values
Cons: Computationally expensive for large parameter spaces

Random Search (`random`)

Best for: Large parameter spaces, limited time budget
Pros: More efficient than grid search for high-dimensional spaces
Cons: May miss optimal combinations
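
The idea behind random search can be sketched as drawing n_iter distinct combinations from the full grid. This is an illustration under stated assumptions, not Nkululeko's code: sample_combinations is a hypothetical helper, and the grid is taken from the intelligent SVM example.

```python
import itertools
import random


def sample_combinations(grid, n_iter, seed=42):
    """Draw n_iter distinct combinations uniformly from the full grid."""
    names = list(grid)
    all_combos = [
        dict(zip(names, values))
        for values in itertools.product(*(grid[name] for name in names))
    ]
    rng = random.Random(seed)  # fixed seed makes the draw repeatable
    return rng.sample(all_combos, min(n_iter, len(all_combos)))


svm_grid = {
    "C_val": [0.1, 1.0, 10.0, 100.0, 1000.0],
    "kernel": ["linear", "rbf", "poly"],
    "gamma": ["scale", "auto", 0.001, 0.01, 0.1, 1.0],
}

candidates = sample_combinations(svm_grid, n_iter=15)
print(len(candidates))  # 15 of the 90 possible combinations
```

Materializing the whole grid is fine at this size; for very large spaces a sampler would draw each parameter independently instead.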

Halving Grid Search (`halving_grid`)

Best for: Larger grids that are too slow to search exhaustively
Pros: More efficient than regular grid search
Cons: Requires scikit-learn >= 0.24

Halving Random Search (`halving_random`)

Best for: Very large parameter spaces
Pros: Combines benefits of random search with successive halving
Cons: Most complex, may need tuning

Parameter Specification

List Format

Specify discrete values to test:

C_val = [0.1, 1.0, 10.0, 100.0]
kernel = ["linear", "rbf", "poly"]

Range Format

For continuous parameters (automatically generates reasonable steps):

# Creates range with smart step selection
learning_rate = (0.001, 0.1)  # Will generate log-spaced values
max_depth = (3, 10)           # Will generate integer range
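
How such a range could be expanded is sketched below. The number of generated values and whether the integer range includes its upper bound are assumptions made for illustration; log_spaced and int_range are hypothetical helpers, not the module's own functions.

```python
import math


def log_spaced(low, high, num):
    """Return num values from low to high, evenly spaced on a log10 scale."""
    if num < 2:
        raise ValueError("num must be at least 2")
    log_low, log_high = math.log10(low), math.log10(high)
    step = (log_high - log_low) / (num - 1)
    return [10 ** (log_low + i * step) for i in range(num)]


def int_range(low, high):
    """Return every integer from low to high, inclusive."""
    return list(range(low, high + 1))


print([round(v, 6) for v in log_spaced(0.001, 0.1, 3)])  # [0.001, 0.01, 0.1]
print(int_range(3, 10))  # [3, 4, 5, 6, 7, 8, 9, 10]
```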

Range with Step

Specify exact step size:

# (min, max, step)
dropout = (0.1, 0.5, 0.1)     # [0.1, 0.2, 0.3, 0.4, 0.5]
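
A (min, max, step) triple can be expanded as sketched below. stepped_range is a hypothetical helper that reproduces the list shown in the comment above; the module's own expansion may differ in detail.

```python
import math


def stepped_range(low, high, step):
    """Expand (min, max, step) into a list that never exceeds max."""
    if step <= 0:
        raise ValueError("step must be positive")
    # The small epsilon guards against 0.4 / 0.1 evaluating to 3.9999...
    count = math.floor((high - low) / step + 1e-9) + 1
    # Round to suppress floating-point noise such as 0.30000000000000004.
    return [round(low + i * step, 10) for i in range(count)]


print(stepped_range(0.1, 0.5, 0.1))  # [0.1, 0.2, 0.3, 0.4, 0.5]
```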

Model-Specific Parameters

XGBoost (`xgb`, `xgr`)

[OPTIM]
model = xgb
n_estimators = [50, 100, 200]
max_depth = [3, 6, 9, 12]
learning_rate = [0.01, 0.1, 0.3]
subsample = [0.6, 0.8, 1.0]
colsample_bytree = [0.6, 0.8, 1.0]

Support Vector Machine (`svm`, `svr`)

[OPTIM]
model = svm
C_val = [0.1, 1.0, 10.0, 100.0]
kernel = ["linear", "rbf", "poly"]
gamma = ["scale", "auto", 0.001, 0.01, 0.1, 1.0]

K-Nearest Neighbors (`knn`, `knn_reg`)

[OPTIM]
model = knn
K_val = [3, 5, 7, 9, 11]
weights = ["uniform", "distance"]
algorithm = ["auto", "ball_tree", "kd_tree", "brute"]

Multi-Layer Perceptron (`mlp`)

[OPTIM]
model = mlp
nlayers = [1, 2, 3]
nnodes = [16, 32, 64, 128]
lr = [0.0001, 0.001, 0.01]
bs = [8, 16, 32, 64]
do = [0.1, 0.3, 0.5]
loss = ["cross", "f1", "mse"]

Complete Examples

Example 1: XGBoost Optimization

[EXP]
root = ./results/
name = exp_polish_optim_xgb
runs = 1
epochs = 1

[DATA]
databases = ['train', 'dev', 'test']
train = ./data/polish/polish_train.csv
train.type = csv
train.split_strategy = train
dev = ./data/polish/polish_dev.csv
dev.type = csv
dev.split_strategy = train
test = ./data/polish/polish_test.csv
test.type = csv
test.split_strategy = test
target = emotion
labels = ['anger', 'neutral', 'fear']

[FEATS]
type = ['os']
scale = standard
balancing = smoteenn

[MODEL]
type = xgb
n_estimators = 50
max_depth = 6
learning_rate = 0.1

[OPTIM]
model = xgb
search_strategy = halving_grid
n_iter = 15
cv_folds = 3
random_state = 42
n_estimators = [50, 100, 200]
max_depth = [3, 6, 9]
learning_rate = [0.01, 0.1, 0.2]
subsample = [0.8, 0.9, 1.0]
metric = uar

Example 2: SVM Optimization

[EXP]
root = ./results/
name = exp_polish_optim_svm
runs = 1
epochs = 1

[DATA]
databases = ['train', 'dev', 'test']
train = ./data/polish/polish_train.csv
train.type = csv
train.split_strategy = train
dev = ./data/polish/polish_dev.csv
dev.type = csv
dev.split_strategy = train
test = ./data/polish/polish_test.csv
test.type = csv
test.split_strategy = test
target = emotion
labels = ['anger', 'neutral', 'fear']

[FEATS]
type = ['os']
scale = robust
balancing = smoteenn

[MODEL]
type = svm
C_val = 1.0
kernel = rbf

[OPTIM]
model = svm
search_strategy = grid
C_val = [0.1, 1.0, 10.0, 100.0]
kernel = ["linear", "rbf", "poly"]
gamma = ["scale", "auto", 0.001, 0.01, 0.1, 1.0]
metric = uar

Example 3: Neural Network (MLP) Optimization

[EXP]
root = ./results/
name = exp_polish_optim_mlp
runs = 1
epochs = 5

[DATA]
databases = ['train', 'dev', 'test']
train = ./data/polish/polish_train.csv
train.type = csv
train.split_strategy = train
dev = ./data/polish/polish_dev.csv
dev.type = csv
dev.split_strategy = train
test = ./data/polish/polish_test.csv
test.type = csv
test.split_strategy = test
target = emotion
labels = ['anger', 'neutral', 'fear']

[FEATS]
type = ['os']

[OPTIM]
model = mlp
search_strategy = random
n_iter = 20
cv_folds = 3
metric = uar
nlayers = [1, 2]
nnodes = [16, 32, 64]
lr = [0.0001, 0.001, 0.01]
bs = [8, 16, 32]
loss = ["cross", "f1"]
do = [0.1, 0.3, 0.5]

Understanding Results

Output Files

The optimization process creates several output files in your experiment directory:

optimization_results_{model}.csv: Detailed results for all parameter combinations
images/: Visualization plots (if enabled)
results/: Text-based result summaries

Result Interpretation

The optimization will output:

Best parameters: The parameter combination that achieved the highest score
Best score: The performance metric value for the best parameters
All results: Complete results table with all tested combinations

Cross-Validation vs. Final Evaluation

The module provides warnings when there are large discrepancies between cross-validation scores and final test evaluation. This helps identify potential overfitting issues.

Advanced Features

Consistency Improvements

The optimization module includes several features to ensure consistent results:

Stratified Cross-Validation: Maintains class distribution across CV folds
Consistent Data Balancing: Applies the same balancing strategy used in final evaluation
Reproducible Results: Fixed random seeds for consistent results
Validation Checks: Compares CV results with standard evaluation pipeline
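
Stratification means each fold receives roughly the same share of every class. The sketch below illustrates the idea with a round-robin assignment per class; it omits shuffling, uses made-up labels, and stratified_folds is a hypothetical helper rather than the module's implementation.

```python
from collections import defaultdict


def stratified_folds(labels, n_folds=3):
    """Assign each sample index to a fold, class by class, round-robin."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


# Made-up, imbalanced label list: 6 x anger, 3 x neutral, 3 x fear.
labels = ["anger"] * 6 + ["neutral"] * 3 + ["fear"] * 3
for fold in stratified_folds(labels, n_folds=3):
    print(sorted(labels[i] for i in fold))
    # each fold: ['anger', 'anger', 'fear', 'neutral']
```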

Performance Optimization

Parallel Processing: Uses multiple CPU cores when available
Early Stopping: Halving strategies reduce computation time
Smart Parameter Ranges: Automatic generation of reasonable parameter ranges

Best Practices

1. Start Small

Begin with a small parameter space to understand your model’s behavior:

# Start with 2-3 values per parameter
C_val = [1.0, 10.0]
kernel = ["linear", "rbf"]

2. Use Appropriate Search Strategies

Grid search: For ≤ 50 parameter combinations
Random search: For > 50 combinations
Halving methods: For > 200 combinations
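
These rules of thumb can be written as a small lookup. suggest_strategy is a hypothetical helper; the thresholds are the ones listed above, and returning halving_grid rather than halving_random for the largest spaces is an arbitrary choice made for illustration.

```python
def suggest_strategy(n_combinations):
    """Map a grid size to a search_strategy value using the thresholds above."""
    if n_combinations <= 50:
        return "grid"
    if n_combinations <= 200:
        return "random"
    # Above 200 either halving variant applies; halving_grid is picked here
    # only for illustration.
    return "halving_grid"


print(suggest_strategy(18))   # grid
print(suggest_strategy(192))  # random
print(suggest_strategy(324))  # halving_grid
```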

3. Choose the Right Metric

Classification: uar (Unweighted Average Recall) for imbalanced datasets
Balanced datasets: accuracy or f1
Regression: mse, mae, or r2

4. Set Reasonable CV Folds

Small datasets (< 1000 samples): 3-5 folds
Large datasets: 5-10 folds
Very small datasets: Consider leave-one-out CV

5. Monitor for Overfitting

Watch for large discrepancies between CV and test scores:

Cross-validation score: 0.8500
Standard evaluation score: 0.7200
Score difference: 0.1300
WARNING: Large discrepancy detected!
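
The same check is easy to run on your own scores. In the sketch below, check_discrepancy is a hypothetical helper and the 0.1 threshold is an assumption; the module's actual threshold is not stated here.

```python
def check_discrepancy(cv_score, eval_score, threshold=0.1):
    """Return (difference, is_large) for a CV score and an evaluation score."""
    difference = abs(cv_score - eval_score)
    return difference, difference > threshold


difference, is_large = check_discrepancy(0.85, 0.72)
print(f"Score difference: {difference:.4f}")  # Score difference: 0.1300
if is_large:
    print("WARNING: Large discrepancy detected!")
```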

Troubleshooting

Common Issues

1. No Parameter Combinations Generated

Problem: Empty parameter space
Solution: Check parameter syntax in the [OPTIM] section

2. Memory Issues

Problem: Too many parameter combinations
Solution: Reduce parameter space or use random/halving search

3. Slow Optimization

Problem: Long execution time
Solutions:

Use fewer CV folds
Reduce parameter space
Use halving strategies

4. Poor Results

Problem: Optimized parameters perform worse than defaults
Solutions:

Check for data leakage
Ensure consistent preprocessing
Verify parameter ranges are reasonable

Error Messages

“No [OPTIM] section found”

Add an [OPTIM] section to your configuration file.

“Large discrepancy between CV and standard evaluation”

This indicates potential overfitting or data inconsistency. Consider:

Reducing model complexity
Checking data preprocessing steps
Using simpler search strategies

Performance Tips

1. Efficient Parameter Ranges

Use logarithmic ranges for parameters that span multiple orders of magnitude:

# Good for learning rates
lr = [0.0001, 0.001, 0.01, 0.1]

# Instead of linear spacing
# lr = [0.0001, 0.0002, 0.0003, ..., 0.1]  # Too many values

2. Use Intelligent Defaults

Start with recommended parameter ranges:

# XGBoost recommended ranges
n_estimators = [50, 100, 200]        # Not [10, 20, 30, ..., 1000]
max_depth = [3, 6, 9]                # Not [1, 2, 3, ..., 20]
learning_rate = [0.01, 0.1, 0.3]     # Not [0.001, 0.002, ..., 1.0]

3. Parallel Processing

The module automatically uses multiple cores. For better performance:

Use machines with more CPU cores
Ensure sufficient RAM for parallel processes
Consider using smaller batch sizes for neural networks to reduce memory use per process

Integration with Nkululeko Workflow

The optimization module integrates seamlessly with the standard Nkululeko workflow:

Setup: Same data and feature configuration as regular experiments
Optimization: Add [OPTIM] section and run optimization
Final Training: Use best parameters in a standard experiment
Evaluation: Compare optimized vs. default performance

After Optimization

Once you find the best parameters, update your model configuration:

[MODEL]
type = xgb
n_estimators = 200    # From optimization results
max_depth = 6         # From optimization results
learning_rate = 0.1   # From optimization results
subsample = 0.9       # From optimization results

Then run a standard experiment to get final results:

python3 -m nkululeko.nkululeko --config your_final_config.ini
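
Copying the values by hand is fine for a single run. If you repeat the step often, the [MODEL] update could be scripted with the standard library, as in the sketch below. apply_best_params is a hypothetical helper, the parameter values are illustrative, and configparser drops any comments when it rewrites a file.

```python
import configparser
import io

BASE_CONFIG = """
[MODEL]
type = xgb
n_estimators = 50
max_depth = 9
"""

# Values copied by hand from the optimization output (illustrative numbers).
best_params = {"n_estimators": 200, "max_depth": 6, "learning_rate": 0.1}


def apply_best_params(config_text, params):
    """Return config_text with the [MODEL] entries overwritten by params."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # preserve key case such as "C_val"
    parser.read_string(config_text)
    for name, value in params.items():
        parser["MODEL"][name] = str(value)
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


print(apply_best_params(BASE_CONFIG, best_params))
```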

Summary

The Nkululeko optimization module provides a powerful and flexible framework for hyperparameter tuning. Key benefits include:

Automated parameter search with multiple strategies
Consistent evaluation pipeline that helps detect overfitting
Support for all model types in Nkululeko
Detailed result analysis and performance monitoring
Best practice recommendations built-in

Start with simple grid searches and gradually move to more sophisticated methods as your parameter spaces grow. Always validate your results with independent test sets and watch for overfitting indicators.
