Using Split Train and Test Data
This tutorial shows how to configure separate train and test sets using the split_strategy option, and how to confirm the results using the predict module.
Reference: How to use train, dev and test splits with Nkululeko
Overview
In machine learning, the typical workflow is:
Train your model on a training set
Tune hyperparameters on a development (validation) set
Evaluate on a held-out test set
Nkululeko can directly handle train/test splits in a single experiment using the split_strategy option.
Using Split Strategy
The simplest way to define train and test sets is using separate databases with split_strategy:
[EXP]
root = ./examples/results/
name = exp_polish_splits
save = True
[DATA]
databases = ['train', 'dev', 'test']
train = ./data/polish/polish_train.csv
train.type = csv
train.absolute_path = False
train.split_strategy = train
dev = ./data/polish/polish_dev.csv
dev.type = csv
dev.absolute_path = False
dev.split_strategy = train
test = ./data/polish/polish_test.csv
test.type = csv
test.absolute_path = False
test.split_strategy = test
target = emotion
[FEATS]
type = ['os']
scale = standard
[MODEL]
type = xgb
save = True
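To check which role each database plays before launching a run, the [DATA] section can be read with Python's standard configparser. The following is a minimal sketch and not part of Nkululeko: the helper name group_by_split and the "unspecified" fallback are made up for illustration, and it assumes that list-valued options such as databases are valid Python literals.

```python
import ast
import configparser

# Trimmed copy of the [DATA] section shown above.
CONFIG_TEXT = """
[DATA]
databases = ['train', 'dev', 'test']
train = ./data/polish/polish_train.csv
train.split_strategy = train
dev = ./data/polish/polish_dev.csv
dev.split_strategy = train
test = ./data/polish/polish_test.csv
test.split_strategy = test
target = emotion
"""


def group_by_split(config_text):
    """Map each split_strategy value to the databases that use it."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    data = parser["DATA"]
    # The databases option is written as a Python-style list literal.
    databases = ast.literal_eval(data["databases"])
    groups = {}
    for name in databases:
        # "unspecified" is a hypothetical placeholder for a missing strategy.
        strategy = data.get(f"{name}.split_strategy", fallback="unspecified")
        groups.setdefault(strategy, []).append(name)
    return groups


if __name__ == "__main__":
    print(group_by_split(CONFIG_TEXT))
```

For the configuration above, both train and dev end up in the training group and only test is held out, which matches the split_strategy values set for each database.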
Key Configuration
| Option | Description |
|---|---|
| split_strategy = train | Use this database for training |
| split_strategy = test | Use this database for testing only |
| split_strategy = dev | Use this database for development/validation |
Running the Experiment
With split strategies defined, run a single command:
python -m nkululeko.nkululeko --config myconf.ini
This trains on the train/dev data and evaluates on the test set in one go.
Confirming Results with the Predict Module
After running your experiment, you can use the unified predict module to re-evaluate your saved model on a labeled test set. This is useful to:
Confirm the results from your experiment
Evaluate the model on additional test sets
Generate detailed test reports
Using the predict module in model mode
First, ensure your model is saved by setting save = True in both [EXP] and [MODEL] sections. Then run:
python -m nkululeko.predict \
--config myconf.ini \
--type model \
--list ./data/polish/polish_test.csv \
--outfile polish_test_predict.csv
The predict module will:
Load the saved best model from the experiment
Run it on every audio file listed in the CSV
Write predictions next to the original columns into --outfile
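Once the output file exists, the predictions can be compared with the true labels to confirm the score reported by the experiment. The sketch below is an illustration under assumptions: the column names "emotion" and "predicted" and the sample rows are invented, so inspect the header of your own output file and pass the actual column names.

```python
import csv
import io


def accuracy(rows, truth_col, pred_col):
    """Fraction of rows where the predicted label equals the true label."""
    rows = list(rows)
    if not rows:
        raise ValueError("no rows to score")
    hits = sum(1 for row in rows if row[truth_col] == row[pred_col])
    return hits / len(rows)


# Made-up sample standing in for polish_test_predict.csv. The column names
# "emotion" and "predicted" are assumptions, not the documented layout.
SAMPLE = """file,emotion,predicted
a.wav,anger,anger
b.wav,neutral,anger
c.wav,neutral,neutral
d.wav,sadness,sadness
"""

if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(SAMPLE))
    print(f"accuracy: {accuracy(reader, 'emotion', 'predicted'):.2f}")
```

To score a real file, replace the io.StringIO(SAMPLE) source with an opened file handle for the path given to --outfile.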
Defining Test Databases
You can also specify test databases using the tests option:
[DATA]
databases = ['emodb']
emodb = ./data/emodb/emodb
emodb.split_strategy = speaker_split
target = emotion
labels = ['anger', 'happiness', 'neutral', 'sadness']
; Define additional test databases
tests = ['crema-d']
crema-d = ./data/crema-d/crema-d
crema-d.split_strategy = test
Automatic fast path: when DATA.tests is set and a saved experiment already exists on disk, nkululeko.nkululeko skips training on subsequent runs and evaluates the stored best model on the test databases directly. The same config file therefore works for both the initial training run and all later test evaluations without any changes. See test_new_database.md for the full workflow.
Cross-Database Evaluation
A common use case is training on one database and testing on another:
[DATA]
databases = ['emodb', 'crema-d']
emodb = ./data/emodb/emodb
emodb.split_strategy = train
target = emotion
labels = ['anger', 'happiness']
; Test on a different database
crema-d = ./data/crema-d/crema-d
crema-d.split_strategy = test
This evaluates how well your model generalizes to unseen data from a different source.
Example Files
exp_emodb_os_xgb_test.ini: Cross-database evaluation example
exp_polish_flags.ini: Flags module for systematic comparison
Tips
Save your model: Set save = True in both the [EXP] and [MODEL] sections so that the predict module can load it
Use split_strategy consistently: Set train, dev, or test for each database
Matching labels: Ensure the test database has the same labels as the training data
Final evaluation: Evaluate on the test set only once, after all hyperparameter tuning is complete
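The label-matching tip can be checked with a few lines of Python before any training starts. This is a minimal sketch, not an Nkululeko function: label_mismatch is a hypothetical helper, and the two label lists are illustrative values that you would replace with the labels found in your own databases.

```python
def label_mismatch(train_labels, test_labels):
    """Return the labels that occur in only one of the two sets."""
    train_set, test_set = set(train_labels), set(test_labels)
    return {
        "only_in_train": sorted(train_set - test_set),
        "only_in_test": sorted(test_set - train_set),
    }


if __name__ == "__main__":
    # Illustrative label lists; substitute the labels of your own data.
    train_labels = ["anger", "happiness", "neutral", "sadness"]
    test_labels = ["anger", "happiness", "fear"]
    print(label_mismatch(train_labels, test_labels))
```

Two empty lists in the result mean the label sets agree. Anything listed under only_in_test is a class the model never saw during training.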
Comparing Multiple Configurations with Flags Module
To systematically compare different models, features, and preprocessing options, use the flags module:
[EXP]
root = ./examples/results/
name = exp_polish_flags
[DATA]
databases = ['train', 'dev', 'test']
train = ./data/polish/polish_train.csv
train.type = csv
train.split_strategy = train
dev = ./data/polish/polish_dev.csv
dev.type = csv
dev.split_strategy = train
test = ./data/polish/polish_test.csv
test.type = csv
test.split_strategy = test
target = emotion
[FEATS]
; Leave empty - will be set by FLAGS
[MODEL]
; Leave empty - will be set by FLAGS
[FLAGS]
models = ['xgb', 'svm']
features = ['praat', 'os']
balancing = ['none', 'ros', 'smote']
scale = ['none', 'standard', 'robust', 'minmax']
Running the Flags Module
python -m nkululeko.flags --config exp_polish_flags.ini
This will automatically run all combinations:
2 models × 2 feature sets × 3 balancing methods × 4 scalers = 48 experiments
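The count follows from taking the Cartesian product of the four lists. The sketch below reproduces that arithmetic with itertools.product; it only illustrates how the combinations multiply and is not a description of how nkululeko.flags is implemented or in which order it runs them. The helper name expand_flags is made up.

```python
from itertools import product

# Values copied from the [FLAGS] section above.
FLAGS = {
    "models": ["xgb", "svm"],
    "features": ["praat", "os"],
    "balancing": ["none", "ros", "smote"],
    "scale": ["none", "standard", "robust", "minmax"],
}


def expand_flags(flags):
    """List every combination of flag values as a dict."""
    keys = list(flags)
    value_lists = [flags[key] for key in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


if __name__ == "__main__":
    combos = expand_flags(FLAGS)
    print(len(combos))  # 2 * 2 * 3 * 4 = 48
    print(combos[0])
```

Adding one more value to any list multiplies the total accordingly, so it is worth estimating the run time of a single experiment before widening the grid.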
Flags Options
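| Flag | Values | Description |
|---|---|---|
| models | e.g. ['xgb', 'svm'] | Model types to compare |
| features | e.g. ['praat', 'os'] | Feature extractors |
| balancing | e.g. ['none', 'ros', 'smote'] | Class balancing methods |
| scale | e.g. ['none', 'standard', 'robust', 'minmax'] | Feature scaling |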