Using Split Train and Test Data

This tutorial shows how to configure separate train and test sets using the split_strategy option, and how to confirm results using the test module.

Reference: How to use train, dev and test splits with Nkululeko

Overview

In machine learning, the typical workflow is:

  1. Train your model on a training set

  2. Tune hyperparameters on a development (validation) set

  3. Evaluate on a held-out test set

Nkululeko can directly handle train/test splits in a single experiment using the split_strategy option.

Using Split Strategy

The simplest way to define train and test sets is using separate databases with split_strategy:

[EXP]
root = ./examples/results/
name = exp_polish_splits
save = True

[DATA]
databases = ['train', 'dev', 'test']
train = ./data/polish/polish_train.csv
train.type = csv
train.absolute_path = False
train.split_strategy = train
dev = ./data/polish/polish_dev.csv
dev.type = csv
dev.absolute_path = False
dev.split_strategy = train
test = ./data/polish/polish_test.csv
test.type = csv
test.absolute_path = False
test.split_strategy = test
target = emotion

[FEATS]
type = ['os']
scale = standard

[MODEL]
type = xgb
save = True

Key Configuration

Option

Description

<db>.split_strategy = train

Use this database for training

<db>.split_strategy = test

Use this database for testing only

<db>.split_strategy = dev

Use this database for development/validation

Running the Experiment

With split strategies defined, run a single command:

python -m nkululeko.nkululeko --config myconf.ini

This trains on the train/dev data and evaluates on the test set in one go.

Confirming Results with the Predict Module

After running your experiment, you can use the unified predict module to re-evaluate your saved model on a labeled test set. This is useful to:

  • Confirm the results from your experiment

  • Evaluate the model on additional test sets

  • Generate detailed test reports

Using the predict module in model mode

First, ensure your model is saved by setting save = True in both [EXP] and [MODEL] sections. Then run:

python -m nkululeko.predict \
    --config myconf.ini \
    --type model \
    --list ./data/polish/polish_test.csv \
    --outfile polish_test_predict.csv

The predict module will:

  1. Load the saved best model from the experiment

  2. Run it on every audio file listed in the CSV

  3. Write predictions next to the original columns into --outfile

Defining Test Databases

You can also specify test databases using the tests option:

[DATA]
databases = ['emodb']
emodb = ./data/emodb/emodb
emodb.split_strategy = speaker_split
target = emotion
labels = ['anger', 'happiness', 'neutral', 'sadness']
; Define additional test databases
tests = ['crema-d']
crema-d = ./data/crema-d/crema-d
crema-d.split_strategy = test

Automatic fast path: when DATA.tests is set and a saved experiment already exists on disk, nkululeko.nkululeko skips training on subsequent runs and evaluates the stored best model on the test databases directly. The same config file therefore works for both the initial training run and all later test evaluations without any changes. See test_new_database.md for the full workflow.

Cross-Database Evaluation

A common use case is training on one database and testing on another:

[DATA]
databases = ['emodb', 'crema-d']
emodb = ./data/emodb/emodb
emodb.split_strategy = train
target = emotion
labels = ['anger', 'happiness']
; Test on a different database
crema-d = ./data/crema-d/crema-d
crema-d.split_strategy = test

This evaluates how well your model generalizes to unseen data from a different source.

Example Files

Tips

  1. Save your model: Set save = True in both [EXP] and [MODEL] sections to use the test module

  2. Use split_strategy consistently: Set train, dev, or test for each database

  3. Matching labels: Ensure test database has the same labels as training data

  4. Final evaluation: Only evaluate on test set once, after all hyperparameter tuning is complete

Comparing Multiple Configurations with Flags Module

To systematically compare different models, features, and preprocessing options, use the flags module:

[EXP]
root = ./examples/results/
name = exp_polish_flags

[DATA]
databases = ['train', 'dev', 'test']
train = ./data/polish/polish_train.csv
train.type = csv
train.split_strategy = train
dev = ./data/polish/polish_dev.csv
dev.type = csv
dev.split_strategy = train
test = ./data/polish/polish_test.csv
test.type = csv
test.split_strategy = test
target = emotion

[FEATS]
; Leave empty - will be set by FLAGS

[MODEL]
; Leave empty - will be set by FLAGS

[FLAGS]
models = ['xgb', 'svm']  
features = ['praat', 'os']   
balancing = ['none', 'ros', 'smote']  
scale = ['none', 'standard', 'robust', 'minmax']  

Running the Flags Module

python -m nkululeko.flags --config exp_polish_flags.ini

This will automatically run all combinations:

  • 2 models × 2 feature sets × 3 balancing methods × 4 scalers = 48 experiments

Flags Options

Flag

Values

Description

models

['svm', 'xgb', 'mlp', 'knn', ...]

Model types to compare

features

['os', 'praat', 'wav2vec2', ...]

Feature extractors

balancing

['none', 'ros', 'smote', ...]

Class balancing methods

scale

['none', 'standard', 'robust', 'minmax']

Feature scaling