nkululeko.flags

Nkululeko Flags Module Tutorial

The flags module in nkululeko allows you to run multiple experiments with different parameter combinations automatically. This is particularly useful for hyperparameter tuning and systematic exploration of different configurations.

Overview

Instead of manually creating multiple configuration files and running experiments one by one, the flags module enables you to:

Define multiple values for different parameters in a single configuration file
Automatically generate all possible combinations of these parameters
Run all experiments sequentially with optimized feature extraction
Get a summary of results and identify the best performing configuration
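
The combination step can be pictured as a Cartesian product over the flag lists. The following is a minimal sketch using only the Python standard library; the function name `expand_flags` is hypothetical and this is not necessarily how nkululeko implements it internally.

```python
from itertools import product


def expand_flags(flags):
    """Yield one dict per combination of the given flag values."""
    keys = list(flags)
    # product() walks through every combination of one value per key.
    for values in product(*(flags[key] for key in keys)):
        yield dict(zip(keys, values))


flags = {"models": ["xgb", "svm"], "features": ["os", "praat"]}
combos = list(expand_flags(flags))
for combo in combos:
    print(combo)
```

Each yielded dict corresponds to one experiment, so two lists of two values each give four experiments.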

Quick Start Demo

To quickly try the flags module, you can use the provided demo example:

# From the directory that contains your clone of the repository
cd nkululeko

# Run the flags demo (this will test 2×2×2×2 = 16 combinations)
python -m nkululeko.flags --config examples/exp_flags_demo.ini

This demo uses the test dataset included with nkululeko and will show you how the flags module works with a small, manageable number of experiments.

Basic Usage

1. Configuration File Structure

To use the flags module, you need to add a [FLAGS] section to your standard nkululeko configuration file. Here’s the basic structure:

[EXP]
root = /tmp/results/
name = my_flags_experiment
runs = 1
epochs = 10

[DATA]
databases = ['mydata']
mydata = ./data/mydata.csv
mydata.type = csv
target = emotion

[FLAGS]
models = ['xgb', 'svm', 'mlp']
features = ['os', 'praat', 'mfcc']   
balancing = ['none', 'ros', 'smote']  
scale = ['none', 'standard', 'robust']
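
To illustrate how a [FLAGS] section like the one above could be read, here is a small sketch built on `configparser` and `ast.literal_eval`. The helper `read_flags` is a hypothetical name, and the parsing strategy is an assumption for illustration, not a description of nkululeko's own loader.

```python
import ast
import configparser

CONFIG_TEXT = """
[EXP]
name = my_flags_experiment

[FLAGS]
models = ['xgb', 'svm', 'mlp']
features = ['os', 'praat', 'mfcc']
"""


def read_flags(text):
    """Return the [FLAGS] entries as a dict of Python lists."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section("FLAGS"):
        return {}  # no flags defined, nothing to expand
    # Each value is written in Python list syntax, so literal_eval can parse it.
    return {key: ast.literal_eval(value) for key, value in parser["FLAGS"].items()}


flags = read_flags(CONFIG_TEXT)
print(flags)
```

The `has_section` check also shows one simple way a tool could detect whether a configuration file asks for a flags run at all.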

2. Running with Flags

There are several ways to run experiments with flags:

Method 1: Command line with --flags argument

python -m nkululeko.flags --config my_config.ini --flags

Method 2: Automatic detection

If your configuration file contains a [FLAGS] section, nkululeko will automatically detect it:

python -m nkululeko.flags --config my_config.ini

Method 3: Command line parameter override

You can also override specific parameters from the command line:

# Override model and feature types
python -m nkululeko.flags --config my_config.ini --model xgb --feat os

# Specify multiple values for testing
python -m nkululeko.flags --config my_config.ini --model xgb svm --feat "['os', 'praat']"

# Combine with other parameters
python -m nkululeko.flags --config my_config.ini --balancing smote --scale standard --epochs 50

3. Module Selection

The flags module supports different nkululeko modules:

# Standard training (default)
python -m nkululeko.flags --config my_config.ini --mod nkulu

# Testing mode
python -m nkululeko.flags --config my_config.ini --mod test

Supported Flag Parameters

The flags module supports the following parameters:

Core Parameters

models: List of model types to test
- Example: ['xgb', 'svm', 'mlp', 'tree']
features: List of feature types to test
- Example: ['os', 'praat', 'mfcc', 'hubert', 'wav2vec2']
balancing: List of balancing methods to test
- Example: ['none', 'ros', 'smote', 'adasyn']
scale: List of scaling methods to test
- Example: ['none', 'standard', 'robust', 'minmax']

Custom Parameters

You can also define custom parameters using prefixes:

model_*: Model-specific parameters (e.g., model_learning_rate)
feats_*: Feature-specific parameters (e.g., feats_set)
exp_*: Experiment-specific parameters (e.g., exp_epochs)

Working Example

Let’s look at a complete working example based on the Polish emotion dataset:

[EXP]
root = /tmp/results/
name = exp_polish_flags1

[DATA]
databases = ['train', 'dev', 'test']
train = ./data/polish/polish_train.csv
train.type = csv
train.absolute_path = False
train.split_strategy = train
dev = ./data/polish/polish_dev.csv
dev.type = csv
dev.absolute_path = False
dev.split_strategy = train
test = ./data/polish/polish_test.csv
test.type = csv
test.absolute_path = False
test.split_strategy = test
target = emotion

[FLAGS]
models = ['xgb', 'svm']
features = ['praat', 'os']   
balancing = ['none', 'ros', 'smote']  
scale = ['none', 'standard', 'robust', 'minmax']

This configuration will run 2 × 2 × 3 × 4 = 48 experiments testing all combinations of:

2 models (XGBoost, SVM)
2 feature types (Praat, OpenSMILE)
3 balancing methods (none, ROS, SMOTE)
4 scaling methods (none, standard, robust, minmax)

Advanced Usage

1. Single Value Parameters

If you want to test a single value, you can specify it as a string without brackets:

[FLAGS]
models = 'xgb'
features = ['os', 'praat']

2. Mixed Data Types

The flags module supports different data types:

[FLAGS]
models = ['xgb', 'mlp']
features = ['os', 'praat']
learning_rate = [0.01, 0.1, 0.5]
epochs = [10, 50, 100]
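
Both the single-value form and the typed lists above can be handled by one normalization step. The sketch below assumes values are parsed with `ast.literal_eval`; the helper name `parse_flag_value` is hypothetical and only meant to show the idea.

```python
import ast


def parse_flag_value(raw):
    """Parse one raw [FLAGS] value and always return a list."""
    value = ast.literal_eval(raw.strip())
    if isinstance(value, (list, tuple)):
        return list(value)
    # A bare scalar such as 'xgb' becomes a one-element list.
    return [value]


print(parse_flag_value("'xgb'"))
print(parse_flag_value("[0.01, 0.1, 0.5]"))
print(parse_flag_value("[10, 50, 100]"))
```

Because `literal_eval` keeps Python types, the learning rates stay floats and the epoch counts stay integers.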

3. Handling ‘none’ Values

When you specify 'none' for balancing or scaling, that parameter is left unset in the generated configuration, so nkululeko falls back to its default behavior:

[FLAGS]
# 'none' means no balancing parameter set
balancing = ['none', 'smote']
# 'none' means no scaling parameter set
scale = ['none', 'standard']
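
One way to picture the 'none' handling is a function that writes a single combination into a configuration and skips 'none' entries. This is a sketch under assumptions: the section and option names follow the "set in your config file" hint in this tutorial's sample output, while `apply_combination`, `TARGETS`, and `SKIP_IF_NONE` are hypothetical names.

```python
import configparser

# Flag name -> (section, option), following the hint in the sample output.
TARGETS = {
    "models": ("MODEL", "type"),
    "features": ("FEATS", "type"),
    "balancing": ("FEATS", "balancing"),
    "scale": ("FEATS", "scale"),
}
SKIP_IF_NONE = {"balancing", "scale"}


def apply_combination(config, combo):
    """Write one parameter combination into config, leaving 'none' unset."""
    for name, value in combo.items():
        if name in SKIP_IF_NONE and value == "none":
            continue  # leave the option out so the default applies
        section, option = TARGETS[name]
        if not config.has_section(section):
            config.add_section(section)
        # The feature type is written as a one-element list, e.g. ['os'].
        text = str([value]) if name == "features" else str(value)
        config.set(section, option, text)
    return config


config = configparser.ConfigParser()
combo = {"models": "svm", "features": "os", "balancing": "none", "scale": "robust"}
apply_combination(config, combo)
print({section: dict(config[section]) for section in config.sections()})
```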

Understanding the Output

When you run a flags experiment, you’ll see output like this:

Flag parameters found: {'models': ['xgb', 'svm'], 'features': ['praat', 'os'], 'balancing': ['none', 'ros', 'smote'], 'scale': ['none', 'standard', 'robust', 'minmax']}
Running 48 experiment combinations...
Setting up experiment and extracting features (once for all experiments)...
Features extracted once: (1200, 384)

=== Experiment 1/48 ===
Parameters: {'models': 'xgb', 'features': 'praat', 'balancing': 'none', 'scale': 'none'}
Result: 0.67

=== Experiment 2/48 ===
Parameters: {'models': 'xgb', 'features': 'praat', 'balancing': 'none', 'scale': 'standard'}
Result: 0.71

...

=== SUMMARY OF 48 EXPERIMENTS ===
Experiment 1: {'models': 'xgb', 'features': 'praat', 'balancing': 'none', 'scale': 'none'}
  Result: 0.67
Experiment 2: {'models': 'xgb', 'features': 'praat', 'balancing': 'none', 'scale': 'standard'}
  Result: 0.71
...

=== BEST CONFIGURATION ===
Best Result: 0.84
Best Parameters:
  models: svm
  features: os
  balancing: smote
  scale: robust

To use these parameters, set in your config file:
[MODEL]
type = svm
[FEATS]
type = ['os']
balancing = smote
scale = robust

Flags experiments time: 245.67 seconds (4.09 minutes)
DONE

Key Output Elements

Parameter Discovery: Shows all flag parameters found and their values
Feature Extraction Info: Displays the shape of extracted features (once for all experiments)
Individual Results: Each experiment shows its parameters and result score
Comprehensive Summary: Lists all experiments with their outcomes
Best Configuration: Identifies the highest-scoring parameter combination
Usage Instructions: Provides exact configuration syntax for the best parameters
Timing Information: Shows total execution time

Result Interpretation

Result scores represent the test performance metric (typically accuracy for classification)
Higher scores indicate better performance
Feature extraction timing is shown separately since it’s done only once
Failed experiments are marked with ERROR and don’t affect other experiments
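
Picking the best configuration from a list of results can be sketched as a maximum over the scores of the runs that succeeded. In this illustration a failed run is represented by a score of `None`, which is an assumption made for the example; the function `best_result` and the result values are hypothetical.

```python
def best_result(results):
    """Return the (params, score) pair with the highest score, ignoring failures."""
    scored = [(params, score) for params, score in results if score is not None]
    if not scored:
        return None  # every run failed
    return max(scored, key=lambda item: item[1])


# Illustrative values only; None stands in for a run marked with ERROR.
results = [
    ({"models": "xgb", "features": "praat"}, 0.67),
    ({"models": "xgb", "features": "os"}, None),
    ({"models": "svm", "features": "os"}, 0.84),
]

best_params, best_score = best_result(results)
print(best_params, best_score)
```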

Performance Optimization

The flags module includes several optimizations:

Single Feature Extraction: Features are extracted only once at the beginning and reused across all experiments
Efficient Experiment Creation: Each experiment reuses the base data and features rather than reloading
Memory Optimization: Configurations are generated on-the-fly rather than stored in memory
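
The reuse and on-the-fly ideas above can be illustrated with a cache and a generator. This is only a sketch: `extract_features` is a stand-in for an expensive extraction step, caching per feature type is an assumption made for the example, and none of these names come from nkululeko.

```python
from functools import lru_cache
from itertools import product

calls = []


@lru_cache(maxsize=None)
def extract_features(feature_type):
    """Stand-in for an expensive extraction step; records each real call."""
    calls.append(feature_type)
    return f"<features:{feature_type}>"


def combinations(models, features):
    """Yield (model, feature_type) pairs one at a time instead of building a list."""
    for model, feature_type in product(models, features):
        yield model, feature_type


for model, feature_type in combinations(["xgb", "svm", "mlp"], ["os", "praat"]):
    feats = extract_features(feature_type)  # served from the cache after the first call

print(calls)
print(extract_features.cache_info())
```

Six combinations are evaluated here, but the stand-in extraction runs only twice, once per feature type.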

Best Practices

1. Start Small

Begin with a smaller set of parameters to test the setup:

[FLAGS]
models = ['xgb']
features = ['os', 'praat']
balancing = ['none', 'smote']

2. Consider Computational Cost

Be mindful of the total number of combinations. The number of experiments is the product of the lengths of all flag lists, so it grows quickly with every parameter you add:

3 models × 4 features × 3 balancing × 4 scaling = 144 experiments
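
Before launching a run, the experiment count can be checked with a few lines of Python. This is a small sketch; `count_experiments` is a hypothetical helper, and the two dictionaries reproduce the 48-experiment working example and the 144-experiment case from this tutorial.

```python
from math import prod


def count_experiments(flags):
    """Number of combinations: the product of all flag list lengths."""
    return prod(len(values) for values in flags.values())


polish_example = {
    "models": ["xgb", "svm"],
    "features": ["praat", "os"],
    "balancing": ["none", "ros", "smote"],
    "scale": ["none", "standard", "robust", "minmax"],
}
larger_example = {
    "models": ["xgb", "svm", "mlp"],
    "features": ["os", "praat", "mfcc", "hubert"],
    "balancing": ["none", "ros", "smote"],
    "scale": ["none", "standard", "robust", "minmax"],
}

small_count = count_experiments(polish_example)
large_count = count_experiments(larger_example)
print(small_count, large_count)
```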

3. Use Meaningful Parameter Combinations

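Not all parameter combinations make sense. For example, some scaling methods might not be beneficial for certain feature types, and tree-based models such as XGBoost are largely insensitive to feature scaling, so varying the scale flag for them mostly adds runs with little expected change.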

4. Monitor Resources

Large flag experiments can be resource-intensive. Monitor CPU, memory, and disk usage, especially when working with large datasets.

Troubleshooting

Common Issues

No FLAGS section error: Ensure your configuration file has a [FLAGS] section
Invalid parameter format: Use proper Python list syntax with quotes: ['item1', 'item2']
Missing required sections: Ensure your config file has all required sections (EXP, DATA, etc.)

Error Handling

The flags module includes error handling for individual experiments. If one experiment fails, others will continue, and you’ll see error information in the summary.
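
The continue-on-failure behavior described above follows a common pattern: wrap each run in a try/except and record the error instead of stopping. The sketch below is illustrative only; `run_all` and `fake_run` are hypothetical names, and the failure is simulated.

```python
def run_all(combos, run_one):
    """Run every combination; record errors without aborting the loop."""
    summary = []
    for combo in combos:
        try:
            summary.append((combo, run_one(combo), None))
        except Exception as err:
            # Keep going and note what went wrong for the final summary.
            summary.append((combo, None, f"ERROR: {err}"))
    return summary


def fake_run(combo):
    """Stand-in for a real experiment; fails for one model on purpose."""
    if combo["models"] == "mlp":
        raise ValueError("simulated failure")
    return 0.7


summary = run_all([{"models": "xgb"}, {"models": "mlp"}, {"models": "svm"}], fake_run)
for combo, score, error in summary:
    print(combo, score if error is None else error)
```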

Integration with Other Modules

The flags module can be combined with other nkululeko features:

With Testing Module

python -m nkululeko.flags --config my_config.ini --mod test --flags

With Custom Modules

The flags module supports different nkululeko modules through the --mod parameter:

nkulu: Standard training (default)
test: Testing mode

Example Workflows

1. Feature Type Comparison

[FLAGS]
models = ['xgb']
features = ['os', 'praat', 'mfcc', 'hubert']
balancing = ['none']
scale = ['standard']

2. Model Selection

[FLAGS]
models = ['xgb', 'svm', 'mlp', 'tree']
features = ['os']
balancing = ['smote']
scale = ['standard']

3. Preprocessing Optimization

[FLAGS]
models = ['xgb']
features = ['os']
balancing = ['none', 'ros', 'smote', 'adasyn']
scale = ['none', 'standard', 'robust', 'minmax']

The flags module is a powerful tool for systematic experimentation in nkululeko, helping you find optimal configurations efficiently while maintaining reproducibility and comprehensive result tracking.

Migration from nkuluflag

If you’ve used the legacy nkuluflag module, here are the key differences and migration steps:

Key Improvements in the flags module:

Configuration-based: Define parameters in INI files instead of command line only
Optimized performance: Features extracted once and reused across experiments
Better error handling: Failed experiments don’t stop the entire process
Comprehensive output: Clear summary with best configuration identification
Flexible parameters: Support for custom parameter types and prefixes

Migration Steps:

Old nkuluflag approach:

python -m nkululeko.nkuluflag --config base.ini --model xgb svm --feat os praat --balancing none smote

New flags approach:

Add a [FLAGS] section to your INI file:

[FLAGS]
models = ['xgb', 'svm']
features = ['os', 'praat']
balancing = ['none', 'smote']

Run with the flags module:

python -m nkululeko.flags --config base.ini
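
If you have many old command lines to migrate, the [FLAGS] text can be generated rather than typed by hand. The following is a sketch using `configparser`; the helper `flags_section` is hypothetical and simply renders lists in the syntax shown above.

```python
import configparser
import io


def flags_section(**params):
    """Render keyword arguments as the text of a [FLAGS] section."""
    parser = configparser.ConfigParser()
    # str(list) produces the Python list syntax used in the INI examples.
    parser["FLAGS"] = {name: str(list(values)) for name, values in params.items()}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


text = flags_section(
    models=["xgb", "svm"],
    features=["os", "praat"],
    balancing=["none", "smote"],
)
print(text)
```

The printed text can be appended to an existing INI file such as base.ini.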

The new approach is more maintainable and reproducible, and it performs better for large parameter spaces.