nkululeko.flags
Nkululeko Flags Module Tutorial
The flags module in nkululeko allows you to run multiple experiments with different parameter combinations automatically. This is particularly useful for hyperparameter tuning and systematic exploration of different configurations.
Overview
Instead of manually creating multiple configuration files and running experiments one by one, the flags module enables you to:
Define multiple values for different parameters in a single configuration file
Automatically generate all possible combinations of these parameters
Run all experiments sequentially with optimized feature extraction
Get a summary of results and identify the best performing configuration
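As a rough illustration of what "all possible combinations" means, the following minimal sketch expands a dictionary of flag values into one parameter set per experiment. It uses only the Python standard library; the `flags` dictionary and the `expand_flags` helper are hypothetical names chosen for this example and are not part of the nkululeko API.

```python
from itertools import product

# Hypothetical flag values, mirroring what a [FLAGS] section might contain
flags = {
    "models": ["xgb", "svm"],
    "features": ["os", "praat"],
    "balancing": ["none", "smote"],
}


def expand_flags(flags):
    """Yield one dict per combination of flag values (Cartesian product)."""
    keys = list(flags)
    for values in product(*(flags[key] for key in keys)):
        yield dict(zip(keys, values))


combinations = list(expand_flags(flags))
print(len(combinations))  # 2 * 2 * 2 = 8
print(combinations[0])    # {'models': 'xgb', 'features': 'os', 'balancing': 'none'}
```

Because `expand_flags` is a generator, the combinations are produced one at a time and only materialized here to count them.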
Quick Start Demo
To quickly try the flags module, you can use the provided demo example:
# Navigate to your local clone of the nkululeko repository
cd nkululeko
# Run the flags demo (this will test 2×2×2×2 = 16 combinations)
python -m nkululeko.flags --config examples/exp_flags_demo.ini
This demo uses the test dataset included with nkululeko and will show you how the flags module works with a small, manageable number of experiments.
Basic Usage
1. Configuration File Structure
To use the flags module, you need to add a [FLAGS] section to your standard nkululeko configuration file. Here’s the basic structure:
[EXP]
root = /tmp/results/
name = my_flags_experiment
runs = 1
epochs = 10
[DATA]
databases = ['mydata']
mydata = ./data/mydata.csv
mydata.type = csv
target = emotion
[FLAGS]
models = ['xgb', 'svm', 'mlp']
features = ['os', 'praat', 'mfcc']
balancing = ['none', 'ros', 'smote']
scale = ['none', 'standard', 'robust']
2. Running with Flags
There are several ways to run experiments with flags:
Method 1: Command line with --flags argument
python -m nkululeko.flags --config my_config.ini --flags
Method 2: Automatic detection
If your configuration file contains a [FLAGS] section, nkululeko will automatically detect it:
python -m nkululeko.flags --config my_config.ini
Method 3: Command line parameter override
You can also override specific parameters from the command line:
# Override model and feature types
python -m nkululeko.flags --config my_config.ini --model xgb --feat os
# Specify multiple values for testing
python -m nkululeko.flags --config my_config.ini --model xgb svm --feat "['os', 'praat']"
# Combine with other parameters
python -m nkululeko.flags --config my_config.ini --balancing smote --scale standard --epochs 50
3. Module Selection
The flags module supports different nkululeko modules:
# Standard training (default)
python -m nkululeko.flags --config my_config.ini --mod nkulu
# Testing mode
python -m nkululeko.flags --config my_config.ini --mod test
Supported Flag Parameters
The flags module supports the following parameters:
Core Parameters
models: List of model types to test. Example:
['xgb', 'svm', 'mlp', 'tree']
features: List of feature types to test. Example:
['os', 'praat', 'mfcc', 'hubert', 'wav2vec2']
balancing: List of balancing methods to test. Example:
['none', 'ros', 'smote', 'adasyn']
scale: List of scaling methods to test. Example:
['none', 'standard', 'robust', 'minmax']
Custom Parameters
You can also define custom parameters using prefixes:
model_*: Model-specific parameters (e.g., model_learning_rate)
feats_*: Feature-specific parameters (e.g., feats_set)
exp_*: Experiment-specific parameters (e.g., exp_epochs)
Working Example
Let’s look at a complete working example based on the Polish emotion dataset:
[EXP]
root = /tmp/results/
name = exp_polish_flags1
[DATA]
databases = ['train', 'dev', 'test']
train = ./data/polish/polish_train.csv
train.type = csv
train.absolute_path = False
train.split_strategy = train
dev = ./data/polish/polish_dev.csv
dev.type = csv
dev.absolute_path = False
dev.split_strategy = train
test = ./data/polish/polish_test.csv
test.type = csv
test.absolute_path = False
test.split_strategy = test
target = emotion
[FLAGS]
models = ['xgb', 'svm']
features = ['praat', 'os']
balancing = ['none', 'ros', 'smote']
scale = ['none', 'standard', 'robust', 'minmax']
This configuration will run 2 × 2 × 3 × 4 = 48 experiments testing all combinations of:
2 models (XGBoost, SVM)
2 feature types (Praat, OpenSMILE)
3 balancing methods (none, ROS, SMOTE)
4 scaling methods (none, standard, robust, minmax)
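The count of 48 can be checked independently of nkululeko. The sketch below is an assumption about how such a section could be read, not the module's actual parser: it loads the [FLAGS] section with the standard `configparser`, turns each value into a Python list with `ast.literal_eval`, and multiplies the list lengths.

```python
import ast
import configparser
import math

# The [FLAGS] section from the Polish example, embedded as a string
CONFIG_TEXT = """
[FLAGS]
models = ['xgb', 'svm']
features = ['praat', 'os']
balancing = ['none', 'ros', 'smote']
scale = ['none', 'standard', 'robust', 'minmax']
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# Each raw value is Python list syntax, so literal_eval can parse it safely
flag_values = {key: ast.literal_eval(raw) for key, raw in parser["FLAGS"].items()}

# The number of experiments is the product of the list lengths
n_experiments = math.prod(len(values) for values in flag_values.values())
print(n_experiments)  # 48
```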
Advanced Usage
1. Single Value Parameters
If you want to test a single value, you can specify it as a string without brackets:
[FLAGS]
models = 'xgb'
features = ['os', 'praat']
2. Mixed Data Types
The flags module supports different data types:
[FLAGS]
models = ['xgb', 'mlp']
features = ['os', 'praat']
learning_rate = [0.01, 0.1, 0.5]
epochs = [10, 50, 100]
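To make the single-value and mixed-type cases concrete, here is a minimal sketch of how raw INI values could be normalized into lists of candidates. The `parse_flag_value` helper is hypothetical and only illustrates the idea; how nkululeko parses these values internally is not shown in this tutorial.

```python
import ast


def parse_flag_value(raw):
    """Parse a raw INI value into a list of candidate values."""
    try:
        value = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        value = raw.strip()  # fall back to plain text, e.g. an unquoted xgb
    # A single value becomes a one-element list; lists and tuples are kept
    return list(value) if isinstance(value, (list, tuple)) else [value]


print(parse_flag_value("'xgb'"))             # ['xgb']
print(parse_flag_value("['os', 'praat']"))   # ['os', 'praat']
print(parse_flag_value("[0.01, 0.1, 0.5]"))  # [0.01, 0.1, 0.5]
print(parse_flag_value("[10, 50, 100]"))     # [10, 50, 100]
```

Using `ast.literal_eval` rather than `eval` keeps numbers as numbers and strings as strings without executing arbitrary code from the configuration file.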
3. Handling ‘none’ Values
When you specify 'none' for balancing or scaling, the parameter is left unset in the configuration, so nkululeko falls back to its default behavior:
[FLAGS]
balancing = ['none', 'smote'] # 'none' means no balancing parameter set
scale = ['none', 'standard'] # 'none' means no scaling parameter set
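The 'none' rule can be sketched as follows. The `build_overrides` function is a hypothetical helper, and the placement of `balancing` and `scale` under [FEATS] is an assumption taken from the sample output shown in this tutorial, not from the nkululeko source.

```python
def build_overrides(params):
    """Map one flag combination to hypothetical {section: {key: value}} overrides."""
    overrides = {
        "MODEL": {"type": params["models"]},
        "FEATS": {"type": str([params["features"]])},
    }
    for key in ("balancing", "scale"):
        value = params.get(key, "none")
        if value != "none":  # 'none' means: do not set the key at all
            overrides["FEATS"][key] = value
    return overrides


example = build_overrides(
    {"models": "xgb", "features": "os", "balancing": "none", "scale": "standard"}
)
print(example)
# {'MODEL': {'type': 'xgb'}, 'FEATS': {'type': "['os']", 'scale': 'standard'}}
```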
Understanding the Output
When you run a flags experiment, you’ll see output like this:
Flag parameters found: {'models': ['xgb', 'svm'], 'features': ['praat', 'os'], 'balancing': ['none', 'ros', 'smote'], 'scale': ['none', 'standard', 'robust', 'minmax']}
Running 48 experiment combinations...
Setting up experiment and extracting features (once for all experiments)...
Features extracted once: (1200, 384)
=== Experiment 1/48 ===
Parameters: {'models': 'xgb', 'features': 'praat', 'balancing': 'none', 'scale': 'none'}
Result: 0.67
=== Experiment 2/48 ===
Parameters: {'models': 'xgb', 'features': 'praat', 'balancing': 'none', 'scale': 'standard'}
Result: 0.71
...
=== SUMMARY OF 48 EXPERIMENTS ===
Experiment 1: {'models': 'xgb', 'features': 'praat', 'balancing': 'none', 'scale': 'none'}
Result: 0.67
Experiment 2: {'models': 'xgb', 'features': 'praat', 'balancing': 'none', 'scale': 'standard'}
Result: 0.71
...
=== BEST CONFIGURATION ===
Best Result: 0.84
Best Parameters:
models: svm
features: os
balancing: smote
scale: robust
To use these parameters, set in your config file:
[MODEL]
type = svm
[FEATS]
type = ['os']
balancing = smote
scale = robust
Flags experiments time: 245.67 seconds (4.09 minutes)
DONE
Key Output Elements
Parameter Discovery: Shows all flag parameters found and their values
Feature Extraction Info: Displays the shape of extracted features (once for all experiments)
Individual Results: Each experiment shows its parameters and result score
Comprehensive Summary: Lists all experiments with their outcomes
Best Configuration: Identifies the highest-scoring parameter combination
Usage Instructions: Provides exact configuration syntax for the best parameters
Timing Information: Shows total execution time
Result Interpretation
Result scores represent the test performance metric (typically accuracy for classification)
Higher scores indicate better performance
Feature extraction timing is shown separately since it’s done only once
Failed experiments are marked with ERROR and don’t affect other experiments
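The selection of the best configuration can be pictured with a small sketch. The result list below is illustrative data in a hypothetical format (a parameter dict paired with a score, or `None` for a failed experiment); it is not the structure nkululeko uses internally.

```python
# Illustrative results: (parameters, score); None marks a failed experiment
results = [
    ({"models": "xgb", "features": "praat"}, 0.67),
    ({"models": "xgb", "features": "os"}, None),
    ({"models": "svm", "features": "os"}, 0.84),
]


def best_result(results):
    """Return the (parameters, score) pair with the highest score, skipping failures."""
    scored = [(params, score) for params, score in results if score is not None]
    if not scored:
        return None  # every experiment failed
    return max(scored, key=lambda item: item[1])


best_params, best_score = best_result(results)
print(best_score)   # 0.84
print(best_params)  # {'models': 'svm', 'features': 'os'}
```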
Performance Optimization
The flags module includes several optimizations:
Single Feature Extraction: Features are extracted only once at the beginning and reused across all experiments
Efficient Experiment Creation: Each experiment reuses the base data and features rather than reloading
Memory Optimization: Configurations are generated on-the-fly rather than stored in memory
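One common way to obtain the "extract once, reuse" effect is memoization. The sketch below is an assumption about the general technique, not a description of nkululeko's internals: a stand-in `extract_features` function is cached per feature type with `functools.lru_cache`, so repeated experiments with the same feature type do not trigger a second extraction.

```python
from functools import lru_cache

extraction_calls = []  # records every real (non-cached) extraction


@lru_cache(maxsize=None)
def extract_features(feature_type):
    """Stand-in for an expensive feature extraction step (hypothetical)."""
    extraction_calls.append(feature_type)
    return tuple(len(feature_type) * i for i in range(3))  # dummy feature vector


def run_experiment(model, feature_type):
    features = extract_features(feature_type)  # served from cache after first call
    return f"{model} on {feature_type}: {len(features)} features"


for model in ["xgb", "svm", "mlp"]:
    for feature_type in ["os", "praat"]:
        run_experiment(model, feature_type)

print(extraction_calls)  # ['os', 'praat']
```

Six experiments run here, but the extraction function body executes only twice, once per distinct feature type.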
Best Practices
1. Start Small
Begin with a smaller set of parameters to test the setup:
[FLAGS]
models = ['xgb']
features = ['os', 'praat']
balancing = ['none', 'smote']
2. Consider Computational Cost
Be mindful of the total number of combinations. The number of experiments is the product of the number of values per parameter, so it grows quickly as you add parameters or values:
3 models × 4 features × 3 balancing × 4 scaling = 144 experiments
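Before launching a large run, it can help to estimate its size. The following sketch computes the number of experiments and a rough duration; the `estimate_cost` helper and the figure of 5 seconds per experiment are assumptions for illustration, so substitute a timing measured on your own data.

```python
import math


def estimate_cost(flags, seconds_per_experiment):
    """Return (number of experiments, estimated total minutes)."""
    n_experiments = math.prod(len(values) for values in flags.values())
    return n_experiments, n_experiments * seconds_per_experiment / 60


flags = {
    "models": ["xgb", "svm", "mlp"],
    "features": ["os", "praat", "mfcc", "hubert"],
    "balancing": ["none", "ros", "smote"],
    "scale": ["none", "standard", "robust", "minmax"],
}

n_experiments, minutes = estimate_cost(flags, seconds_per_experiment=5.0)
print(n_experiments)  # 144
print(minutes)        # 12.0
```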
3. Use Meaningful Parameter Combinations
Not all parameter combinations make sense. For example, some scaling methods might not be beneficial for certain feature types.
4. Monitor Resources
Large flag experiments can be resource-intensive. Monitor CPU, memory, and disk usage, especially when working with large datasets.
Troubleshooting
Common Issues
No FLAGS section error: Ensure your configuration file has a [FLAGS] section
Invalid parameter format: Use proper Python list syntax with quotes: ['item1', 'item2']
Missing required sections: Ensure your config file has all required sections (EXP, DATA, etc.)
Error Handling
The flags module includes error handling for individual experiments. If one experiment fails, others will continue, and you’ll see error information in the summary.
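The behavior described above, where one failing experiment does not stop the rest, follows a familiar pattern. The sketch below shows that pattern in isolation; `run_all` and `fake_run` are hypothetical names, and the error message is invented for the example.

```python
def run_all(combinations, run_one):
    """Run every combination; record an error instead of aborting on failure."""
    results = []
    for params in combinations:
        try:
            results.append((params, run_one(params), None))
        except Exception as exc:  # keep going so later experiments still run
            results.append((params, None, f"ERROR: {exc}"))
    return results


def fake_run(params):
    """Stand-in for a real experiment; fails for one model name on purpose."""
    if params["models"] == "broken":
        raise ValueError("unknown model type")
    return 0.5


results = run_all(
    [{"models": "xgb"}, {"models": "broken"}, {"models": "svm"}], fake_run
)
for params, score, error in results:
    print(params, score if error is None else error)
```

All three entries appear in the results, with the second one carrying the error text instead of a score.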
Integration with Other Modules
The flags module can be combined with other nkululeko features:
With Testing Module
python -m nkululeko.flags --config my_config.ini --mod test --flags
With Custom Modules
The flags module supports different nkululeko modules through the --mod parameter:
nkulu: Standard training (default)
test: Testing mode
Example Workflows
1. Feature Type Comparison
[FLAGS]
models = ['xgb']
features = ['os', 'praat', 'mfcc', 'hubert']
balancing = ['none']
scale = ['standard']
2. Model Selection
[FLAGS]
models = ['xgb', 'svm', 'mlp', 'tree']
features = ['os']
balancing = ['smote']
scale = ['standard']
3. Preprocessing Optimization
[FLAGS]
models = ['xgb']
features = ['os']
balancing = ['none', 'ros', 'smote', 'adasyn']
scale = ['none', 'standard', 'robust', 'minmax']
The flags module is a powerful tool for systematic experimentation in nkululeko, helping you find optimal configurations efficiently while maintaining reproducibility and comprehensive result tracking.
Migration from nkuluflag
If you’ve used the legacy nkuluflag module, here are the key differences and migration steps:
Key Improvements in flags module:
Configuration-based: Define parameters in INI files instead of command line only
Optimized performance: Features extracted once and reused across experiments
Better error handling: Failed experiments don’t stop the entire process
Comprehensive output: Clear summary with best configuration identification
Flexible parameters: Support for custom parameter types and prefixes
Migration Steps:
Old nkuluflag approach:
python -m nkululeko.nkuluflag --config base.ini --model xgb svm --feat os praat --balancing none smote
New flags approach:
Add a [FLAGS] section to your INI file:
[FLAGS]
models = ['xgb', 'svm']
features = ['os', 'praat']
balancing = ['none', 'smote']
Run with the flags module:
python -m nkululeko.flags --config base.ini
The new approach is more maintainable, reproducible, and provides better performance for large parameter spaces.
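If you have many base configurations to migrate, the [FLAGS] section can also be added programmatically. This is a minimal sketch using the standard `configparser` module; `add_flags_section` is a hypothetical helper, and the base configuration text is a placeholder. Note that `configparser` does not preserve comments when it rewrites a file.

```python
import configparser
import io


def add_flags_section(config_text, flags):
    """Return the INI text with a [FLAGS] section built from a dict of lists."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Store each list in Python list syntax, as used in the examples above
    parser["FLAGS"] = {key: str(values) for key, values in flags.items()}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


base_text = "[EXP]\nname = base\n"  # placeholder base configuration
migrated = add_flags_section(
    base_text,
    {
        "models": ["xgb", "svm"],
        "features": ["os", "praat"],
        "balancing": ["none", "smote"],
    },
)
print(migrated)
```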