Feature Scaling in Nkululeko
Feature scaling is a crucial preprocessing step in machine learning that standardizes the range of features to improve model performance and convergence. The Nkululeko framework provides a comprehensive Scaler class that offers multiple scaling strategies to normalize speech features.
Overview
The Scaler class in Nkululeko (nkululeko/scaler.py) handles feature normalization across training, development, and test sets. It ensures that:
Features are scaled consistently across all datasets
The scaling parameters are learned only from the training set
Different scaling strategies can be applied based on data characteristics
Speaker-specific normalization is supported
Available Scaling Methods
1. Standard Scaling (standard)
Z-score normalization - transforms features to have zero mean and unit variance.
[FEATS]
scale = standard
Formula: (x - mean) / std
Use case: Most commonly used method, works well when features follow a normal distribution.
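As a rough illustration of the formula, the sketch below fits the mean and standard deviation on training values only and reuses them for the test values. It uses only the Python standard library; the helper names `fit_standard` and `apply_standard` are made up for this example and are not part of Nkululeko. The population standard deviation is assumed here.

```python
import statistics


def fit_standard(train_column):
    """Learn mean and population standard deviation from training values only."""
    mean = statistics.fmean(train_column)
    std = statistics.pstdev(train_column)
    # Guard against constant features, where the deviation is zero.
    return mean, (std if std > 0 else 1.0)


def apply_standard(values, mean, std):
    """Apply (x - mean) / std with previously fitted parameters."""
    return [(x - mean) / std for x in values]


train = [2.0, 4.0, 6.0, 8.0]
test = [5.0, 10.0]

mean, std = fit_standard(train)                # parameters come from train only
train_scaled = apply_standard(train, mean, std)
test_scaled = apply_standard(test, mean, std)  # test reuses the train parameters

print(train_scaled)
print(test_scaled)
```

The scaled training values have zero mean and unit variance; the test values do not necessarily, because they are mapped with the training parameters.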
2. Robust Scaling (robust)
Robust to outliers - uses median and interquartile range instead of mean and standard deviation.
[FEATS]
scale = robust
Formula: (x - median) / IQR
Use case: Recommended when features contain outliers that could skew standard scaling.
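A minimal sketch of the median/IQR formula, assuming quartiles computed by linear interpolation (the `inclusive` method of Python's `statistics.quantiles`). The helpers `fit_robust` and `apply_robust` are hypothetical names for this example, not Nkululeko functions.

```python
import statistics


def fit_robust(train_column):
    """Learn median and interquartile range (IQR) from training values only."""
    q1, median, q3 = statistics.quantiles(train_column, n=4, method="inclusive")
    iqr = q3 - q1
    # Guard against a zero IQR for (nearly) constant features.
    return median, (iqr if iqr > 0 else 1.0)


def apply_robust(values, median, iqr):
    """Apply (x - median) / IQR with previously fitted parameters."""
    return [(x - median) / iqr for x in values]


train = [1.0, 2.0, 3.0, 4.0, 100.0]  # 100.0 is an outlier

median, iqr = fit_robust(train)
scaled = apply_robust(train, median, iqr)

# For contrast: the mean is pulled toward the outlier, the median is not.
mean = statistics.fmean(train)
print(median, iqr, mean)
print(scaled)
```

The four regular values end up in a narrow band around zero, while the outlier stays far away instead of compressing everything else.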
3. Min-Max Scaling (minmax)
Range normalization - scales features to a fixed range [0, 1].
[FEATS]
scale = minmax
Formula: (x - min) / (max - min)
Use case: When you need features bounded to a specific range, especially for neural networks.
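The following sketch applies the min-max formula with parameters learned from the training values. It also shows a consequence worth keeping in mind: test values outside the training range fall outside [0, 1]. The helper names are illustrative only.

```python
def fit_minmax(train_column):
    """Learn the minimum and maximum from training values only."""
    return min(train_column), max(train_column)


def apply_minmax(values, lo, hi):
    """Apply (x - min) / (max - min) with previously fitted parameters."""
    span = (hi - lo) if hi > lo else 1.0  # guard against constant features
    return [(x - lo) / span for x in values]


train = [10.0, 20.0, 30.0]
test = [15.0, 40.0]  # 40.0 lies above the training maximum

lo, hi = fit_minmax(train)
train_scaled = apply_minmax(train, lo, hi)  # [0.0, 0.5, 1.0]
test_scaled = apply_minmax(test, lo, hi)    # [0.25, 1.5]

print(train_scaled)
print(test_scaled)
```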
4. Max-Abs Scaling (maxabs)
Absolute maximum scaling - scales by the maximum absolute value.
[FEATS]
scale = maxabs
Formula: x / max(|x|)
Use case: Preserves sparsity in sparse datasets and handles both positive and negative values.
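A short sketch of the max-abs formula, assuming the maximum absolute value is learned from the training values. Because the data are only divided and never shifted, zeros stay zeros and signs are kept. `fit_maxabs` and `apply_maxabs` are hypothetical helper names.

```python
def fit_maxabs(train_column):
    """Learn the largest absolute value from training values only."""
    return max(abs(x) for x in train_column)


def apply_maxabs(values, max_abs):
    """Apply x / max(|x|) with a previously fitted maximum."""
    scale = max_abs if max_abs > 0 else 1.0  # guard against all-zero features
    return [x / scale for x in values]


train = [-4.0, 0.0, 0.0, 2.0, 0.0]

max_abs = fit_maxabs(train)
scaled = apply_maxabs(train, max_abs)

# Count zero entries before and after to check that sparsity is preserved.
zeros_before = sum(1 for x in train if x == 0)
zeros_after = sum(1 for x in scaled if x == 0)

print(scaled)
print(zeros_before, zeros_after)
```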
5. Normalizer (normalizer)
L2 normalization - scales individual samples to have unit norm.
[FEATS]
scale = normalizer
Use case: When the direction of the data vector is more important than the magnitude.
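Unlike the other methods, L2 normalization works per sample (row) rather than per feature (column), so nothing is learned from the training set. The sketch below divides each sample by its Euclidean norm; `l2_normalize` is an illustrative helper, not a Nkululeko function.

```python
import math


def l2_normalize(sample):
    """Scale one feature vector to unit Euclidean (L2) norm."""
    norm = math.sqrt(sum(x * x for x in sample))
    if norm == 0.0:
        return list(sample)  # an all-zero vector has no direction; leave it as is
    return [x / norm for x in sample]


samples = [
    [3.0, 4.0],
    [30.0, 40.0],  # same direction as the first sample, ten times the magnitude
    [0.0, 0.0],
]

normalized = [l2_normalize(s) for s in samples]
for row in normalized:
    print(row)
```

The first two samples map to the same unit vector, which is why this method suits cases where only the direction matters.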
6. Power Transformer (powertransformer)
Gaussian-like transformation - applies power transformations to make data more Gaussian.
[FEATS]
scale = powertransformer
Use case: When features have skewed distributions and you want to make them more normal.
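The text above does not say which power transformation is used. As an assumption, the sketch below implements the Yeo-Johnson transform, which is the default of scikit-learn's PowerTransformer. A real transformer estimates the parameter lambda per feature from the training data; here lambda is fixed by hand to keep the example short.

```python
import math


def yeo_johnson(x, lam):
    """Yeo-Johnson power transform of a single value for a given lambda."""
    if x >= 0:
        if abs(lam) < 1e-12:
            return math.log1p(x)
        return ((x + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < 1e-12:
        return -math.log1p(-x)
    return -(((-x + 1.0) ** (2.0 - lam)) - 1.0) / (2.0 - lam)


skewed = [0.0, 1.0, 2.0, 4.0, 100.0]  # long right tail

# lambda = 0 behaves like log(x + 1) for non-negative values and pulls the tail in.
compressed = [yeo_johnson(x, 0.0) for x in skewed]

# lambda = 1 leaves the values unchanged.
unchanged = [yeo_johnson(x, 1.0) for x in skewed]

print(compressed)
print(unchanged)
```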
7. Quantile Transformer (quantiletransformer)
Uniform/Gaussian mapping - maps features to uniform or Gaussian distribution.
[FEATS]
scale = quantiletransformer
Use case: When you want to reduce the impact of outliers and enforce a specific distribution.
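To illustrate the idea of mapping to a uniform distribution, the sketch below replaces each value by its position in the sorted training data (an empirical CDF). This is a simplified stand-in: library implementations typically interpolate between a fixed number of quantiles. The function names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def fit_quantile(train_column):
    """Keep the sorted training values as the reference distribution."""
    return sorted(train_column)


def to_uniform(value, sorted_train):
    """Map a value to [0, 1] by its rank among the training values."""
    n = len(sorted_train)
    left = bisect_left(sorted_train, value)
    right = bisect_right(sorted_train, value)
    # Averaging both ranks centres each training value within its rank interval.
    return (left + right) / (2 * n)


train = [1.0, 2.0, 3.0, 4.0, 1000.0]  # 1000.0 is an extreme outlier
reference = fit_quantile(train)

train_mapped = [to_uniform(x, reference) for x in train]
test_mapped = [to_uniform(x, reference) for x in [2.5, 5000.0]]

print(train_mapped)
print(test_mapped)
```

The outlier lands at a rank position like any other value, and even a far larger unseen value cannot exceed 1.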
8. Binning (bins)
Categorical binning - converts continuous features into three categorical bins.
[FEATS]
scale = bins
Output: Features are converted to strings: “0” (low), “0.5” (medium), “1” (high)
Thresholds: 33rd and 66th percentiles of the training data
Use case: When you want to discretize continuous features for tree-based models or categorical analysis.
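A minimal sketch of the binning described above: thresholds are taken at the 33rd and 66th percentiles of the training values, and each value becomes one of the strings "0", "0.5", or "1". The helper names and the handling of values that fall exactly on a threshold are assumptions for this example.

```python
import statistics


def fit_bins(train_column):
    """Learn the 33rd and 66th percentile thresholds from training values only."""
    cuts = statistics.quantiles(train_column, n=100, method="inclusive")
    return cuts[32], cuts[65]  # cuts[k - 1] is the k-th percentile


def to_bin(x, low_cut, high_cut):
    """Map a value to "0" (low), "0.5" (medium), or "1" (high)."""
    if x < low_cut:
        return "0"
    if x < high_cut:  # assumption: a value equal to a threshold goes to the upper bin
        return "0.5"
    return "1"


train = [float(i) for i in range(1, 101)]
low_cut, high_cut = fit_bins(train)

test = [10.0, 50.0, 90.0]
binned = [to_bin(x, low_cut, high_cut) for x in test]

print(low_cut, high_cut)
print(binned)
```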
9. Speaker-wise Scaling (speaker)
Per-speaker normalization - applies standard scaling individually for each speaker.
[FEATS]
scale = speaker
Use case: When speaker-specific characteristics should be normalized, useful for speaker-independent emotion recognition.
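The sketch below shows the idea of per-speaker standardization on a single feature: values are grouped by speaker, and each group is z-scored with its own mean and standard deviation. It assumes that statistics are computed from each speaker's own samples; the function name mirrors the concept but is not Nkululeko's implementation.

```python
import statistics
from collections import defaultdict


def speaker_scale(values, speakers):
    """Z-score each value using the statistics of its own speaker."""
    by_speaker = defaultdict(list)
    for value, speaker in zip(values, speakers):
        by_speaker[speaker].append(value)

    stats = {}
    for speaker, group in by_speaker.items():
        std = statistics.pstdev(group)
        stats[speaker] = (statistics.fmean(group), std if std > 0 else 1.0)

    return [(v - stats[s][0]) / stats[s][1] for v, s in zip(values, speakers)]


# Hypothetical mean pitch values (Hz) for two speakers with different ranges.
pitch = [100.0, 110.0, 120.0, 200.0, 220.0, 240.0]
speakers = ["spk_a", "spk_a", "spk_a", "spk_b", "spk_b", "spk_b"]

scaled = speaker_scale(pitch, speakers)
print(scaled)
```

After scaling, both speakers occupy the same value range, so what remains is the variation within each speaker rather than the difference between them.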
Configuration
Quick Start Demo
To quickly test scaling techniques, you can use the provided demo example:
# From a local clone of the repository, change into its root directory
cd nkululeko
# Run a single scaling demo with standard scaling
python -m nkululeko.nkululeko --config examples/exp_scaling_demo.ini
# Or run all scaling methods systematically
bash scripts/run_scaler_experiments.sh
The systematic script will test all 9 scaling methods and provide a comprehensive comparison of their performance on your dataset.
Basic Configuration
Add the scaling configuration to the [FEATS] section of your INI file:
[FEATS]
type = ['os'] # Feature type
set = eGeMAPSv02 # Feature set
scale = standard # Scaling method
Advanced Configuration Examples
Robust scaling with OpenSMILE features:
[FEATS]
type = ['os']
set = ComParE_2016
level = functionals
scale = robust
Min-max scaling for neural networks:
[FEATS]
type = ['spectra']
scale = minmax
[MODEL]
type = cnn
Speaker-wise normalization:
[FEATS]
type = ['os']
scale = speaker
[DATA]
# Ensure speaker information is available
target = emotion
Binning for tree-based models:
[FEATS]
type = ['os']
scale = bins
[MODEL]
type = xgb
Usage Examples
Complete Experiment Configuration
[EXP]
root = ./experiments/
name = emotion_recognition_robust_scaling
type = classification
[DATA]
databases = ['emodb']
emodb = /path/to/emodb
target = emotion
labels = ['anger', 'happiness', 'neutral', 'sadness']
[FEATS]
type = ['os']
set = eGeMAPSv02
level = functionals
scale = robust # Using robust scaling for outlier resistance
[MODEL]
type = svm
C_val = 1.0
kernel = rbf
Comparing Different Scaling Methods
You can compare different scaling methods using the automated script or manually:
Automated Comparison (Recommended)
# Run all scaling methods on your dataset
bash scripts/run_scaler_experiments.sh
This script will:
Test all 9 scaling methods automatically
Generate individual configuration files
Run experiments with consistent settings
Provide a summary comparison of results
Clean up temporary files
Manual Comparison
You can also compare different scaling methods by running separate experiments:
Experiment 1: Standard scaling
[EXP]
name = emotion_standard_scaling
[FEATS]
scale = standard
Experiment 2: Robust scaling
[EXP]
name = emotion_robust_scaling
[FEATS]
scale = robust
Experiment 3: Min-max scaling
[EXP]
name = emotion_minmax_scaling
[FEATS]
scale = minmax
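Writing the per-method configurations by hand gets tedious. As a sketch, they can be generated from one base configuration with Python's `configparser`. The base values below are placeholders taken from the examples in this document, and `make_config` is a hypothetical helper; a real experiment would also need the [DATA] and [MODEL] sections.

```python
import configparser
import io

BASE = """
[EXP]
root = ./experiments/
name = emotion
[FEATS]
type = ['os']
scale = standard
"""


def make_config(method):
    """Return INI text for one experiment that uses the given scaling method."""
    cfg = configparser.ConfigParser()
    cfg.read_string(BASE)
    cfg["EXP"]["name"] = f"emotion_{method}_scaling"
    cfg["FEATS"]["scale"] = method
    buffer = io.StringIO()
    cfg.write(buffer)
    return buffer.getvalue()


# One configuration text per scaling method to compare.
configs = {m: make_config(m) for m in ["standard", "robust", "minmax"]}
print(configs["robust"])
```

Each text can then be saved to its own INI file and run as a separate experiment.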
Using the FLAGS Module for Comparison
For systematic comparison within a single run:
[EXP]
root = ./results/scaling_comparison/
name = comprehensive_scaling_study
[DATA]
databases = ['mydata']
mydata = ./data/mydata.csv
target = emotion
[FEATS]
type = ['os']
[MODEL]
type = ['xgb']
[FLAGS]
scale = ['standard', 'robust', 'minmax', 'maxabs', 'normalizer', 'powertransformer', 'quantiletransformer', 'bins']
Understanding Scaling Results
When you run the scaling experiments script, you’ll see output like this:
Starting scaling experiments...
===============================
Current directory: /path/to/nkululeko
Examples path: ./examples
Results path: ./examples/results
Checking data availability...
✓ Polish dataset found - using full dataset
Running experiment with scaling method: standard
=================================================
Config file created: ./examples/results/temp_scaling_configs/exp_scaling_standard.ini
Starting experiment...
✓ SUCCESS: standard scaling completed
Result: best result: 0.75
Running experiment with scaling method: robust
===============================================
...
========================================
All scaling experiments completed!
Success: 9/9
========================================
Quick Results Comparison:
========================
standard : 0.75
robust : 0.78
minmax : 0.72
maxabs : 0.74
normalizer : 0.69
powertransformer : 0.76
quantiletransformer : 0.77
bins : 0.71
speaker : 0.73
Interpreting Results
Higher scores indicate better performance (accuracy for classification)
Robust scaling often performs well with real-world audio data due to outlier resistance
Standard scaling is a reliable baseline
Bins scaling may show different results as it converts to categorical features
Speaker scaling is useful when speaker variability is a concern
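The "Quick Results Comparison" block is easy to post-process. The sketch below parses lines of the form `name : score` and ranks the methods; the numbers are copied from the sample output above, and `parse_summary` is a hypothetical helper that is not shipped with the script.

```python
SUMMARY = """\
standard : 0.75
robust : 0.78
minmax : 0.72
maxabs : 0.74
normalizer : 0.69
powertransformer : 0.76
quantiletransformer : 0.77
bins : 0.71
speaker : 0.73
"""


def parse_summary(text):
    """Parse 'name : score' lines into a dict, skipping anything else."""
    results = {}
    for line in text.splitlines():
        name, separator, score = line.partition(":")
        if not separator:
            continue
        try:
            results[name.strip()] = float(score)
        except ValueError:
            continue  # not a result line
    return results


results = parse_summary(SUMMARY)
ranking = sorted(results.items(), key=lambda item: item[1], reverse=True)
best = ranking[0][0]

for name, score in ranking:
    print(f"{name:20s} {score:.2f}")
```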
Result Files
The script generates several output files:
scaling_experiments_summary.txt: Complete summary with timestamps and method descriptions
Individual log files: exp_scaling_[method].log for detailed experiment logs
Result plots (if configured): Visual comparisons of scaling effects
Best Practices
1. Choosing the Right Scaling Method
| Data Characteristics | Recommended Scaler | Reason |
|---|---|---|
| Normal distribution, few outliers | standard | Classical z-score normalization |
| Contains outliers | robust | Uses median/IQR, less sensitive to outliers |
| Need bounded range [0,1] | minmax | Explicit range control |
| Sparse data | maxabs | Preserves sparsity |
| Skewed distributions | powertransformer | Makes data more Gaussian |
| Many outliers | quantiletransformer | Robust distribution mapping |
| Tree-based models | bins | Can improve interpretability |
| Speaker variability | speaker | Normalizes per-speaker differences |
2. Neural Network Considerations
For neural networks, consider:
minmax for bounded inputs
standard for well-behaved distributions
Avoid bins, as neural networks work better with continuous features
3. SVM Considerations
SVMs benefit from scaled features:
standard or robust are typically good choices
minmax ensures all features contribute equally
4. Tree-based Model Considerations
Tree-based models (XGBoost, Random Forest) are generally scale-invariant:
Scaling may not be necessary
bins can improve interpretability
Standard scaling doesn’t hurt and may help with some implementations
5. Cross-database Experiments
When working with multiple databases:
Ensure consistent scaling across databases
robust or quantiletransformer may be more stable across different recording conditions
API Reference
Scaler Class
class Scaler:
    """Class to normalize speech features."""

    def __init__(self, train_data_df, test_data_df, train_feats, test_feats,
                 scaler_type, dev_x=None, dev_y=None):
        """
        Initialize the scaler.

        Parameters:
        -----------
        train_data_df : pd.DataFrame
            Training dataframe with speaker information (needed for speaker scaling)
        test_data_df : pd.DataFrame
            Test dataframe with speaker information
        train_feats : pd.DataFrame
            Training features dataframe
        test_feats : pd.DataFrame
            Test features dataframe
        scaler_type : str
            Type of scaling: 'standard', 'robust', 'minmax', 'maxabs',
            'normalizer', 'powertransformer', 'quantiletransformer', 'bins', 'speaker'
        dev_x : pd.DataFrame, optional
            Development data dataframe
        dev_y : pd.DataFrame, optional
            Development features dataframe
        """

    def scale(self):
        """
        Scale features based on the configured scaling method.

        Returns:
        --------
        tuple
            (train_scaled, test_scaled) or (train_scaled, dev_scaled, test_scaled)
        """

    def scale_all(self):
        """Scale all datasets using the configured scaler."""

    def speaker_scale(self):
        """Apply speaker-wise scaling."""

    def bin_to_three(self):
        """Convert features to three bins: low, medium, high."""
Key Methods
scale()
Main method that applies the selected scaling strategy.
scale_all()
Handles scaling for non-speaker-specific methods.
speaker_scale()
Applies scaling per speaker for speaker-wise normalization.
bin_to_three()
Implements the binning strategy, converting continuous features to categorical bins.
Return Values
The scaler returns scaled DataFrames in the same format as the input:
Same indices as input features
Same column names as input features
Scaled/transformed values according to the selected method
For the bins method, values are returned as strings: “0”, “0.5”, “1”.
Error Handling
The scaler reports configuration errors explicitly:
# Invalid scaler type
scaler = Scaler(..., scaler_type="invalid")
# Raises: ValueError with message about unknown scaler
# Missing speaker information for speaker scaling
# Will raise appropriate error if speaker column is missing
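A configuration can also be checked before an experiment is started. The sketch below validates a scaler name against the nine methods listed in this document. `VALID_SCALERS` and `check_scaler_type` are hypothetical and not part of the Nkululeko API, and the error message is only an example.

```python
VALID_SCALERS = (
    "standard", "robust", "minmax", "maxabs", "normalizer",
    "powertransformer", "quantiletransformer", "bins", "speaker",
)


def check_scaler_type(scaler_type):
    """Return the scaler name unchanged, or raise ValueError if it is unknown."""
    if scaler_type not in VALID_SCALERS:
        raise ValueError(
            f"unknown scaler: {scaler_type!r}; "
            f"expected one of: {', '.join(VALID_SCALERS)}"
        )
    return scaler_type


check_scaler_type("robust")  # passes silently

try:
    check_scaler_type("invalid")
except ValueError as err:
    message = str(err)
    print(message)
```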
Integration with Nkululeko Pipeline
The scaler is automatically integrated into the Nkululeko pipeline:
Features are extracted according to the [FEATS] configuration
The scaler is applied if the scale parameter is specified
Scaled features are passed to the model for training/testing
No manual intervention is required - just specify the scaling method in your INI file.
Script Usage and Examples
Running the Scaling Experiments Script
The run_scaler_experiments.sh script provides an automated way to test all scaling methods:
# From nkululeko root directory
bash scripts/run_scaler_experiments.sh
# From scripts directory
cd scripts
bash run_scaler_experiments.sh
Script Features
Automatic dataset detection: Uses Polish dataset if available, falls back to test dataset
Dynamic configuration: Creates temporary config files for each scaling method
Comprehensive logging: Individual log files for each experiment
Results summary: Consolidated summary with performance comparison
Error handling: Continues with other methods if one fails
Cleanup: Removes temporary files after completion
Script Output Files
| File | Description |
|---|---|
| scaling_experiments_summary.txt | Main summary with all results and timestamps |
| exp_scaling_[method].log | Detailed log for each scaling method |
| | Model outputs and plots (if save=True) |
Customizing the Script
You can modify the script to:
Change the dataset: Edit the config creation functions
Add custom scaling methods: Extend the scaling_methods array
Modify experiment parameters: Update epochs, runs, or model type
Change feature types: Modify the [FEATS] section in config templates
Example customization for different features:
# Edit the create_scaling_config function to use different features
[FEATS]
type = ['praat'] # Instead of ['os']
scale = ${method}
For more information about feature extraction and model configuration, see: