Equal Error Rate (EER) Implementation for Nkululeko
Overview
This implementation adds Equal Error Rate (EER) as a metric for binary classification tasks in nkululeko, addressing issue #15 (https://github.com/bagustris/nkululeko/issues/15). EER is particularly useful for biometric systems and for deepfake detection tasks such as those built on the Fake-or-Real (FoR) dataset.
What is EER?
Equal Error Rate (EER) is the point where the False Acceptance Rate (FAR) equals the False Rejection Rate (FRR) in a binary classification system. It’s commonly used in:
Biometric authentication systems
Speaker verification
Deepfake/synthetic speech detection
Security systems
Lower EER values indicate better performance (range: 0-1, where 0 is perfect).
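The definition above can be made concrete with a small sketch. It assumes labels encoded as 1 (positive/genuine) and 0 (negative/impostor) and scores where higher means "more likely positive"; the helper name far_frr and the toy data are illustrative only and are not part of nkululeko.

```python
def far_frr(labels, scores, threshold):
    """Return (FAR, FRR) for one decision threshold.

    Assumed encoding: 1 = positive/genuine, 0 = negative/impostor.
    A sample is accepted when its score is >= threshold.
    """
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    positives = [s for y, s in zip(labels, scores) if y == 1]
    # FAR: fraction of negatives that are wrongly accepted.
    far = sum(s >= threshold for s in negatives) / len(negatives)
    # FRR: fraction of positives that are wrongly rejected.
    frr = sum(s < threshold for s in positives) / len(positives)
    return far, frr


# Toy data, invented for illustration.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.4, 0.7, 0.3, 0.6, 0.8, 0.9]

for threshold in (0.25, 0.5, 0.75):
    far, frr = far_frr(labels, scores, threshold)
    print(f"threshold={threshold:.2f}  FAR={far:.2f}  FRR={frr:.2f}")
```

Raising the threshold lowers FAR and raises FRR. On this toy data the two rates meet at threshold 0.5, where both equal 0.25, so the EER is 0.25.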
Implementation Details
Files Modified
nkululeko/reporting/reporter.py
Added equal_error_rate() function
Updated _set_metric() to support EER
Modified _get_test_result() to calculate EER
Updated __init__() to compute both EER and UAR when EER is selected
Modified plotting functions to display both EER and UAR
ini_file.md
Updated documentation for the measure parameter
Added EER as an option for classification tasks
New Files Created
tests/test_eer.py
Unit tests for EER calculation
Validates EER with different classification scenarios
data/for-2sec/exp_eer.ini
Example configuration using EER metric
Demonstrates usage with FoR-2sec deepfake detection dataset
Usage
In Configuration Files
Add the following to your INI file’s [MODEL] section:
[MODEL]
# type can be any supported classifier
type = xgb
measure = eer
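As a quick sanity check, the [MODEL] section can be read with Python's standard configparser module. The sketch below is illustrative and makes no claim about how nkululeko constructs its own parser; it only shows a documented property of configparser, namely that inline comments are kept as part of the value unless inline_comment_prefixes is set, which is why a comment is safest on its own line.

```python
import configparser

# Illustrative config text; a real experiment file has more sections.
ini_text = """
[MODEL]
type = xgb # or any classifier
measure = eer
"""

# Default parser: the inline comment stays in the value.
default_parser = configparser.ConfigParser()
default_parser.read_string(ini_text)
print(repr(default_parser["MODEL"]["type"]))  # 'xgb # or any classifier'

# Parser that explicitly enables '#' inline comments.
lenient_parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
lenient_parser.read_string(ini_text)
print(repr(lenient_parser["MODEL"]["type"]))  # 'xgb'

measure = lenient_parser["MODEL"]["measure"]
print(measure)  # eer
```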
Example
See data/for-2sec/exp_eer.ini for a complete example:
python -m nkululeko.nkululeko --config data/for-2sec/exp_eer.ini
Key Features
Dual Reporting: When EER is selected as the measure, both EER and UAR are reported
Confidence Intervals: EER is calculated with bootstrap confidence intervals (same as other metrics)
Probability-Based: EER uses class probabilities when available for accurate calculation
Fallback Handling: Gracefully handles cases where probabilities are not available
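To illustrate the confidence-interval feature, here is a minimal sketch of a percentile bootstrap for a score-based metric. It is an assumption-laden illustration, not the code in reporter.py: the function names, the number of resamples, and the stand-in metric are all hypothetical, and any metric with the signature metric(labels, scores), such as an EER function, could be passed instead.

```python
import random


def bootstrap_ci(labels, scores, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for metric(labels, scores)."""
    rng = random.Random(seed)
    n = len(labels)
    values = []
    for _ in range(n_boot):
        # Resample sample indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        # A resample containing a single class cannot yield a two-class metric.
        if len(set(y)) < 2:
            continue
        values.append(metric(y, [scores[i] for i in idx]))
    if not values:
        raise ValueError("no usable bootstrap resamples")
    values.sort()
    lower = values[int((alpha / 2) * (len(values) - 1))]
    upper = values[int((1 - alpha / 2) * (len(values) - 1))]
    return lower, upper


def error_rate_at_half(labels, scores):
    """Stand-in metric: fraction of samples misclassified at threshold 0.5."""
    wrong = sum((s >= 0.5) != (y == 1) for y, s in zip(labels, scores))
    return wrong / len(labels)


# Toy data, invented for illustration.
labels = [0, 0, 0, 0, 1, 1, 1, 1] * 5
scores = [0.1, 0.2, 0.4, 0.7, 0.3, 0.6, 0.8, 0.9] * 5
low, high = bootstrap_ci(labels, scores, error_rate_at_half)
print(f"95% CI: [{low:.3f}, {high:.3f}]")
```

Fixing the seed keeps the interval reproducible between runs. Resamples that contain only one class are skipped because FAR and FRR are undefined without both classes.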
Output Format
When using EER, the output will show:
Confusion matrix result for epoch: 0, EER: 0.123, (+-0.015/0.018), UAR: 0.876, ACC: 0.892
Confusion matrix plots will display both EER and UAR in the title.
Testing
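Run the unit tests from the repository root, replacing the PYTHONPATH entry with the path to your local nkululeko checkout: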
PYTHONPATH=/home/bagus/github/nkululeko:$PYTHONPATH python tests/test_eer.py
Technical Notes
EER requires binary classification (2 classes)
Best used with models that output probabilities (SVM with probability=True, XGBoost, neural networks)
The implementation finds the threshold where FAR = FRR by minimizing |FAR - FRR|
Returns the average of FAR and FRR at the optimal threshold
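The threshold search described in these notes can be sketched as follows. This is a simplified illustration under stated assumptions (labels are 1 for positive and 0 for negative, higher scores mean "more positive", every distinct score is tried as a threshold); the name equal_error_rate_sketch is hypothetical and this is not the equal_error_rate() function in reporter.py.

```python
def equal_error_rate_sketch(labels, scores):
    """Approximate EER by scanning candidate thresholds.

    Returns (eer, threshold), where eer is the mean of FAR and FRR at the
    threshold that minimizes |FAR - FRR|.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("EER needs at least one sample of each class")

    best = None  # (|FAR - FRR|, eer, threshold)
    # Every distinct score is a candidate threshold; one extra value above
    # the maximum covers the "reject everything" operating point.
    candidates = sorted(set(scores)) + [max(scores) + 1.0]
    for threshold in candidates:
        far = sum(s >= threshold for s in negatives) / len(negatives)
        frr = sum(s < threshold for s in positives) / len(positives)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


# Toy data, invented for illustration.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.4, 0.7, 0.3, 0.6, 0.8, 0.9]
eer, threshold = equal_error_rate_sketch(labels, scores)
print(f"EER={eer:.3f} at threshold {threshold:.2f}")  # EER=0.250 at threshold 0.60
```

Averaging FAR and FRR matters when no threshold makes the two rates exactly equal, which is common with small test sets: the scan then settles on the closest operating point and reports the midpoint of the two rates.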
References
TorchMetrics EER: https://lightning.ai/docs/torchmetrics/stable/classification/eer.html
Issue #15: https://github.com/bagustris/nkululeko/issues/15
Future Enhancements
Potential improvements:
Support for multi-class EER (one-vs-rest)
ROC curve plotting with EER threshold marked
DET (Detection Error Tradeoff) curve visualization