# Nkululeko Overview Nkululeko is a Python-based framework designed for speaker characteristics detection from audio data. Its primary purpose is to enable researchers and engineers to train, evaluate, and deploy machine learning models for detecting characteristics such as emotion, age, gender, or speech disorders from audio samples. ## Purpose and Capabilities The framework provides a high-level interface that allows users with limited programming experience to configure and run experiments through INI configuration files rather than writing extensive code. It enables users to: 1. Load and preprocess audio data from various sources 2. Extract a wide range of acoustic features using multiple techniques 3. Train different machine learning models for classification or regression tasks 4. Evaluate model performance using appropriate metrics 5. Generate reports and visualizations — confusion matrices, per-class text reports, and prediction CSVs 6. **Test a trained model on a new database automatically**: when `DATA.tests` is set in the configuration and a saved experiment already exists, `nkululeko.nkululeko` skips training and evaluates the best stored model on the new database directly (see [test_new_database.md](test_new_database.md)) 7. **Predict labels for arbitrary audio** using the unified `nkululeko.predict` module — supports single files, folders, CSV lists, and live microphone input, using either built-in autopredict targets (age, gender, emotion, SNR, …) or the best model from a trained experiment (see [predict.md](predict.md)) 8. Investigate and mitigate potential biases in training data Nkululeko serves speech processing researchers, machine learning practitioners, and developers working on audio-based applications, allowing them to focus on experimentation rather than implementation details. ## Core Systems The project is organized around a modular architecture with several core systems: 1. **Experiment Framework**: The central system that orchestrates the entire experiment lifecycle. 2. **Data Processing**: Handles loading, filtering, and splitting datasets. 3. **Feature Extraction**: Extracts acoustic features from audio using various methods. 4. **Model Training**: Trains and evaluates machine learning models. 5. **Reporting**: Generates visualization and performance reports. 6. **Command-Line Interface**: Provides multiple entry points for different functionalities. ## Target Audience Nkululeko is designed for: - Speech processing researchers - Machine learning practitioners - Audio application developers - Students learning about speech processing - Anyone interested in speaker characteristics detection without extensive programming knowledge By providing a high-level interface through configuration files, Nkululeko makes sophisticated audio analysis accessible to users with varying levels of technical expertise.