Confusion matrix

By default, Nkululeko displays results as a confusion matrix. For regression tasks, the continuous values are first binned into categories.

Confusion matrix
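The binning step for regression can be sketched as follows. This is only an illustration of the idea, not Nkululeko's own code: the function names, bin edges, labels and values are all made up for the example.

```python
from bisect import bisect_right

def bin_values(values, edges, labels):
    """Map continuous values to bin labels.

    edges are the inner bin boundaries in ascending order, so
    len(labels) must equal len(edges) + 1. A value equal to an
    edge falls into the upper bin.
    """
    return [labels[bisect_right(edges, v)] for v in values]

def confusion_matrix(truth, pred, labels):
    """Count (truth, prediction) pairs; rows are truth, columns are predictions."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(truth, pred):
        matrix[index[t]][index[p]] += 1
    return matrix

# Hypothetical regression targets and predictions in [0, 1]
labels = ["low", "mid", "high"]
edges = [0.33, 0.66]
truth = [0.1, 0.4, 0.9, 0.7, 0.2]
pred = [0.2, 0.7, 0.8, 0.5, 0.3]

matrix = confusion_matrix(
    bin_values(truth, edges, labels),
    bin_values(pred, edges, labels),
    labels,
)
for label, row in zip(labels, matrix):
    print(label, row)
```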

Epoch progression

The point when overfitting starts can sometimes be seen by looking at the results per epoch:

Epoch progression
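One simple way to read such a curve is to look for the epoch where the development-set score peaks while the training score keeps rising. The sketch below assumes a score where higher is better; the helper name and all numbers are invented for illustration and do not come from Nkululeko.

```python
def best_epoch(dev_scores):
    """Return the 0-based index of the epoch with the highest dev score."""
    return max(range(len(dev_scores)), key=lambda i: dev_scores[i])

# Hypothetical per-epoch results (higher is better)
train_scores = [0.55, 0.68, 0.80, 0.90, 0.97]
dev_scores = [0.52, 0.61, 0.66, 0.64, 0.60]

peak = best_epoch(dev_scores)
# After the peak, the train score still improves while the dev score drops,
# which is the typical signature of overfitting.
print(f"dev score peaks at epoch {peak + 1}")
```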

Feature importance

Using the explore interface, Nkululeko analyses the importance of acoustic features:

Feature importance

Feature distribution

It can also show the distribution of specific features per category:

Feature distribution

If there are only two categories, the result of a Mann-Whitney U significance test is reported:

Feature distribution for two categories
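To make the test concrete, here is a minimal pure-Python sketch of the Mann-Whitney U statistic with a two-sided p-value from the normal approximation. It applies no tie or continuity correction, so it is a teaching aid under those assumptions rather than the routine Nkululeko uses. The feature values are invented.

```python
from math import erfc, sqrt

def mann_whitney_u(x, y):
    """Return (U, two-sided p-value) using the normal approximation.

    Assumption: no tie or continuity correction is applied, so the
    p-value is only approximate, especially for small samples.
    """
    n1, n2 = len(x), len(y)
    # U1 counts pairs where a value from x exceeds a value from y (ties count 0.5)
    u1 = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    mean_u = n1 * n2 / 2
    std_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / std_u
    p_value = erfc(abs(z) / sqrt(2))
    return u, p_value

# Hypothetical feature values for two categories
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [6.0, 7.0, 8.0, 9.0, 10.0]

u, p = mann_whitney_u(group_a, group_b)
print(f"U = {u}, p = {p:.4f}")
```

For real analyses, a library routine with proper tie handling and exact small-sample p-values is preferable to this sketch.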

t-SNE plots

A t-SNE plot can give you an estimate of whether your acoustic features are useful at all. Your feature space is projected into a 2- or 3-dimensional space, and the color of the blobs reflects your target class distribution. Besides t-SNE, you can choose UMAP or PCA.

t-SNE plot
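Of the three methods, PCA is the simplest to write down, so it serves here to illustrate what "projecting the feature space" means. The sketch assumes NumPy is installed and uses random numbers in place of real acoustic features. The function name is hypothetical, and t-SNE and UMAP work differently (they are non-linear).

```python
import numpy as np

def pca_project(features, n_components=2):
    """Project feature vectors onto their first principal components."""
    X = np.asarray(features, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Stand-in for an acoustic feature matrix: 20 samples, 5 features
rng = np.random.default_rng(0)
features = rng.normal(size=(20, 5))

points = pca_project(features, n_components=2)
print(points.shape)
```

Each row of `points` can then be drawn as one blob in a scatter plot, colored by its target class.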

Data distribution

Sometimes, you only want to take a look at your data. In this example, the biological sex of the speakers is approximately evenly distributed in an emotional speech database.

Data distribution

Bias checking

In some cases, you might wonder if there’s bias in your data. You can try to detect this with automatically estimated speech properties by visualizing the correlation of target labels and predicted labels.

Bias checking

Uncertainty

For classifiers only, Nkululeko estimates the uncertainty of each model decision as the entropy over the per-sample class probabilities or logits.

Uncertainty
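The entropy idea can be sketched as follows. Turning logits into probabilities with a softmax and scaling the entropy to the range [0, 1] are assumptions made for this example; Nkululeko's exact computation may differ, and the function names are illustrative.

```python
from math import exp, log

def softmax(logits):
    """Turn raw logits into probabilities (shifted by the max for numerical stability)."""
    peak = max(logits)
    exps = [exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def uncertainty(probs):
    """Shannon entropy of the class probabilities, scaled to [0, 1].

    0 means the model puts all mass on one class,
    1 means all classes are equally likely.
    """
    entropy = -sum(p * log(p) for p in probs if p > 0)
    return entropy / log(len(probs))

# Hypothetical per-sample outputs for a three-class problem
confident = softmax([8.0, 0.5, 0.1])
unsure = softmax([1.0, 1.1, 0.9])

print(f"confident sample: {uncertainty(confident):.3f}")
print(f"unsure sample:    {uncertainty(unsure):.3f}")
```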