Keep track of your project

Analyzing label quality and distribution

On our monitoring page, you can analyze the quality and distribution of your labels:

  • label distribution: a grouped bar chart showing how often each label occurs, grouped by whether the label was set manually or via weak supervision.
  • confusion matrix: the standard analysis for prediction quality. Instead of evaluating the predictions of a classification model, it compares the weakly supervised labels with the manually set labels.
  • inter-annotator agreement (only on the managed version): see where annotators agree and disagree, and how this affects your label quality.

All of these graphs are computed per labeling task, so you can switch between tasks to see the results for each one.

The label distribution shows how often each label occurs, grouped by the label source.
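
To make the grouping concrete, here is a minimal sketch of how such a distribution could be computed outside the application. The record layout, the label names, and the source names "manual" and "weak_supervision" are assumptions for illustration, not the application's actual export format.

```python
from collections import Counter

# Hypothetical (label, source) pairs; values are illustrative only.
records = [
    ("positive", "manual"),
    ("positive", "weak_supervision"),
    ("negative", "manual"),
    ("positive", "weak_supervision"),
    ("negative", "weak_supervision"),
]


def label_distribution(pairs):
    """Count how often each label occurs, grouped by label source."""
    counts = Counter(pairs)
    distribution = {}
    for (label, source), n in counts.items():
        distribution.setdefault(label, {})[source] = n
    return distribution


distribution = label_distribution(records)
for label, by_source in sorted(distribution.items()):
    print(label, by_source)
```

Each inner dictionary corresponds to one group of bars in the chart: one bar per label source.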


Our confusion matrix compares the weakly supervised labels with the manually set labels.
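
As a rough illustration of this comparison, the sketch below builds a confusion matrix in which the manually set labels act as the reference and the weakly supervised labels take the role of predictions. The label names and the two lists are made up, and the function is a hypothetical helper, not part of the application.

```python
from collections import Counter


def confusion_matrix(manual, weak, labels):
    """Rows are manually set labels, columns are weakly supervised labels."""
    if len(manual) != len(weak):
        raise ValueError("both label lists must cover the same records")
    counts = Counter(zip(manual, weak))
    return [[counts[(m, w)] for w in labels] for m in labels]


# Hypothetical labels for five records that carry both label sources.
manual = ["pos", "pos", "neg", "neg", "pos"]
weak = ["pos", "neg", "neg", "neg", "pos"]

matrix = confusion_matrix(manual, weak, labels=["pos", "neg"])
for row in matrix:
    print(row)
```

Values on the diagonal count records where both sources agree. Off-diagonal values point to labels where weak supervision deviates from the manual labels.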


In the inter-annotator agreement matrix, you can see how well your annotators agree with each other, and whether certain users interpret your data differently.
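
This page does not state which agreement metric the matrix uses. As a simple stand-in, the sketch below computes the share of commonly labeled records on which two annotators chose the same label. The user names, record ids, and labels are hypothetical.

```python
from itertools import combinations

# Hypothetical annotations: annotations[user][record_id] = label
annotations = {
    "alice": {1: "pos", 2: "neg", 3: "pos", 4: "neg"},
    "bob": {1: "pos", 2: "neg", 3: "neg", 4: "neg"},
    "carol": {1: "neg", 2: "neg", 3: "pos"},
}


def pairwise_agreement(annotations):
    """Fraction of shared records on which each pair of users agrees."""
    result = {}
    for a, b in combinations(sorted(annotations), 2):
        shared = annotations[a].keys() & annotations[b].keys()
        if not shared:
            result[(a, b)] = None  # no overlap, agreement is undefined
            continue
        same = sum(annotations[a][r] == annotations[b][r] for r in shared)
        result[(a, b)] = same / len(shared)
    return result


agreement = pairwise_agreement(annotations)
for pair, score in agreement.items():
    print(pair, score)
```

Plain percent agreement does not correct for agreement by chance. A chance-corrected measure such as Cohen's kappa is a common alternative when label frequencies are skewed.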


Analyzing metrics on static slices

You can also narrow down the set of records analyzed on the overview page by selecting a static data slice in the top-right dropdown. All graphs are then filtered to that slice, which gives you deeper insight into potential weak spots.
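
Conceptually, filtering by a static slice means recomputing the same statistics on a fixed subset of records. The sketch below assumes a slice can be represented as a fixed set of record ids; the record fields and ids are hypothetical and not the application's data model.

```python
from collections import Counter

# Hypothetical records with an id, a label, and a label source.
records = [
    {"id": 1, "label": "pos", "source": "manual"},
    {"id": 2, "label": "neg", "source": "manual"},
    {"id": 3, "label": "pos", "source": "weak_supervision"},
    {"id": 4, "label": "neg", "source": "weak_supervision"},
    {"id": 5, "label": "pos", "source": "weak_supervision"},
]

# Assumption: a static slice is a fixed set of record ids.
static_slice = {1, 3, 4}


def filter_by_slice(records, slice_ids):
    """Keep only the records that belong to the given slice."""
    return [r for r in records if r["id"] in slice_ids]


sliced = filter_by_slice(records, static_slice)
sliced_counts = Counter((r["label"], r["source"]) for r in sliced)
print(sliced_counts)
```

Comparing the counts on the slice with the counts on the full record set shows whether a label is over- or under-represented in that part of your data.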


To learn more about data slices, read the next page about data management.
