Multi-user labeling

The managed version of refinery offers multi-user support, which is generally recommended for the highest label quality, as a majority vote among multiple domain experts usually yields better results than a single individual labeling everything. Multiple users also bring additional requirements, such as quantifying overall disagreement and resolving conflicts quickly.

Fig. 1: Screenshot of the labeling suite where the displayed record was labeled by two different users, which is indicated by the two avatars next to the 'record IDE' button on the top left. Selecting an avatar lets you inspect the labels that they assigned to this record.

Solving conflicts by selecting gold labels

When you and a colleague have opposing opinions on labeling a record, which label should be considered by refinery for accuracy calculation, monitoring, or training purposes? At the moment, none of those labels will be considered for these tasks because the system cannot decide which is the correct one. To address this issue, refinery introduces the concept of gold labels (sometimes called gold star labels).

The gold label is a special label for refinery because it will always be prioritized over regular labels. In the single-user application, there is no need for this distinction, but it is necessary for resolving conflicts. Currently, there is no automated way of calculating gold labels (e.g. by majority vote), as we found that it is often necessary to discuss labeling conflicts with your domain experts first in order to clear up any confusion about the task.
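The priority rule described above can be sketched in a few lines of Python. This is only an illustration of the behavior as described in this section, not refinery's actual implementation: the function name `effective_label` and its parameters are hypothetical, and labels are simplified to plain strings for a single classification task.

```python
from typing import Dict, Optional


def effective_label(
    user_labels: Dict[str, str],
    gold_label: Optional[str] = None,
) -> Optional[str]:
    """Return the label that would be used downstream, or None if unresolved.

    Hypothetical helper: `user_labels` maps a user name to the label that
    user assigned to one record; `gold_label` is the manually chosen gold
    label, if any.
    """
    # A gold label always takes priority over regular labels.
    if gold_label is not None:
        return gold_label

    distinct = set(user_labels.values())

    # Exactly one distinct label (one user, or all users agree): use it.
    if len(distinct) == 1:
        return next(iter(distinct))

    # No labels at all, or conflicting labels without a gold label:
    # the record is not considered.
    return None
```

For example, `effective_label({"alice": "positive", "bob": "negative"})` returns `None` because the conflict is unresolved, while passing `gold_label="positive"` for the same record returns `"positive"`.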

Fig. 2: GIF of a user looking at the label of their colleague and deciding it must have been a mistake. In order to solve the conflict, they select their own label as the gold star label. When accessing the gold label view, a modal explains why you should treat it with care.

There are two ways of assigning a gold label:

  • If at least two different people assigned conflicting labels, an empty star icon will appear next to the labeling task where the conflict occurs. To select a gold label, first make sure you have selected the right view by clicking on the corresponding avatar (e.g. your own view if your colleague made a mistake), and then click on the star icon. After that, the star icon will be filled.
  • As soon as there is a gold label, you can access the gold label view to directly edit the gold labels. This is especially useful for conflicts in extraction tasks, as oftentimes a combination of labels from different users is necessary to correctly label the record.

Sharing labeling sessions with engineers

Currently, if you want a colleague to have a look at your data, there is no way of sharing an exact record. You could inform your colleague that they need to look for the 30th record in a certain data slice, but that is not really intuitive. That is why we recommend sharing labeling sessions. They work differently depending on the role that the user

As explained in the separate labeling session section, you can share your sessions by sharing the URL with a colleague. The colleague will then start at the exact position that you shared, with the same records in the same order. Note that this person also needs access to your project, so they must be in the same organization in refinery. Also, this just works

Labeling as an Expert

Oftentimes the engineer is not the person doing most of the labeling work. Instead, the engineer can create static data slices that the experts can then label themselves. They will have a much more restricted view of refinery as they cannot access anything beyond the labeling suite. For more information on the different roles, please look at the managing roles page of this documentation.

Fig. 3: Screenshot of the labeling suite view of a user with the 'Expert' role. They can freely select available static data slices to work on.

Labels by experts are treated as regular labels, just like the ones of an engineer.

An engineer can temporarily view a data slice as an expert to confirm that this is what they intended. In order to do that, navigate to the data browser and click on the information icon next to a static data slice. There you will see a link that lets you take the temporary view of an expert (see Fig. 4). This is also the link that you can share directly with your expert colleagues.

Fig. 4: GIF of an engineer accessing the temporary expert view of a static data slice.

Labeling as an Annotator

In cases where you want to assign labeling work to people who might lack domain expertise, you can use the crowd-labeling heuristic of refinery. The annotators will have the same labeling suite view as the experts with the exception that they cannot freely select data slices. Instead, they can only work on the heuristic that was manually assigned to them.

Fig. 5: Screenshot of the labeling suite view of an annotator. They can only select the crowd heuristic and do not even see which data slice is behind it.