Multi-user labeling
The managed version of refinery offers multi-user support, which is generally recommended for the highest label quality: a majority vote among multiple domain experts usually yields better results than a single individual labeling everything. Multiple users also bring additional requirements, such as quantifying general disagreement and resolving conflicts quickly.
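To make "quantifying general disagreement" concrete, here is a minimal sketch that computes the share of multi-user records on which users disagree. It is an illustration only, not refinery's implementation: the record layout (a dictionary mapping user name to label) and the names `has_conflict` and `conflict_rate` are assumptions made for this example.

```python
def has_conflict(labels_by_user):
    """Return True if at least two users assigned different labels."""
    return len(set(labels_by_user.values())) > 1


def conflict_rate(records):
    """Share of records labeled by two or more users on which they disagree."""
    # Records labeled by a single user cannot be in conflict, so skip them.
    multi_user = [r for r in records if len(r) >= 2]
    if not multi_user:
        return 0.0
    return sum(has_conflict(r) for r in multi_user) / len(multi_user)


# Hypothetical example data: one dict per record, user name -> label.
records = [
    {"alice": "positive", "bob": "positive"},  # agreement
    {"alice": "positive", "bob": "negative"},  # conflict
    {"alice": "neutral"},                      # single user, ignored
]
print(conflict_rate(records))  # 0.5
```

A high rate is usually a signal to revisit the labeling guidelines with the domain experts before resolving individual records.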
Solving conflicts by selecting gold labels
When you and a colleague have opposing opinions on labeling a record, which label should be considered by refinery for accuracy calculation, monitoring, or training purposes? At the moment, none of those labels will be considered for these tasks because the system cannot decide which is the correct one. To address this issue, refinery introduces the concept of gold labels (sometimes called gold star labels).
The gold label is a special label for refinery because it will always be prioritized over regular labels. In the single-user application, there is no need for this distinction, but it is necessary for resolving conflicts. Currently, there is no automated way of calculating gold labels (e.g. by majority vote), as we found that it is often necessary to discuss labeling conflicts with your domain experts first to clear up any confusion about the task.
There are two ways of assigning a gold label:
- If at least two different people assigned conflicting labels, an empty star icon will appear next to the labeling task where the conflict occurs. To select a gold label, first select the right view by clicking on the corresponding avatar (e.g. your own view if your colleague made a mistake), then click on the star icon. The star icon will then be filled.
- As soon as there is a gold label, you can access the gold label view to directly edit the gold labels. This is especially useful for conflicts in extraction tasks, as oftentimes a combination of labels from different users is necessary to correctly label the record.
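The priority rule described in this section can be sketched as follows. This is a simplified illustration under stated assumptions, not refinery's actual code: the function name `effective_label` and the data layout (a dictionary of user labels plus an optional gold label) are hypothetical.

```python
from typing import Dict, Optional


def effective_label(
    user_labels: Dict[str, str], gold_label: Optional[str] = None
) -> Optional[str]:
    """Pick the label to use for accuracy calculation, monitoring, or training.

    Assumed rules, taken from the description above:
    - a gold label always takes priority over regular labels;
    - without a gold label, conflicting labels are not considered at all;
    - if all users agree, their shared label is used.
    """
    if gold_label is not None:
        return gold_label
    distinct = set(user_labels.values())
    if len(distinct) == 1:
        return next(iter(distinct))
    # Conflict (or no labels at all): nothing can be decided.
    return None


# Conflict without a gold label: the record is ignored.
print(effective_label({"alice": "spam", "bob": "ham"}))  # None
# The same conflict, resolved by a gold label.
print(effective_label({"alice": "spam", "bob": "ham"}, gold_label="spam"))  # spam
```

The sketch deliberately contains no majority vote, mirroring the point above that gold labels are chosen manually after discussion with the domain experts.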
Sharing labeling sessions with engineers
Currently, if you want a colleague to have a look at your data, there is no way of sharing an exact record. You could inform your colleague that they need to look for the 30th record in a certain data slice, but that is not really intuitive. That is why we recommend sharing labeling sessions. They work differently depending on the role that the user
As explained in the separate labeling session section, you can share your sessions by sharing the URL with a colleague. The colleague will then start at the exact position that you shared, with the same records in the same order. Note that this person must also have access to your project, so they must be in the same organization in refinery. Also, this just works
Labeling as an Expert
Oftentimes the engineer is not the person doing most of the labeling work. Instead, the engineer can create static data slices that the experts can then label themselves. They will have a much more restricted view of refinery as they cannot access anything beyond the labeling suite. For more information on the different roles, please look at the managing roles page of this documentation.
Labels by experts are treated as regular labels, just like the ones of an engineer.
An engineer can temporarily view a data slice as an expert to confirm that this is what they intended. In order to do that, navigate to the data browser and click on the information icon next to a static data slice. There you will see a link that lets you take the temporary view of an expert (see Fig. 4). This is also the link that you can share directly with your expert colleagues.
Labeling as an Annotator
In cases where you want to assign labeling work to people who might lack domain expertise, you can use the crowd-labeling heuristic of refinery. The annotators will have the same labeling suite view as the experts with the exception that they cannot freely select data slices. Instead, they can only work on the heuristic that was manually assigned to them.