Labeling tasks

Before you can label anything in refinery, you must first create a labeling task. Labeling tasks include information about the labeling target (full record or certain attribute), the type of the task, and the available labels. Each labeling task has a unique name that is used to identify it on other pages, e.g. in the labeling view and data browser.

Fig. 1: Screenshot of the settings page, which displays the data schema, embeddings, labeling tasks, and project metadata. There are two registered labeling tasks: indicators and topic.

Types of labeling tasks

The labeling task type defines the granularity of your labeling. Currently, we support these two options:

  • Multiclass classification: Gives you the option to assign the target record or attribute exactly one of the available labels. Good for downstream tasks like classification.
  • Information extraction: Gives you the option to assign any token of the selected attribute to exactly one label. Requires the labeling task to be defined on an attribute, not on the full record. Good for downstream tasks like named entity recognition, sentence segmentation, or part-of-speech tagging.
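The relationship between task type and labeling target can be pictured with a small data model. This is only an illustrative sketch: refinery's internal data model is not shown on this page, so `TaskType`, `LabelingTask`, and all field names below are hypothetical. The only rule taken from the text above is that an information extraction task must target an attribute rather than the full record.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class TaskType(Enum):
    MULTICLASS_CLASSIFICATION = "multiclass classification"
    INFORMATION_EXTRACTION = "information extraction"


@dataclass
class LabelingTask:
    """Hypothetical model of a labeling task (not refinery's actual schema)."""

    name: str
    task_type: TaskType
    # None stands for the full record; otherwise the name of the target attribute.
    target_attribute: Optional[str] = None
    labels: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Extraction labels tokens of one attribute, so it needs an attribute target.
        if (
            self.task_type is TaskType.INFORMATION_EXTRACTION
            and self.target_attribute is None
        ):
            raise ValueError("information extraction requires a target attribute")


# A classification task on the full record is valid.
topic = LabelingTask("topic", TaskType.MULTICLASS_CLASSIFICATION)

# An extraction task is valid only when it targets an attribute.
entities = LabelingTask(
    "entities", TaskType.INFORMATION_EXTRACTION, target_attribute="headline"
)
```

In this sketch, creating an extraction task without `target_attribute` raises a `ValueError`, which mirrors the restriction described in the list above.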

Creating labeling tasks

To add a labeling task, simply click on the "Add labeling task" button on the settings page.

A modal opens and asks for the attribute you want to label. This selection determines which task types are available later. If you want to label for classification and don't need to differentiate between single attributes, choose the full record option. Once you have also provided a unique name for the task, you can create it.

Fig. 2: Screenshot of the settings page where a user is adding a new labeling task called 'sentiment' on the target attribute 'headline'. This will give them the option for a classification or extraction task.

Deleting labeling tasks

Deleting a labeling task has far-reaching consequences because the task is associated with labels, heuristics, and some filters in the data browser. If you delete a labeling task, these associations are removed too, which means that the labels and labeling efforts of that specific task are also removed from the project.
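To make the cascade concrete, here is a minimal sketch of what deleting a task implies for the data attached to it. The `Project` class and its fields are hypothetical and only model the associations named above (labels, manual labeling work, heuristics); this is not how refinery stores or deletes data internally.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Project:
    """Hypothetical in-memory project used to illustrate cascading deletion."""

    # task name -> label names defined for that task
    tasks: Dict[str, List[str]] = field(default_factory=dict)
    # record id -> {task name: manually assigned label}
    manual_labels: Dict[int, Dict[str, str]] = field(default_factory=dict)
    # heuristic name -> name of the task it belongs to
    heuristics: Dict[str, str] = field(default_factory=dict)

    def delete_task(self, task_name: str) -> None:
        # Remove the task together with its label definitions.
        del self.tasks[task_name]
        # Remove every manual label that was assigned for this task.
        for assignments in self.manual_labels.values():
            assignments.pop(task_name, None)
        # Remove heuristics that were written for this task.
        self.heuristics = {
            name: task for name, task in self.heuristics.items() if task != task_name
        }


project = Project(
    tasks={"topic": ["sports", "politics"], "indicators": ["clickbait"]},
    manual_labels={1: {"topic": "sports", "indicators": "clickbait"}},
    heuristics={"exclamation_mark": "indicators", "team_names": "topic"},
)
project.delete_task("indicators")
```

After the call, only the data belonging to the `topic` task remains; the labeling work done for `indicators` is gone and cannot be recovered from the project in this sketch.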

If you are sure that you want to delete the labeling task, just click on the red trashcan icon on the very right of the labeling task on the settings page. There will be an explanatory modal that requires your confirmation (see Fig. 3).

Fig. 3: Screenshot of the settings page where the user clicked on the red trashcan icon in order to delete a labeling task, which triggered the confirmation modal that you can see in this screenshot.

Labels

Creating labels

Labels can be created at any time during the project, both on the settings page and while labeling your records. This gives you a lot of flexibility if requirements change during the project.

To add labels on the settings page, press the "+" icon, which opens a modal where you must enter a label name that is unique for that task. Label names only need to be unique within a task, so you can use the same label names for different tasks (as can be seen in Fig. 2). Users often want to add more than a single label, which is why the modal stays open after a label has been added (you can confirm with the Enter key). That way you can add multiple labels quickly; when you're done, close the modal with the "close" button (see Fig. 4).

Fig. 4: Screenshot of the settings page where the user clicked on the '+'-icon at the right side of a labeling task, which triggered this modal, where unique label names can be entered to create new labels.
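The uniqueness rule for label names can be sketched in a few lines. The `LabelRegistry` class is a hypothetical helper written for this illustration, and the label name `"other"` is made up; the sketch only shows that a name may repeat across tasks but not within one.

```python
from typing import Dict, Set


class LabelRegistry:
    """Hypothetical helper that tracks label names per labeling task."""

    def __init__(self) -> None:
        # task name -> set of label names already defined for that task
        self._labels: Dict[str, Set[str]] = {}

    def add_label(self, task_name: str, label_name: str) -> None:
        labels = self._labels.setdefault(task_name, set())
        # Uniqueness is enforced per task, not across the whole project.
        if label_name in labels:
            raise ValueError(
                f"label {label_name!r} already exists in task {task_name!r}"
            )
        labels.add(label_name)

    def labels_of(self, task_name: str) -> Set[str]:
        # Return a copy so callers cannot modify the registry by accident.
        return set(self._labels.get(task_name, set()))


registry = LabelRegistry()
registry.add_label("topic", "other")
registry.add_label("indicators", "other")  # same name, different task: allowed
```

Calling `registry.add_label("topic", "other")` a second time would raise a `ValueError`, because that name is already taken within the `topic` task.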

You can also add labels while in the labeling suite (see Fig. 5) by typing the name of the new label into the search bar and pressing the "+" icon next to the search bar. This will add the new label to the available options, which you will have to manually select afterward in order to label your record.

Fig. 5: GIF of a user adding a new label to the labeling task 'topic' in the labeling suite.

Renaming labels

Sometimes, you might choose the wrong name for a label, or you just want to shorten it because it clutters your labeling view. To stay flexible throughout the project, you can rename labels from the settings page. In order to do that you have to click on the little color pipette icon on the left side of the label. A modal will appear that lets you customize your label with a color and a keyboard shortcut, but if you want to rename it, you have to click on the label itself at the very top of that modal (see Fig. 6).

Fig. 6: GIF of a user accessing the label renaming.

When you rename a label, refinery takes into account that the label might have been used in heuristics, lookup lists, or other parts of your project. This is why there is a mandatory check before you can actually rename the label. The check displays all the parts of refinery where this label name appears. Please keep in mind that we provide a "best guess" for these changes. Since custom-written Python code is very versatile, some changes might not be what you intended.

Fig. 7: Screenshot of the label renaming process after pressing the 'check rename' button. The displayed warnings remind you where the current label name is used.
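To illustrate why such a check can only be a best guess, here is a minimal sketch of a text-based search for a label name in heuristic source code. The function `find_label_usages`, the heuristic name, and the sample code are all hypothetical; refinery's actual check is not documented on this page and may work differently.

```python
import re
from typing import Dict, List


def find_label_usages(label: str, sources: Dict[str, str]) -> Dict[str, List[int]]:
    """Return, per source name, the 1-based line numbers that mention the label.

    The search is purely textual, so it also matches variables or comments
    that happen to contain the label name as a whole word.
    """
    pattern = re.compile(r"\b" + re.escape(label) + r"\b")
    usages: Dict[str, List[int]] = {}
    for name, code in sources.items():
        hits = [
            number
            for number, line in enumerate(code.splitlines(), start=1)
            if pattern.search(line)
        ]
        if hits:
            usages[name] = hits
    return usages


# Hypothetical heuristic source code stored as plain text.
heuristic_sources = {
    "exclamation_mark": (
        "def lf(record):\n"
        "    if '!' in record['headline'].text:\n"
        "        return 'clickbait'\n"
    ),
    "team_names": "def lf(record):\n    return 'sports'\n",
}

usages = find_label_usages("clickbait", heuristic_sources)
```

Here `usages` contains only the `exclamation_mark` heuristic, pointing at the line that returns the label. A textual match like this cannot tell whether an occurrence is really the label or just a string that looks like it, which is why the proposed changes should be reviewed before confirming the rename.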

Deleting labels

Deleting a label will also delete all the manually labeled data associated with it, as that data would no longer have a label or labeling task to refer to. The other labels and tasks will be unaffected.

In order to delete a label, just go to the settings page and click on the little trashcan icon right next to it (not the one for the labeling task!). As this will delete all the manual labels associated with this label, there will be a modal asking for confirmation.

Quality of life

Label colors and keyboard shortcuts

You can customize your labels for more efficient labeling. If you want to change the color of a label, click on the little pipette icon next to it on the settings page. The modal that opens also allows you to set a unique keyboard shortcut for that label. Just press the desired key, which will then be saved automatically. The chosen shortcuts are also displayed on the settings page and in the labeling suite.