Active learners

Build few-shot learning models

Active learning for classification


Also available on 📺 YouTube

Click here to see a video explanation of how you can build active learners.

Just as you can write labeling functions for your labeling automation, you can also easily integrate active learners. To do so, head to the heuristics overview page and select "Active learning" from the "New heuristic" button.


Similar to the labeling function editor, a coding interface will appear with pre-entered code. Once you have made sure that the right labeling task is selected, pick an embedding from the purple badges right above the editor. Clicking a badge copies its name to your clipboard, so that you can paste it as the value of embedding_name in the @params_fit decorator.


You can use Scikit-Learn inside the editor as you like, e.g. to extend your model with grid search. The self.model is any model that fits the Scikit-Learn estimator interface, i.e. you can also write code like this:

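from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# LearningClassifier is not imported here; it is assumed to be available in the editor.
class ActiveDecisionTree(LearningClassifier):

    def __init__(self):
        # Hyperparameter grid; each combination is scored with 3-fold cross-validation.
        params = {
            "criterion": ["gini", "entropy"],
            "max_depth": [5, 10, None],
        }
        self.model = GridSearchCV(DecisionTreeClassifier(), params, cv=3)

# ...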

As with any other heuristic, your active learner is automatically and continuously evaluated against the data you label manually.
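If you want to try the grid-search model outside the editor first, a minimal sketch could look like the following. It only relies on Scikit-Learn; the synthetic data from make_classification is a stand-in for your embeddings and manual labels and is an assumption made for illustration, not how the application loads data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for embeddings (X) and manual labels (y); purely illustrative.
X, y = make_classification(
    n_samples=300, n_features=16, n_informative=8, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Same grid as in the editor example above.
params = {
    "criterion": ["gini", "entropy"],
    "max_depth": [5, 10, None],
}
model = GridSearchCV(DecisionTreeClassifier(random_state=0), params, cv=3)

# GridSearchCV follows the estimator interface: fit, then predict_proba.
# With the default refit=True, the best parameter combination is refitted
# on the full training split and used for inference.
model.fit(X_train, y_train)
probabilities = model.predict_proba(X_test)

print("best parameters:", model.best_params_)
print("probability matrix shape:", probabilities.shape)
```

Because the search object exposes fit and predict_proba like a plain classifier, it can be assigned to self.model without further changes to the rest of the class.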

Minimum confidence for finetuning

One way to improve the precision of your heuristics is to label more data. The learning curve is typically steep in the beginning, so make sure to label at least some records before relying on the predictions. Another way is to increase the min_confidence threshold of the @params_inference decorator. Generally, precision beats recall in active learners for weak supervision, so it is perfectly fine to choose higher values for the minimum confidence.
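To get a feeling for the trade-off, the following sketch filters predictions by confidence outside the application. It assumes that the threshold is compared against the highest class probability of each record; the helper confident_predictions and the synthetic data are hypothetical and only serve the illustration, they are not part of the application's API.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def confident_predictions(model, X, min_confidence):
    """Hypothetical helper: keep predictions whose top probability is at least min_confidence."""
    probabilities = model.predict_proba(X)
    confidence = probabilities.max(axis=1)
    keep = np.flatnonzero(confidence >= min_confidence)
    labels = model.classes_[probabilities[keep].argmax(axis=1)]
    return keep, labels


# Synthetic stand-in for embeddings and manual labels; purely illustrative.
X, y = make_classification(
    n_samples=600, n_features=16, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

results = {}
for min_confidence in (0.5, 0.7, 0.9):
    keep, labels = confident_predictions(model, X_test, min_confidence)
    coverage = len(keep) / len(X_test)
    # Share of kept predictions that match the held-out label.
    precision = float((labels == y_test[keep]).mean()) if len(keep) else float("nan")
    results[min_confidence] = (len(keep), precision)
    print(f"min_confidence={min_confidence}: coverage={coverage:.2f}, precision={precision:.2f}")
```

A higher threshold can only shrink the set of records that receive a prediction. The predictions that remain are typically more precise, which is usually the preferable trade-off for weak supervision.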

Active learning for extraction

For extraction tasks, we use our own library, sequencelearn, which provides a Scikit-Learn-like API for programming span predictors. You can also use the library outside of our application.

Other than importing a different library, the logic works analogously to active learning for classification.