Embedding integration

Integrating language models from 🤗 Hugging Face

Depending on your task, one of the first things you can do is pick one or more embeddings for your data. If you're not familiar with embeddings yet, take a look at our blog, where we explain how embeddings work and how you can benefit from using them.

To create an embedding, click "Generate embedding" on the project settings page. A modal will open, asking you for the following information:

  • target attribute: If you have multiple textual attributes, it makes sense to compute embeddings for them step-by-step. Here you can choose the attribute you want to encode.
  • granularity: You can calculate the embeddings either on the whole attribute (e.g. a sentence) or for each token. Token-level embeddings are helpful for extraction tasks, whereas attribute-level embeddings help with both classification tasks and neural search. We recommend that you always begin with attribute-level embeddings.
  • configuration: This defines the model to use. You can choose from the recommended options, or type in any configuration string from Hugging Face.
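To make the granularity option more concrete, here is a minimal sketch of the difference between token-level and attribute-level embeddings. It does not load a real Hugging Face model: `toy_token_vector` is a hypothetical, hash-based stand-in, and mean pooling is only one common way to get a single vector for a whole attribute, not necessarily what the application does internally.

```python
import hashlib

DIM = 8  # toy dimensionality; real language models use hundreds of dimensions


def toy_token_vector(token: str) -> list[float]:
    """Hypothetical stand-in for a language model: maps a token to a fixed vector."""
    digest = hashlib.sha256(token.lower().encode("utf-8")).digest()
    # Scale the first DIM bytes of the hash into the range [-1, 1].
    return [(byte / 127.5) - 1.0 for byte in digest[:DIM]]


def token_embeddings(text: str) -> list[list[float]]:
    """Token-level granularity: one vector per whitespace-separated token."""
    return [toy_token_vector(token) for token in text.split()]


def attribute_embedding(text: str) -> list[float]:
    """Attribute-level granularity: one vector for the whole text (mean pooling)."""
    vectors = token_embeddings(text)
    if not vectors:
        return [0.0] * DIM
    # Average each dimension across all token vectors.
    return [sum(column) / len(vectors) for column in zip(*vectors)]


sentence = "The quick brown fox jumps"
per_token = token_embeddings(sentence)
per_attribute = attribute_embedding(sentence)
print(len(per_token), "token vectors, each of size", len(per_token[0]))
print("one attribute vector of size", len(per_attribute))
```

The token-level result keeps one vector per token, which is what an extraction task needs in order to label individual spans. The attribute-level result collapses the text into a single vector, which is the shape classification and similarity search work with.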

On the managed version, embeddings are computed on a GPU-accelerated instance. This can still take some time, so it might be a good moment to grab a hot coffee ☕

Once the computation is finished, the embedding is usable for active learning (and in the case of attribute-level embeddings, for neural search).
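As a rough illustration of why attribute-level embeddings enable neural search, the sketch below ranks records by cosine similarity to a query vector. The record IDs, vectors, and the `most_similar` helper are made up for this example; this is the general idea behind embedding-based search, not a description of how the application implements it.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_vector, record_vectors, top_k=3):
    """Return the top_k (record_id, score) pairs, most similar first."""
    scored = [
        (record_id, cosine_similarity(query_vector, vector))
        for record_id, vector in record_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hand-written toy vectors standing in for attribute-level embeddings.
record_vectors = {
    "record-1": [0.9, 0.1, 0.0],
    "record-2": [0.0, 1.0, 0.2],
    "record-3": [0.8, 0.3, 0.1],
}
query = [1.0, 0.0, 0.0]

for record_id, score in most_similar(query, record_vectors, top_k=2):
    print(record_id, round(score, 3))
```

Because each record is represented by exactly one vector, a single similarity score per record is enough to rank them. Token-level embeddings produce many vectors per record, which is one reason they are not used for search here.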