Custom extractors let you train a machine learning model to extract pieces of data from a series of texts. The data can be whatever you define: email addresses, names, products, etc. The following video and guide will show you how to teach a model to extract the kind of data you are looking for.

Five steps to build a Custom Extractor

1. Start here to build a custom model, and then click "Extractor".

2. Import your text data by uploading files directly, connecting with an outside app, or trying out a sample data set.

3. Specify the data that will help train the model by selecting the columns that contain the text samples. If you select multiple columns, their contents will be concatenated (joined together) into a single text per row.

4. Define the tags (or pieces of information) that you will use for the extractor. At least one tag is needed; more can be added later. See our tag reference here.

5. Tag the words in each text by selecting a tag on the right and then clicking on the text it represents. This will help train the model. A minimum number of tagged texts is needed before the model can be trained.

👉 After you have tagged some texts, the model will learn from your annotations and begin to make suggestions automatically.
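The column concatenation described in step 3 can be pictured with a short Python sketch. This is only an illustration: the column names (`subject`, `body`), the sample rows, and the single-space separator are assumptions, and the separator the tool actually uses may differ.

```python
import csv
import io

# Hypothetical sample data; the column names are illustrative only.
SAMPLE_CSV = """subject,body
Order update,Your order shipped to Jane Doe.
Contact,Reach us at help@example.com.
"""

def concatenate_columns(csv_text, columns, separator=" "):
    """Join the selected columns of each row into one text sample."""
    reader = csv.DictReader(io.StringIO(csv_text))
    samples = []
    for row in reader:
        parts = []
        for col in columns:
            # Treat missing or empty cells as absent so the separator is not doubled.
            value = (row.get(col) or "").strip()
            if value:
                parts.append(value)
        samples.append(separator.join(parts))
    return samples

samples = concatenate_columns(SAMPLE_CSV, ["subject", "body"])
for sample in samples:
    print(sample)
```

Each row yields one text sample, which is what you would then see and tag in step 5.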

Using a Trained Extractor

Your extraction model is now trained with the training samples and tags you provided. It can be used to extract data from new text, or you can train it further.

Processing Text under the "Run" Tab

You can test your extractor by pasting in new text and clicking "Extract Text". The extracted tags are listed on the right under "Results", each with its associated value.

You can also run a batch process by uploading a file of new samples directly.
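If your new samples live in code or another system, you can write them to a file before uploading. The sketch below assumes the upload accepts a CSV file with one text per row; the `text` header and the `new_samples.csv` file name are hypothetical choices, not requirements stated by this guide.

```python
import csv
import os
import tempfile

def write_batch_file(texts, path, header="text"):
    """Write one text per row to a CSV file and return the number of rows written."""
    count = 0
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow([header])
        for text in texts:
            # Collapse line breaks and repeated spaces so each sample stays on one row.
            cleaned = " ".join(text.split())
            if cleaned:
                writer.writerow([cleaned])
                count += 1
    return count

new_texts = [
    "Please send the invoice to\nbilling@example.com.",
    "   ",  # blank samples are skipped
    "Maria Lopez ordered two desk lamps.",
]
batch_path = os.path.join(tempfile.gettempdir(), "new_samples.csv")
rows_written = write_batch_file(new_texts, batch_path)
print(f"Wrote {rows_written} samples to {batch_path}")
```

Skipping blank rows and flattening line breaks keeps each sample as a single, non-empty row in the file.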

🔍 Building Further Accuracy 

Go to the "Build" tab to see options to further train the model by annotating more texts.

In "Data", you can view all your text data, filter it by tag, and select texts to revise earlier annotations or perform bulk operations.
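Conceptually, filtering by tag and bulk operations work on a collection of annotated texts. The following Python sketch models that idea only; the record layout (`text`, `annotations`, `tag`, `value`) and both helper functions are hypothetical and do not describe how the tool stores its data.

```python
# Hypothetical in-memory model of annotated texts.
annotated_texts = [
    {
        "text": "Email jane@example.com about the Model X order.",
        "annotations": [
            {"tag": "email", "value": "jane@example.com"},
            {"tag": "product", "value": "Model X"},
        ],
    },
    {
        "text": "John Smith called today.",
        "annotations": [{"tag": "name", "value": "John Smith"}],
    },
    {"text": "Nothing tagged here yet.", "annotations": []},
]

def filter_by_tag(texts, tag):
    """Return the texts that carry at least one annotation with the given tag."""
    return [
        item for item in texts
        if any(ann["tag"] == tag for ann in item["annotations"])
    ]

def rename_tag(texts, old_tag, new_tag):
    """Bulk operation: rename a tag in place and return how many annotations changed."""
    changed = 0
    for item in texts:
        for ann in item["annotations"]:
            if ann["tag"] == old_tag:
                ann["tag"] = new_tag
                changed += 1
    return changed

with_email = filter_by_tag(annotated_texts, "email")
print(len(with_email), "text(s) tagged with 'email'")

renamed = rename_tag(annotated_texts, "name", "person")
print(renamed, "annotation(s) renamed")
```

Filtering first and then applying a change to the selected texts mirrors the filter, select, and bulk-edit flow described above.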
