The following is a list of public extractor models that you can use with your MonkeyLearn account.
They can be used either manually (by uploading files directly) or through our API, Zapier, the Google Sheets Extension, or RapidMiner. Please see integrations for more details.
Keyword Extractor (English) - Extract the most relevant words and expressions from text in English. Keywords can consist of one or more words and represent the important topics in your content. They can be used to index data, generate tag clouds, or power search.
Company Extractor (English) - Extract company and organization entities from text in English.
Person Extractor (English) - Extract person names from text in English.
Location Extractor (English) - Extract locations from text in English.
Email Thread Cleaning
Email Cleaner & Last Reply Extractor - Extract the last reply from an email thread. This cleaning model removes signatures, confidentiality clauses, and previous replies from email threads. It relies on statistical algorithms and natural language processing technology to analyze your emails and generate a cleaned version that captures the actual message. The targeted language is English.
Web to Text
Boilerplate Extractor (English) - Extract relevant text from HTML. This algorithm can be used to detect and remove the surplus "clutter" (boilerplate, templates) around the main textual content of a web page.
Html to Text Extractor - Converts a page of HTML into clean, easy-to-read plain ASCII text.
Opinion Unit Extractor (English) - Extracts opinion units from a given text. Useful to separate paragraphs or sentences into smaller pieces of data.
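To make the Html to Text Extractor entry above more concrete, here is a minimal local sketch of the general idea of HTML-to-text conversion, written with Python's standard library only. It is not MonkeyLearn's implementation: the class name `_TextCollector`, the function `html_to_text`, and the choice of block-level tags are assumptions made for illustration, and unlike the model it does not restrict the output to ASCII.

```python
from html.parser import HTMLParser

# Assumption: tags whose contents are never shown to the reader.
SKIP_TAGS = {"script", "style"}
# Assumption: tags that should start or end a line in the plain-text output.
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}


class _TextCollector(HTMLParser):
    """Collects visible text and inserts line breaks around block-level tags."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        # Keep text only when we are outside <script> and <style>.
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string, one block per line."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    raw = "".join(parser.parts)
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = (" ".join(line.split()) for line in raw.splitlines())
    return "\n".join(line for line in lines if line)
```

Calling `html_to_text` on a page returns its headings and paragraphs as separate lines, with script and style contents dropped. Detecting and removing navigation or template clutter, as the Boilerplate Extractor does, is a harder problem that this sketch does not attempt.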
Summary and Insight
Insight Extractor (English) - Extract the most important insights from text in English. Given a text, the output will be the most important keywords in the text. Each keyword will include the most representative sentences where it appears. This model is useful for tasks such as seeing what most users are saying about a product or place: simply send all the reviews concatenated as one text.
Summary Extractor (English) - Given a text, the output will be a shorter version of it that maintains its meaning. This summarization model employs statistical algorithms and natural language processing technology to analyze your content and generate a summary that preserves the gist of the original. No new sentences are generated; every sentence of the summary is present in the original text.
Sentence Extractor (English) - Extracts the sentences from a given text. Useful to separate articles or paragraphs into smaller pieces of data.
Date and Time Extractor - Extracts dates and times from text, and outputs them in ISO format. If a date contains a time, they will be extracted together. When any element of the date is missing (such as the year), it is filled in from the current date. A different base date can be specified instead.
Price Extractor - Extract prices in different currencies from text. The number and currency are returned separately for more convenient parsing.
Email Extractor - Extract email addresses from text.
Phone Number Extractor - Extracts North American phone numbers from text and returns them with unified formatting. All the numbers extracted will be valid under the North American Numbering Plan, which means they can be from the US, from Canada, or from certain Caribbean countries.
URL Extractor - Extract URLs from text.
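Two of the entries above describe rules that are concrete enough to sketch locally: the Date and Time Extractor fills missing date elements from a base date, and the Phone Number Extractor keeps only numbers valid under the North American Numbering Plan and returns them in one format. The following is a simplified illustration of those two rules using regular expressions, not the models themselves. The function names, the supported date pattern ("Month day" with an optional year), the simplified validity check (area code and exchange must start with 2-9), and the `+1 (NXX) NXX-XXXX` output format are all assumptions.

```python
import re
from datetime import date

MONTHS = {name: i for i, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}

# Assumption: only "Month day" or "Month day, year" is recognized here.
DATE_RE = re.compile(r"\b([A-Za-z]+) (\d{1,2})(?:, (\d{4}))?\b")

# Simplified NANP shape: optional country code 1, then NXX-NXX-XXXX
# where N is a digit from 2 to 9.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?1[\s.-]?)?\(?([2-9]\d{2})\)?[\s.-]?"
    r"([2-9]\d{2})[\s.-]?(\d{4})(?!\d)")


def extract_dates(text, base=None):
    """Return ISO dates found in text, taking a missing year from `base`."""
    base = base or date.today()
    found = []
    for month, day, year in DATE_RE.findall(text):
        month_number = MONTHS.get(month.lower())
        if month_number is None:
            continue  # the word before the number was not a month name
        try:
            resolved = date(int(year) if year else base.year,
                            month_number, int(day))
        except ValueError:
            continue  # impossible date such as February 31
        found.append(resolved.isoformat())
    return found


def extract_phones(text):
    """Return numbers matching the simplified NANP shape, uniformly formatted."""
    return ["+1 ({}) {}-{}".format(*match.groups())
            for match in PHONE_RE.finditer(text)]
```

For example, with `base=date(2019, 1, 1)`, the text "Ship on March 5 and bill on July 4, 2021" resolves the first date using the base year and keeps the explicit year of the second. In the phone sketch, a number such as 123-456-7890 is skipped because its area code starts with 1.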