In the settings for your custom classifier, you can change the language.
This setting should match the language in your text data. Currently we support the following languages:
English
Dutch
French
German
Italian
Portuguese
Spanish
Russian
Chinese
Japanese
Korean
Arabic
Danish
Swedish
Romanian
Hungarian
Finnish
Norwegian
Other / Multi-language
Selecting the correct language is important: MonkeyLearn uses this information for stemming and tokenization, and to select the default stopwords.
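To see why the language setting matters, here is a minimal sketch of language-dependent preprocessing. It is only an illustration and not MonkeyLearn's actual implementation: the stopword lists, the suffix rules, and the names `tokenize`, `stem`, and `preprocess` are all hypothetical and deliberately tiny.

```python
import re

# Hypothetical, hand-picked stopword lists (real lists are much longer).
STOPWORDS = {
    "english": {"the", "is", "are", "and", "a", "of"},
    "spanish": {"el", "la", "los", "las", "es", "son", "y", "de"},
}

# Naive suffix rules standing in for a real stemmer (assumption, for illustration).
SUFFIXES = {
    "english": ("ing", "ed", "s"),
    "spanish": ("ando", "os", "as", "s"),
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def stem(token, language):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES.get(language, ()):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text, language):
    """Tokenize, drop the language's stopwords, then stem what remains."""
    stop = STOPWORDS.get(language, set())
    return [stem(t, language) for t in tokenize(text) if t not in stop]


# Matching language: stopwords are removed and words are reduced to stems.
print(preprocess("The cats are sleeping", "english"))  # ['cat', 'sleep']

# Mismatched language: English stopwords survive and "sleeping" is not stemmed.
print(preprocess("The cats are sleeping", "spanish"))  # ['the', 'cat', 'are', 'sleeping']
```

With the wrong language, uninformative words such as "the" and "are" stay in the features and related word forms are no longer grouped together, which tends to hurt classifier accuracy.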
If we don’t support the language of your data yet, try the Other / Multi-language option. You can get very good results without stemming, and you can override the default stopwords with your own if you need to.
If your text data contains more than one language, for example when training a language detection classifier, the Other / Multi-language option is the right choice.
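As a rough sketch of what preprocessing without stemming and with your own stopwords could look like, consider the snippet below. It is an assumption about the general idea, not MonkeyLearn's actual code; the function name `preprocess_multilang` and the stopword set are hypothetical.

```python
import re


def preprocess_multilang(text, custom_stopwords=frozenset()):
    """Tokenize without stemming and drop only the user-supplied stopwords."""
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in custom_stopwords]


# Hypothetical custom stopword list covering both languages in the data.
my_stopwords = {"the", "is", "het"}

print(preprocess_multilang("The house is big, het huis is groot", my_stopwords))
# ['house', 'big', 'huis', 'groot']
```

Because no language-specific stemming is applied, words from every language pass through unchanged, and the custom list is the only place where uninformative words get filtered out.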