Each custom model has two versions: sandbox and production.
- The sandbox version is the one you should use to make changes and train your model (e.g. add or remove tags, tag more training samples, or change the algorithm).
- The production version is what you should use in production to analyze and tag incoming data.
What is Deploying to Production?
Deploying to Production is the process of copying the sandbox version of your model and saving it as the production version of your model. You can do this in the Settings of your model.
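To make the copy semantics concrete, here is a minimal sketch in Python. It is not the product's API: the `ModelVersion` and `CustomModel` classes and the `deploy_to_production` method are hypothetical names invented for illustration. It only shows the idea that production is a snapshot of the sandbox taken at deploy time.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class ModelVersion:
    """Hypothetical snapshot of a model's configuration."""
    algorithm: str = "naive_bayes"
    tags: list = field(default_factory=list)


@dataclass
class CustomModel:
    """Hypothetical container holding the two versions of one custom model."""
    sandbox: ModelVersion = field(default_factory=ModelVersion)
    production: ModelVersion = field(default_factory=ModelVersion)

    def deploy_to_production(self) -> None:
        # Copy the sandbox instead of sharing it, so that later sandbox
        # edits cannot leak into the production version.
        self.production = deepcopy(self.sandbox)


model = CustomModel()
model.sandbox.tags.append("billing")
model.deploy_to_production()

# Further sandbox changes leave production untouched until the next deploy.
model.sandbox.tags.append("shipping")
print(model.production.tags)
print(model.sandbox.tags)
```

In this sketch the production version still has only the `billing` tag after the second change, because it was copied before `shipping` was added to the sandbox.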
When to Deploy to Production?
Training a machine learning model involves a lot of experimentation: trying out different approaches and seeing which changes improve your model and which actually hurt it. For example:
- Algorithm: does your model work better using Naive Bayes or Support Vector Machines?
- N-gram range: what is the optimal n-gram range for your model? Does 'unigrams, bigrams and trigrams' work better than just unigrams?
- Number of features: does a higher number of features improve accuracy?
- Categories: does removing a specific category improve the accuracy and F1 score of your model?
- Training data: does tagging a new set of examples improve your model?
Before you start to experiment with your model, you should deploy your model to production. This way, you make sure that any experiments you run don't affect the version you use in production to analyze new incoming data. You will always have a 'stable' model that you know is working as expected.
Then, you can use the sandbox version of your model to experiment and try different approaches. If an approach improves your sandbox version, you can deploy it to production so that the model you use in production benefits from the improvement.
Sometimes, changes that you make in the sandbox are not necessarily an improvement. We strongly suggest deploying a model to production only when you are 100% sure that the changes you have made in the sandbox are an improvement.
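One way to make this decision less subjective is to compare the metrics of both versions before deploying. The sketch below is only an illustration under stated assumptions: `should_deploy` is a hypothetical helper, the metric values are made up, and it assumes both versions were evaluated on the same test data.

```python
def should_deploy(production: dict, sandbox: dict, min_gain: float = 0.0) -> bool:
    """Return True only if the sandbox beats production on every metric.

    Hypothetical helper: `production` and `sandbox` map metric names
    (e.g. "accuracy", "f1") to scores measured on the same test data.
    """
    return all(sandbox[name] - production[name] > min_gain for name in production)


# Illustrative numbers only, not real results.
production_metrics = {"accuracy": 0.86, "f1": 0.81}

# Accuracy went up but F1 went down: keep experimenting in the sandbox.
mixed_result = should_deploy(production_metrics, {"accuracy": 0.88, "f1": 0.80})

# Both metrics improved: the sandbox is a candidate for deployment.
clear_gain = should_deploy(production_metrics, {"accuracy": 0.88, "f1": 0.84})

print(mixed_result, clear_gain)
```

Requiring a gain on every metric is a deliberately strict choice that matches the advice above; raising `min_gain` makes the check stricter still by ignoring very small differences.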