Don’t Start from Scratch When Building Machine Learning Models
How Leveraging Training Data with Pre-Trained Models Can Accelerate Your AI Projects
It’s a safe bet that most of the AI you’ve interacted with was built using supervised learning. Supervised learning, which in practice has usually meant training machine learning (ML) models from scratch on labeled data, has been the key driver of artificial intelligence (AI) development so far, fueled by increased access to large datasets and growth in computing power. But with many AI projects never reaching fruition for lack of resources, one might hope for a more efficient way to create models. Fortunately, there are alternatives to training from scratch that cut down on time, money, and human effort without sacrificing quality.
Using pre-trained models for transfer learning, fine-tuned with your own training data, is a machine learning technique that is gaining traction as technologists seek new ways to optimize ML models. Transfer learning doesn’t require starting from scratch and can lower the initial investment in launching AI. It makes ML more widely available, enabling more companies to launch AI projects and accelerating AI adoption overall.
What is Transfer Learning?
Transfer learning is an ML method in which a model trained for one task is used as the starting point for solving a different but related task. The pre-trained model is unlikely to be 100% accurate on the new task, so you will often need to prune it and fine-tune it on data designed for your use case. For example, you may have a model trained to identify house cats. Transfer learning would involve reusing that model and fine-tuning it to identify, say, bobcats as well.
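The idea of starting from a model trained on a related task can be illustrated with a deliberately tiny sketch. It is not an image model: it assumes a one-parameter regression, a made-up source task (y = 2.0x) and a made-up target task (y = 2.5x), and hypothetical helper names (fit, mse). It only shows that, under the same small training budget, starting from the pre-trained weight lands closer to the target than starting from zero.

```python
def fit(xs, ys, w_init, steps, lr=0.05):
    """Fit y ~ w * x by gradient descent on mean squared error, starting from w_init."""
    w = w_init
    n = len(xs)
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w


def mse(w, xs, ys):
    """Mean squared error of the one-parameter model on (xs, ys)."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Source task (assumed): plenty of data following y = 2.0 * x.
source_x = [i / 10 for i in range(1, 51)]
source_y = [2.0 * x for x in source_x]
pretrained_w = fit(source_x, source_y, w_init=0.0, steps=200)

# Related target task (assumed): only three examples following y = 2.5 * x.
target_x = [1.0, 2.0, 3.0]
target_y = [2.5 * x for x in target_x]

budget = 5  # the same small number of training steps for both runs
w_transfer = fit(target_x, target_y, w_init=pretrained_w, steps=budget)
w_scratch = fit(target_x, target_y, w_init=0.0, steps=budget)

print("fine-tuned from pre-trained:", mse(w_transfer, target_x, target_y))
print("trained from scratch:       ", mse(w_scratch, target_x, target_y))
```

In a real project the "weight" would be millions of parameters in a neural network and the work would be done in an ML framework, but the principle is the same: the closer the source task is to the target task, the less training the new task needs.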
How to Use Pre-trained Models
Transfer learning using pre-trained models will follow a process like this:
1. Select Model
Model selection is a critical first step in transfer learning. You’ll want to select a model whose original task closely resembles the use case you’re trying to solve. Numerous models are available, either free and open source or for purchase from a third-party vendor. NVIDIA, for instance, offers a Transfer Learning Toolkit that includes a wide range of pre-trained models for facial recognition, object detection, and many other common ML use cases.
Model quality will vary depending on the source, so do your due diligence in ensuring you’re selecting a model that achieves your desired quality standards.
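One way to do that due diligence is to score each candidate on a small labeled sample from your own use case before committing to it. The sketch below assumes each candidate can be called like a function that returns a label; the candidate names, the single numeric feature, and the 0.75 threshold are all made up for illustration.

```python
def accuracy(model, samples):
    """Fraction of (features, label) pairs the model labels correctly."""
    correct = sum(1 for features, label in samples if model(features) == label)
    return correct / len(samples)


def select_model(candidates, samples, min_accuracy):
    """Return the best-scoring candidate name (or None if none meets the bar) and all scores."""
    scores = {name: accuracy(model, samples) for name, model in candidates.items()}
    best_name = max(scores, key=scores.get)
    if scores[best_name] < min_accuracy:
        return None, scores
    return best_name, scores


# Hypothetical stand-ins for pre-trained models: each maps one made-up
# numeric feature to a label. Real candidates would be loaded from a vendor
# or an open-source model hub.
candidates = {
    "vendor_model": lambda feature: "bobcat" if feature > 3.0 else "house cat",
    "open_source_model": lambda feature: "bobcat" if feature > 1.0 else "house cat",
}

# A small labeled sample from the target use case (made-up values).
samples = [(0.2, "house cat"), (0.5, "house cat"), (1.5, "bobcat"), (2.5, "bobcat")]

chosen, scores = select_model(candidates, samples, min_accuracy=0.75)
print(chosen, scores)
```

Returning None when no candidate clears the bar keeps the quality standard explicit instead of silently picking the least bad option.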
2. Prune Model
Choose which attributes (perhaps all of them) of the source model you’d like to leverage for your new task. If you’re only using parts of the model, you might reuse just the model architecture or, in the case of a neural network, only certain layers. This choice depends on the nature of the problem you’re trying to solve, as well as the type of model you are working with. You may also continue to prune your model after completing step 3, if necessary.
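For a neural network, "only certain layers" often means keeping the early layers as they are and attaching a new output layer for the new task. The sketch below models that choice with plain dictionaries rather than a real framework; the layer names, the "trainable" flag, and the build_transfer_model helper are assumptions made for illustration.

```python
from copy import deepcopy


def build_transfer_model(source_layers, keep, new_head):
    """Reuse the first `keep` layers of a source model, frozen, and attach a new trainable head."""
    if not 0 <= keep <= len(source_layers):
        raise ValueError("keep must be between 0 and the number of source layers")
    reused = deepcopy(source_layers[:keep])  # copy so the source model is left untouched
    for layer in reused:
        layer["trainable"] = False  # frozen: these weights are not updated during fine-tuning
    head = dict(new_head, trainable=True)  # only the new head is trained on the new data
    return reused + [head]


# Hypothetical source model trained to identify house cats.
source_layers = [
    {"name": "conv1", "trainable": True},
    {"name": "conv2", "trainable": True},
    {"name": "house_cat_classifier", "trainable": True},
]

model = build_transfer_model(source_layers, keep=2, new_head={"name": "bobcat_classifier"})
for layer in model:
    print(layer["name"], "trainable" if layer["trainable"] else "frozen")
```

How many layers to keep is the judgment call the step describes: the more the new task resembles the original one, the more layers can usually be reused unchanged.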
3. Train Model
To maximize performance, you’ll want to continue to fine-tune your model and verify its accuracy; this requires additional training data for your current use case. You may already have your own datasets that you want to use for training. If your data needs annotation, you may want to seek out a third-party data provider, like Appen, that can give you instant access to a pool of annotators and a data annotation platform for efficient labeling.
If you need to source additional data, a data provider like Appen can provide labeled datasets for you as well. You can continue to train your model on new data until it reaches your required performance levels. Setting up a strong training data pipeline will make this step faster and more scalable in the long term, especially considering that models require regular retraining after deployment.
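The "train until the model reaches your required performance levels" loop can be sketched as follows. This is a minimal illustration under stated assumptions: the model is a one-feature perceptron standing in for a pre-trained model’s output layer, the data values are made up, and the names (fine_tune_round, train_until) and the 0.95 target are hypothetical.

```python
def predict(head, x):
    """Binary prediction from a one-feature linear model."""
    return 1 if head["w"] * x + head["b"] > 0 else 0


def fine_tune_round(head, labeled_data, lr):
    """One pass of perceptron updates over the labeled data (modifies head in place)."""
    for x, label in labeled_data:
        error = label - predict(head, x)
        head["w"] += lr * error * x
        head["b"] += lr * error


def accuracy(head, labeled_data):
    """Fraction of labeled examples the model predicts correctly."""
    return sum(predict(head, x) == label for x, label in labeled_data) / len(labeled_data)


def train_until(head, train_data, val_data, target, max_rounds, lr=1.0):
    """Fine-tune until validation accuracy reaches `target` or the round budget runs out."""
    history = []
    for _ in range(max_rounds):
        fine_tune_round(head, train_data, lr)
        history.append(accuracy(head, val_data))
        if history[-1] >= target:
            return True, history
    return False, history


# Assumed starting point: weights inherited from a pre-trained model that do not
# yet fit the new task. All data values below are made up.
head = {"w": 0.0, "b": -3.0}
train_data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
val_data = [(-1.5, 0), (-0.5, 0), (0.5, 1), (1.5, 1)]

reached, history = train_until(head, train_data, val_data, target=0.95, max_rounds=20)
print(reached, history)
```

Measuring progress on a separate validation set and capping the number of rounds are the two design choices worth keeping in a real pipeline: the first guards against judging the model on data it has already seen, and the second makes it obvious when more or better training data is needed rather than more training.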
Why Use Transfer Learning and Pre-trained Models?
The resources required to build and train ML models from scratch are tremendous. First, you need a team of highly specialized data scientists and ML experts, plus data annotators with domain expertise. You need a ton of data, which takes time to collect and costs money. You need additional time to label your data, program your algorithm, train it on your labeled data, test it, deploy it, and continue to monitor it after deployment. Altogether, building ML from the ground up is an incredibly resource-intensive endeavor.
When implemented correctly, transfer learning saves time and still achieves the desired performance. Leveraging a pre-trained model means you don’t need to label an entire dataset for training purposes (although you’ll probably still need to label some data, as described in the Train Model step above). It also means you may not need a data scientist or ML expert on your team because you’re not actually building a new model, which is the area of AI development that requires the most specialized skills. In an era where a skills gap in AI and ML professionals persists, this is a critical differentiator for transfer learning.
Transfer learning is an excellent tool for when the task you’re trying to solve may not have a ton of available data, but a related task does. You can then use the knowledge gained from solving the related task to solve the new one.
Applying what was learned on one task to a related one is among the most common ways humans pick up new skills, so it only makes sense to leverage that process in AI as well. Imagine that companies no longer need to hire for highly specialized positions in ML, that they can launch high-quality AI products quickly, and that they no longer need to invest an incredible amount of time, money, and effort into each AI initiative. Transfer learning opens up opportunities for more players to enter the AI scene, ultimately prompting greater experimentation and innovation in the space.