ML Techniques: Active Learning vs Weak Supervision
A Comparison of Two Popular Machine Learning Techniques
Machine learning (ML) has grown rapidly as a field, but a familiar roadblock remains for many businesses: data. Training ML algorithms traditionally requires enormous amounts of manually labeled data. Data in the volume needed is often unavailable or costly to acquire, not to mention the time and effort required to label it by hand. Data that is readily available often falls short of the desired quality standards. Active learning and weak supervision are two ML techniques you can use to overcome this data challenge.
Labeling that data also requires human labelers. In many cases, those labelers need to be subject matter experts (SMEs) to some degree, who can use their domain knowledge to make accurate annotations. But SMEs are both limited in availability and expensive to employ.
With all of these challenges in mind, teams launching artificial intelligence (AI) solutions often turn away from fully supervised learning (which requires complete, hand-labeled datasets for training ML models) and toward active learning and weak supervision. These two techniques are generally faster and less labor-intensive while still capable of training models successfully. Understanding how they work and the benefits each offers will help you decide whether weak supervision, active learning, or a combination of both is the right training solution for your model.
Active Learning vs Weak Supervision: How They Fit into Supervised Learning
It’s important to recognize that there are different types of learning in ML, and they all fall under one of two categories: supervised and unsupervised. With supervised learning, the machine receives data points labeled by humans and uses those to make predictions. Unsupervised learning, on the other hand, uses unlabeled data; the algorithm must extract structure and patterns from the data without human guidance.
Under the supervised learning umbrella, there is a spectrum of learning types. On this spectrum, we find both active learning and weak supervision.
Active learning is a form of semi-supervised learning. Unlike fully supervised learning, the ML algorithm is only given an initial subset of human-labeled data out of a larger, unlabeled dataset. The algorithm processes that data and makes predictions, each with a confidence score. Any prediction whose confidence falls below a set threshold signals that more labeled data is needed. These low-confidence data points are sent to a person, who labels them and provides them back to the algorithm. The cycle repeats until the algorithm is trained and operating at the desired prediction accuracy. This iterative human-in-the-loop method is built on the idea that not all samples are equally valuable for learning, so the algorithm chooses the data it learns from.
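The cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production setup: the `oracle` function is a hypothetical stand-in for the human labeler, the "model" is a toy one-dimensional centroid classifier, and the confidence threshold, batch size, and seed points are arbitrary values chosen for the example.

```python
import math
import random

def oracle(x):
    """Hypothetical stand-in for the human labeler (the true rule is x > 0.5)."""
    return 1 if x > 0.5 else 0

def fit(labeled):
    """Toy model: store the mean (centroid) of each class."""
    centroids = {}
    for cls in (0, 1):
        xs = [x for x, y in labeled if y == cls]
        centroids[cls] = sum(xs) / len(xs)
    return centroids

def predict(model, x, sharpness=20.0):
    """Return (label, confidence); confidence is lowest near the decision boundary."""
    boundary = (model[0] + model[1]) / 2
    p1 = 1 / (1 + math.exp(-sharpness * (x - boundary)))
    return (1, p1) if p1 >= 0.5 else (0, 1 - p1)

def active_learning(pool, seed_points, threshold=0.9, batch_size=5, max_rounds=20):
    labeled = [(x, oracle(x)) for x in seed_points]  # initial human-labeled subset
    pool = list(pool)                                # still-unlabeled data
    for _ in range(max_rounds):
        model = fit(labeled)
        # Rank the pool from least to most confident, keep only low-confidence items.
        ranked = sorted(pool, key=lambda x: predict(model, x)[1])
        uncertain = [x for x in ranked if predict(model, x)[1] < threshold]
        if not uncertain:
            break                                    # confident on everything left
        batch = uncertain[:batch_size]               # send these to the labeler
        labeled += [(x, oracle(x)) for x in batch]
        pool = [x for x in pool if x not in batch]
    return fit(labeled), labeled, pool

rng = random.Random(0)
pool = [rng.random() for _ in range(200)]
model, labeled, remaining = active_learning(pool, seed_points=[0.1, 0.2, 0.8, 0.9])

test_points = [rng.random() for _ in range(1000)]
accuracy = sum(predict(model, x)[0] == oracle(x) for x in test_points) / len(test_points)
print(f"labels requested: {len(labeled)} of {len(pool) + 4}, accuracy: {accuracy:.2f}")
```

Only the points the model is unsure about ever reach the labeler; the rest of the pool stays unlabeled, which is where the savings over full supervision come from.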
A key differentiator in active learning is the sampling method used, which significantly affects how the model performs. Data scientists can test different sampling methods to select the one that produces the most precise results. Overall, active learning relies less on data annotation by people compared to fully supervised learning because not all of the dataset requires annotation, only the data points requested by the machine.
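To make the role of the sampling method concrete, here is a small sketch of three commonly used uncertainty-based strategies: least confidence, margin, and entropy. The article does not name specific methods, so treat these as illustrative choices; the `predictions` dictionary holds made-up class probabilities for three hypothetical data points.

```python
import math

def least_confidence(probs):
    """Higher when the top predicted class has a low probability."""
    return 1.0 - max(probs)

def margin(probs):
    """Higher when the two most likely classes are close together."""
    top1, top2 = sorted(probs, reverse=True)[:2]
    return 1.0 - (top1 - top2)

def entropy(probs):
    """Higher when probability is spread evenly across classes."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select(predictions, score, k=1):
    """Return the k item ids with the highest uncertainty score."""
    ranked = sorted(predictions, key=lambda item: score(predictions[item]), reverse=True)
    return ranked[:k]

# Made-up class probabilities for three unlabeled items.
predictions = {
    "a": [0.90, 0.05, 0.05],
    "b": [0.40, 0.35, 0.25],
    "c": [0.50, 0.50, 0.00],
}

for strategy in (least_confidence, margin, entropy):
    print(strategy.__name__, select(predictions, strategy))
```

On this toy input, least confidence and entropy both pick item "b", while margin picks item "c", where the top two classes are tied. That disagreement is why it is worth testing more than one strategy on your own data.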
Weak supervision is a learning technique that blends knowledge from various data sources, many of which are lower-quality or weak. These data sources could include:
Low-quality labeled data from cheaper, non-expert labelers.
Higher-level supervision from SMEs, for example, using heuristics (rules). A heuristic might say something like, “If datapoint = x, then label it as y.” Applying a heuristic or set of heuristics can instantly label thousands, even millions, of data points.
Older pre-trained models, which may be biased or noisy.
The data in these sources is often inexact (the data has labels, but the labels aren’t as precise or fine-grained as desired) or inaccurate (a portion of the labels have errors). You can write simple programmatic rules, known as labeling functions, that use techniques like pattern matching to label your collected dataset so your model can learn from it. Then, refine the model’s weights by tuning your features and hyperparameters until it achieves the desired performance. If needed, incorporate a smaller, hand-labeled dataset to complete your model’s training.
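As a rough illustration of heuristics and labeling functions, the sketch below writes three simple rules for a hypothetical spam-vs-ham text task and combines their noisy votes by majority. The rules, label names, and example messages are all invented for this example, and majority voting is only one simple way to merge weak sources.

```python
from collections import Counter

ABSTAIN, HAM, SPAM = -1, 0, 1

# Each labeling function encodes one heuristic and may abstain.
def lf_contains_free(text):
    return SPAM if "free" in text.lower() else ABSTAIN

def lf_mentions_meeting(text):
    return HAM if "meeting" in text.lower() else ABSTAIN

def lf_many_exclamations(text):
    return SPAM if text.count("!") >= 2 else ABSTAIN

LABELING_FUNCTIONS = [lf_contains_free, lf_mentions_meeting, lf_many_exclamations]

def weak_label(text):
    """Majority vote over non-abstaining functions; ties and no votes abstain."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN  # the weak sources disagree evenly
    return counts[0][0]

messages = [
    "Win a FREE cruise now!!",
    "Agenda for Monday's meeting attached",
    "Free lunch at the team meeting",
    "See you tomorrow",
]
weak_labels = [weak_label(m) for m in messages]
print(weak_labels)
```

Messages that end up with an abstain label are the natural candidates for the smaller hand-labeled set mentioned above, since no weak source could decide them.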
Weak supervision is a way of creating training data programmatically, which reduces the time humans spend labeling data by hand. It’s best suited to classification tasks, to situations where you have an unlabeled dataset to manage, and to use cases that specifically allow weak label sources.
What are the Differences Between Active Learning and Weak Supervision?
Both types of learning can produce high-performing models, but they’re notably different in several key ways:
Source of Labels
The labels required for each type of learning are sourced very differently:
Active learning:
Humans (usually SMEs) label the dataset.
The labels are assumed to be accurate.
The labels come from one source.
Weak supervision:
Sources are flexible and come from any number of places.
Labels aren’t necessarily very accurate or complete.
Multiple data sources must be used.
The ratio of time, money, and people invested for each type of learning differs:
Active learning:
Using SMEs for labeling purposes is expensive, as they require payment and have limited availability.
Active learning requires humans to spend time labeling at least a portion of the data in a dataset.
Weak supervision:
Labeling functions can be applied to millions of data points in seconds, saving tremendous amounts of labeling time.
The time invested in weak supervision training varies depending on the data sources but is generally less than what’s needed for an active learning project.
While machine learning is always an iterative process, the amount of iteration varies with weak supervision vs active learning:
Active learning:
Uses a human-in-the-loop iterative process of many cycles.
The model is trained as data is labeled.
Weak supervision:
The datasets are fully labeled prior to the start of training the model.
There’s no human-in-the-loop baked into the training process.
The Benefits of Both Approaches
Despite their differences, both active learning and weak supervision are a departure from fully supervised learning. As such, both save time on the enormous task of labeling and save money by limiting the work of SMEs. With weak supervision, the volume of expensive, hand-labeled data you need will be much smaller than what would be required under fully supervised learning. Similarly, if you have an effective sampling technique with active learning, you can achieve quality model performance with fewer labeled data points than you would need with a traditional approach.
Most importantly, there’s no one-size-fits-all approach to machine learning. Selecting one type of learning or the other will depend on your available time, money, and people; your plan for collecting data and where that data will be sourced; and your specific use case. Depending on that use case, it doesn’t have to be active learning vs weak supervision, as the two are not always mutually exclusive and can sometimes be combined. Use these factors to guide you in selecting the learning technique that works best for your AI solution.