Want to Build a Better Computer Vision System? Give it the Right Training Data.

Computer vision (CV) is a machine’s ability to capture and analyze visual data on its own, then make decisions about it. With computer vision, machines can detect and recognize images, patterns, or objects. The artificial intelligence that powers a computer vision system doesn’t just process images; it interprets the unstructured data it ingests. It is this ability to perceive visual data, not merely see it, that enables a self-driving car to avoid accidents, an agricultural robot to pick fruit and sort potatoes, a television camera to automatically track a fast-moving baseball or hockey puck, and much more.

Computer vision has been around for decades, but early efforts were crude, focusing on creating systems that could recognize shapes or edges. Limited hardware was an issue and researchers were still wrestling with the mathematical models on which to base their software algorithms. Computer vision has made huge strides in the last two decades, with new advances powered by faster hardware, better software, and, most importantly, machine learning and deep learning.

To be safe and effective, computer vision solutions must see and interpret images and videos with accuracy equal to or greater than that of humans. So how does a CV algorithm, model, neural network, or system get smarter? It must be trained on huge volumes of properly annotated images or videos with clearly defined metadata. The result is more accurate detection and recognition of images and objects.

Emerging Trends & Use Cases for Computer Vision and Pattern Recognition

An Indian agritech startup, Imago AI, is using computer vision systems to automate the time- and labor-intensive task of manually measuring crop output and quality. How manual? Today, it involves farmers using calipers and weighing scales to measure and grade plants and crops. The farmers then send the data to seed companies to develop higher-yielding, more disease-resistant crops. But developing better seeds typically takes between six and eight years, due partly to this manual data collection process.

According to the company, Imago’s image detection and recognition technology can collect that data 75% faster than humans do today. Their computer vision system can also collect more precise and accurate data around the plants’ appearance and overall health than even experienced farmers could. They’ve used AI to gather and understand massive amounts of data about the quality parameters of crops.

Agritech researchers at other companies are also working on robots that can pack and sort fruit inside a factory, or even pick fruit off trees and bushes in the field. This object recognition technology helps growers overcome many challenges — including deciding whether fruit is ripe for picking, handling produce without damage, and accurately differentiating between and sorting different varieties, colors, and shapes of produce.

Computer vision AI is also driving major innovations in healthcare, especially in medical imaging. Stanford University announced last year that its researchers had created a deep neural network that classified skin images as benign lesions or malignant skin cancers just as accurately as a panel of 21 board-certified dermatologists.

There’s also a trio of startups – Swift Medical, Tissue Analytics, and – using AI software to better identify skin wounds and track their healing progress. Using short videos shot on smartphones, their software can build precise 3D measurements of the rash or wound, MobiHealthNews reported in November. Not only is the technology much faster, and thus less expensive, than relying on nurses and doctors, it is also more standardized and potentially more reliable.

How to Train Your Computer Vision System

There are many more cutting-edge companies applying computer vision across other industries. They range from insurance apps that can scan photos and judge the amount of damage in a car accident, to AI systems that can view satellite photographs of properties and their surrounding foliage to judge their wildfire risk, to vending machines that can recognize and reflect back your facial expressions for a more interactive customer experience.

There’s a huge difference between seeing and perceiving. The need for more perceptive software is driving the current advances in computer vision and innovations in machine learning models and neural networks. However, without the right training data, the smartest neural network will lack the nuanced understanding it needs to accurately recognize objects in the real world – or even make a simple judgment like differentiating between a ripe blueberry and an unripe one.

Using immense amounts of high-quality training data, a computer vision system can be taught how to accurately perceive objects for the task at hand. To train a computer vision solution, you need a large volume of images that are appropriately chosen, labeled, and categorized. And because computers still aren’t great at understanding all the context within an image or real-world situation, image annotation must still be done by humans. Image annotation supported by a skilled crowd of data annotators and tools like Appen’s provides rapid markup and labeling of images and videos, all the way down to the individual pixel if needed.
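
To make "labeled and categorized" more concrete, here is a minimal Python sketch of how a single human-annotated image might be stored: one bounding box per object, plus a coarse pixel mask derived from those boxes. The class names, field names, file name, and blueberry labels are all illustrative assumptions, not Appen's actual annotation format. Real pixel-level annotation would trace each object's outline by hand rather than fill in a rectangle.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BoxAnnotation:
    """One labeled object: a class label plus a box in pixel coordinates."""
    label: str
    x_min: int
    y_min: int
    x_max: int  # exclusive
    y_max: int  # exclusive


@dataclass
class AnnotatedImage:
    """Hypothetical record for one annotated image."""
    filename: str
    width: int
    height: int
    boxes: List[BoxAnnotation] = field(default_factory=list)

    def label_counts(self) -> Dict[str, int]:
        """Count how many objects carry each label."""
        counts: Dict[str, int] = {}
        for box in self.boxes:
            counts[box.label] = counts.get(box.label, 0) + 1
        return counts

    def pixel_mask(self, label: str) -> List[List[int]]:
        """Return a height x width grid with 1 wherever a box with `label` covers the pixel."""
        mask = [[0] * self.width for _ in range(self.height)]
        for box in self.boxes:
            if box.label != label:
                continue
            # Clip the box to the image bounds before filling it in.
            for y in range(max(0, box.y_min), min(self.height, box.y_max)):
                for x in range(max(0, box.x_min), min(self.width, box.x_max)):
                    mask[y][x] = 1
        return mask


# A tiny 8 x 6 "image" with three annotated berries (made-up values).
image = AnnotatedImage(
    filename="berries_001.jpg",
    width=8,
    height=6,
    boxes=[
        BoxAnnotation("ripe_blueberry", 1, 1, 4, 4),
        BoxAnnotation("unripe_blueberry", 5, 2, 7, 5),
        BoxAnnotation("ripe_blueberry", 0, 4, 2, 6),
    ],
)

ripe_mask = image.pixel_mask("ripe_blueberry")
ripe_pixels = sum(sum(row) for row in ripe_mask)
```

A training pipeline would typically read thousands of records like this one, so keeping the label vocabulary and the coordinate convention consistent across annotators matters as much as the volume of images.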

How to Get the Right Image Annotation Data for Your Solution

While there are open computer vision datasets for many types of images, companies that are building niche solutions often must create and annotate their own training data. Wound Care Analytics is partnering with an outpatient wound treatment provider to build up its library of annotated images of wounds in order to better train its system. Imago AI is building its own image database of seeds and resulting crop yields, tagged with data around location, weather, and other criteria to help farmers choose the best crops for their particular soil and climate.
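
As a rough illustration of what "tagged with data around location, weather, and other criteria" can look like in practice, the Python sketch below attaches metadata to each image record and then filters on it. Every field name and value is invented for the example; it does not reflect Imago AI's or any other company's real schema.

```python
from typing import Dict, List

# Hypothetical image records; all names and values are made up.
records: List[Dict[str, str]] = [
    {"image": "plot_001.jpg", "crop": "wheat", "location": "region_a", "weather": "dry"},
    {"image": "plot_002.jpg", "crop": "wheat", "location": "region_b", "weather": "wet"},
    {"image": "plot_003.jpg", "crop": "maize", "location": "region_a", "weather": "dry"},
    {"image": "plot_004.jpg", "crop": "wheat", "location": "region_a", "weather": "dry"},
]


def select_records(items: List[Dict[str, str]], **criteria: str) -> List[Dict[str, str]]:
    """Keep only the records whose metadata matches every given criterion."""
    return [
        item for item in items
        if all(item.get(key) == value for key, value in criteria.items())
    ]


dry_wheat = select_records(records, crop="wheat", weather="dry")
for record in dry_wheat:
    print(record["image"])
```

Consistent metadata like this is what lets a team carve out a training subset for one crop, climate, or region without re-annotating the images themselves.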

Building a comprehensive dataset of annotated images takes considerable time and resources. Outsourcing the data collection and annotation efforts can help companies scale quickly. For most AI and computer vision providers, choosing a service provider that can rapidly create high-quality annotated images tuned to their specific needs will allow their engineering and development teams to focus on their core IP and on building better algorithms, rather than on labeling data.

With more than 20 years of experience, Appen has expertise in collecting and annotating the necessary data to build deep learning systems and neural networks for computer vision. This high-quality image annotation data is tailored to the specific training needs of your computer vision or medical imaging AI system.


Contact us to learn more about human-annotated data solutions for machine learning and artificial intelligence.
