Data augmentation diversity for increased accuracy

Google Brain scientist Ekin Dogus Cubuk gave an interesting presentation focused on data augmentation as an underutilized tool in deep learning. While data augmentation techniques such as mixup, cutout, and geometric operations are widely used today, Dogus hypothesized that the right combination of these techniques may improve accuracy for some CV projects. When his team put this theory to the test, they identified some fundamental best practices for data augmentation, a significant one being diversity in your data augmentation policy. Instead of applying a single strategy such as Cutout to every image in every mini-batch, having hundreds of strategies and choosing one of them randomly for each image gives you a substantial improvement in accuracy. Looking toward the future, Dogus says the team wants to apply their augmentation findings to domains beyond image classification, including video and object detection.
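The per-image random-strategy idea can be sketched as follows. This is a minimal illustration of the policy-diversity concept only: the toy augmentations below (on small nested-list "images") are stand-ins, not the actual operations or policy search used in the Google Brain work.

```python
import random

def flip_horizontal(img):
    # Mirror each row (a simple geometric augmentation).
    return [row[::-1] for row in img]

def cutout(img, size=1):
    # Zero out a random size x size patch (a toy Cutout).
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    top = random.randrange(h - size + 1)
    left = random.randrange(w - size + 1)
    for r in range(top, top + size):
        for c in range(left, left + size):
            out[r][c] = 0
    return out

def brighten(img, delta=10):
    # Clamp-shifted intensity (a toy photometric augmentation).
    return [[min(255, px + delta) for px in row] for row in img]

# In practice this pool would contain hundreds of strategies.
STRATEGIES = [flip_horizontal, cutout, brighten]

def augment(img):
    # Sample one strategy uniformly at random per image, rather than
    # applying the same fixed strategy to every image.
    return random.choice(STRATEGIES)(img)
```

The key point is the sampling in `augment`: each image in each mini-batch can receive a different transformation, which is the diversity the presentation credits for the accuracy gains.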
Data augmentation for medical CV projects

Another data augmentation-focused presentation was given by Amy Zhao, a graduate student at MIT in the final year of her PhD. Zhao's paper Data Augmentation Using Learned Transformations for One Shot Medical Image Segmentation addresses some of the challenges data scientists face when applying deep learning to medical data. This work explores segmentation strategies for situations where you have only one annotated example, a common scenario. Zhao recognizes the current need to manually annotate – or at least fine-tune – datasets to reach the level of accuracy required for medical applications. But she is optimistic that it will be possible to train machine learning models to mimic the complex variations seen in images like brain MRIs. She suggests doing this by using unlabeled datasets (which are widely available in the case of MRIs) to train a segmentation convolutional neural network (CNN), leveraging the power of deep learning.
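The core idea can be sketched as synthesizing new labeled pairs from the single annotated scan: a spatial transform is applied to both the image and its label map (so they stay aligned), while an appearance transform is applied to the image only. In the paper these transforms are learned by CNNs from unlabeled scans; here they are toy stand-ins (a pixel shift and an intensity offset) purely to show the structure.

```python
def spatial_shift(img, dx):
    # Shift each row's pixels by dx columns, zero-padding the gap
    # (toy stand-in for a learned spatial/deformation transform).
    w = len(img[0])
    out = []
    for row in img:
        if dx >= 0:
            out.append([0] * dx + row[:w - dx])
        else:
            out.append(row[-dx:] + [0] * (-dx))
    return out

def synthesize_pair(image, label, dx, intensity):
    # The spatial transform moves image AND label together, so the
    # synthesized label stays anatomically aligned with the image.
    # The appearance transform (intensity offset) touches the image only.
    new_img = [[px + intensity for px in row]
               for row in spatial_shift(image, dx)]
    new_lbl = spatial_shift(label, dx)
    return new_img, new_lbl
```

Repeating `synthesize_pair` with many sampled transforms turns one annotated example into a whole synthetic training set for the segmentation CNN.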
AdaGraph: Unifying Predictive and Continuous Domain Adaptation Through Graphs

Another thought-provoking presentation at CVPR 2019 was given by Massimiliano Mancini, a PhD student at Sapienza University of Rome, in collaboration with Fondazione Bruno Kessler and Istituto Italiano di Tecnologia. Mancini's work focuses on predictive domain adaptation, a useful technique for projects where you are given several domains: one is annotated, but the others are not. For each of them, you are given an attribute. To solve this task, Mancini suggests that you must relate the parameters of a domain to a specific attribute or metadata. There is an initial phase where you train parameters which are specific to a certain domain or attribute. While doing that, Mancini initializes what he calls a graph. There is a node for each of the domains, and he connects the nodes of the graph with edges, where the weight of an edge represents how much two domains are related. Since he didn't have data for the target attribute, he initializes a virtual node. Mancini suggests that you can propagate the parameters of nearby nodes under the assumption that similar domains in the graph require similar parameters. This allows him to obtain models for target domains whether they are seen or not. In terms of results, Mancini told CVPR Daily that "Our graph is able to estimate target model parameters with pretty good results in our experiments. Receiving the target data and using them to refine our model allows us to fill the remaining gap with the upper bound, a standard domain adaptation algorithm with target data available beforehand."
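The propagation step described above can be sketched as a similarity-weighted average over the graph: the virtual target node's parameters are estimated from its neighbours, weighted by how close their metadata is. The Gaussian kernel and the function names below are illustrative assumptions, not the exact formulation from the AdaGraph paper.

```python
import math

def edge_weight(meta_a, meta_b, sigma=1.0):
    # Edge weight from metadata similarity: a Gaussian kernel on the
    # squared distance between attribute vectors (one plausible choice).
    dist2 = sum((a - b) ** 2 for a, b in zip(meta_a, meta_b))
    return math.exp(-dist2 / (2 * sigma ** 2))

def propagate(target_meta, nodes):
    # nodes: list of (metadata, params) pairs for the seen domains.
    # The unseen target's parameters are the weighted average of its
    # neighbours' parameters -- "similar domains get similar parameters".
    weights = [edge_weight(target_meta, m) for m, _ in nodes]
    total = sum(weights)
    dim = len(nodes[0][1])
    return [sum(w * p[i] for w, (_, p) in zip(weights, nodes)) / total
            for i in range(dim)]
```

For a target whose metadata sits midway between two seen domains, this yields parameters midway between theirs, which is exactly the assumption the graph encodes.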
The wrap-up

There were, of course, too many impressive papers, presentations, and interviews at CVPR 2019 to detail here. Numerous enterprise sponsors once again featured some exciting research at the conference.
- A Facebook team led by Zhenpei Yang authored a paper titled Extreme Relative Pose Estimation for RGB-D Scans via Scene Completion. This paper sets out to solve a fundamental problem in computer vision, robotics, and computer graphics: estimating the relative rigid pose between two RGB-D scans of the same underlying environment.
- Google researchers were on hand to discuss the latest machine learning techniques applied to various areas of machine perception. The team also showed demos of several recent efforts, including the technology behind predicting pedestrian motion, the Open Images V5 dataset, and much more.
Appen for computer vision and data annotation

Many machine learning projects don't have the luxury of experimenting with various machine learning algorithms and strategies – they need to get their product or service to market before the competition! For the most accurate results, Appen provides a range of image annotation services. Appen can handle multi-phase annotation by building logic and dependency trees into multiple (unlimited) rounds of annotation, reviewing images in more detail, and building out specific metadata about each object. We can also review for offensive content in images, annotate images to improve search functionality within a client's image recognition software, and categorize images by quality to improve search content over time.
Appen for data augmentation and pattern recognition

Appen provides data augmentation services to make sure you have the best training data for your computer vision projects, such as image or video annotation. The Appen solution features several key process components to help ensure the highest level of data quality:
- Data clustering/distribution analysis and visualization
- Data abnormality detection
- Data bias removal strategy
- Automatic data augmentation strategy
- Data labeling instruction recommendation