Four Tips to Pick Your Goldilocks Problem for AI

Why Selecting the Right Business Problem for AI Is Essential

The following is adapted from Real World AI.

When your company is just starting to explore AI, picking the first problem to solve is just as important as coming up with the AI solution itself. Pick a problem that’s too big or difficult, and your AI project will fail. Or pick a problem that’s unimportant, and it won’t matter whether your AI project succeeds or not.

It’s the classic Goldilocks situation: you need a problem that is just right. If you can solve the first problem you attack and prove the impact AI can have, you’ll have a much easier time getting support and resources to tackle the next 10 problems.

You’ll likely identify a variety of potentially great first problems.

Four Tips To Discover Your Goldilocks Problem for AI


Here are the top four tips for selecting the right problem to tackle with AI from Real World AI.

#1: Start Small

The best Goldilocks problem for AI is small enough that you can solve it quickly.

Problems that involve classifying something into one of two buckets are great candidates. By way of contrast, problems that require resolving ambiguity are probably not great candidates. If two people might disagree on the right answer, you’ll have a much harder time showing that your model does the right thing most of the time.

Imagine the problem you choose is classifying each incoming ticket into one of 100 categories. It would take a well-trained person weeks to learn the categories and provide enough examples to get it right; even then, other people might still disagree with that person’s choices frequently. This isn’t a good Goldilocks problem.
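One way to check whether two people would agree on the right answer is to have both label the same sample of items and compute an agreement score. The sketch below uses Cohen's kappa, a standard chance-corrected agreement measure; it is an illustration rather than anything described in Real World AI, and the function name and the sample labels are made up for the example.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: share of items where both annotators match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: probability both pick the same label independently,
    # given how often each annotator uses each label.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels from two people reviewing the same eight tickets.
annotator_1 = ["reset", "other", "reset", "reset", "other", "other", "reset", "other"]
annotator_2 = ["reset", "other", "reset", "other", "other", "other", "reset", "reset"]

kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

A kappa near 1 suggests the categories are unambiguous; a value near 0 means the annotators agree about as often as chance would predict, which is a warning sign for a first project.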

As an example, the software company Autodesk was struggling with a lengthy help-ticket queue that was managed manually. An average support case took over a day to resolve. So, when Autodesk decided to bring AI into the mix, they narrowed in on one preeminent issue: improving the customer experience by reducing case resolution time.

Rather than building a model to help automate all the inquiries into their contact center, which spanned dozens of use cases and questions, they focused on solving a single, narrow problem that represented a huge percentage of incoming support tickets: password resets. All their model needed to do was determine whether a ticket was a password reset: yes or no. That was a perfect Goldilocks problem.
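To make the yes/no framing concrete, here is a minimal sketch of a two-bucket ticket classifier. It is not Autodesk's model, which the source does not describe; a small naive Bayes classifier written with the Python standard library stands in for whatever they built. The class name, the `tokenize` helper, and the example tickets are all invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesBinary:
    """Tiny multinomial naive Bayes for yes/no text classification."""

    def fit(self, texts, labels):
        # Labels are booleans; training data must contain both True and False.
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        return self

    def _log_score(self, tokens, label):
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        for token in tokens:
            if token not in self.vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            count = self.word_counts[label][token] + 1
            score += math.log(count / (total_words + len(self.vocab)))
        return score

    def predict(self, text):
        tokens = tokenize(text)
        return self._log_score(tokens, True) > self._log_score(tokens, False)


# Hypothetical historical tickets: True means "password reset".
tickets = [
    ("I forgot my password and need to reset it", True),
    ("Please reset my password", True),
    ("Cannot log in, password reset link expired", True),
    ("How do I change my password", True),
    ("My license key is not working", False),
    ("The installer crashes on startup", False),
    ("I was charged twice for my subscription", False),
    ("How do I export a drawing to PDF", False),
]
model = NaiveBayesBinary().fit([t for t, _ in tickets], [y for _, y in tickets])

print(model.predict("I need a password reset"))               # True
print(model.predict("The installer crashes when I open it"))  # False
```

Eight examples are far too few for a real system; the point is only that a single yes/no question keeps both the model and its evaluation simple.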

It’s tempting to want to throw AI at all your biggest problems right away, but by starting simple, you increase your chances of success.

#2: Go Where the Data Is

Another characteristic of a good Goldilocks problem for AI is a large bank of historical data obtained through past instances of solving that same problem.

Autodesk’s password-reset inquiries fit this bill: the company had a pool of past instances of password-reset inquiries and the corresponding answers from human agents correctly identifying the nature of the inquiry.

All past cases that have been classified into buckets become training data for your model. If you don’t have examples that have already been categorized, you might still have examples that you could spend time having humans go through and categorize now—time-consuming work, but of huge benefit to your project.

You need not just a large quantity of data but high-quality data, both to keep the model accurate and to avoid unintentionally introducing bias or unfairness.

For example, let’s say you want to use AI for speech recognition in a call center that primarily serves Spanish speakers. You could have a significant amount of call center data, even a significant amount involving Spanish speakers, but if the accents in that data differ from the accents of your actual callers, the data won’t be able to produce a good model.
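A quick sanity check along these lines is to compare how each group is represented in your training data with how often you expect to see it in production. The sketch below assumes each training example carries a group tag (here, an accent label) and that you can estimate the production mix; the function name, the threshold, and the numbers are hypothetical.

```python
from collections import Counter


def coverage_gaps(training_groups, production_share, min_ratio=0.5):
    """Return groups whose share of the training data is below
    min_ratio times their expected share of production traffic."""
    counts = Counter(training_groups)
    total = sum(counts.values())
    gaps = {}
    for group, expected in production_share.items():
        actual = counts[group] / total if total else 0.0
        if actual < min_ratio * expected:
            gaps[group] = (actual, expected)
    return gaps


# Made-up example: plenty of Spanish audio, but mostly the wrong accent.
training_groups = ["castilian"] * 90 + ["mexican"] * 10
production_share = {"castilian": 0.2, "mexican": 0.8}

gaps = coverage_gaps(training_groups, production_share)
for group, (actual, expected) in gaps.items():
    print(f"{group}: {actual:.0%} of training data vs {expected:.0%} of expected calls")
```

A flagged group does not prove the model will fail, but it tells you where to collect or label more data before committing to the problem.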

Even if you identify a problem with a narrow scope and reasonably simple classifications, if you don’t have enough data—or the right kind of data—it won’t be a good Goldilocks problem.

#3: Deliver Quick Wins

The faster you can deliver wins, the faster you can prove the importance of AI. If you have a problem that could be solved in part or whole by an off-the-shelf model, that could be a great Goldilocks problem for AI.

An off-the-shelf model is one that someone else has already developed and is selling as a service. This means that the model comes pre-trained; it is a good fit when the data it was trained on matches the specific problem you’re solving. A common off-the-shelf model currently available, for example, is one that takes incoming customer requests and quickly recognizes what language the request is in.

If you can deliver value quickly, and you don’t need to build a custom machine learning model to do so, then great! Choose that. Once you have a quick win in hand, the business will be far more willing to tolerate the nine to twelve months it can take to get a more complex, custom model built, tested, and in production.

Additionally, off-the-shelf training datasets offer a quick, cost-effective alternative to collecting and annotating data from scratch and can be used even if you’re building your own model. High-quality datasets can be used as-is or customized for specific project types. Not only is it advantageous from a price and speed perspective, but growing requirements for data privacy and security from both customers and authorities can make it complicated to use data you have on hand.

You can see how MediaInterface used Appen’s off-the-shelf datasets to expand into a new market.

#4: Make an Impact

Although your Goldilocks problem should be small enough to be solved quickly, it should still be big enough to have a clear business impact.

Often, Goldilocks problems are linked to obvious things like revenue, customer net promoter score (NPS), or time value. It’s easy to see the value of a solution that measurably increases revenue or decreases costs. If your solution frees people from a mundane or tedious task that doesn’t give them much satisfaction (for instance, sorting individual envelopes by reading zip codes over and over), it will be seen positively and reduce costs.

A good rule of thumb is to not only be clear on the business impact but also be able to measure and prove it. The Autodesk password-reset project, for instance, fit this goal perfectly. They were able to quantify, in time and customer satisfaction scores, the benefit of solving password-reset issues more quickly.
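Measuring impact can be as simple as comparing the same metric before and after the model goes live. The sketch below computes the relative reduction in average resolution time; the figures are invented for illustration and are not Autodesk's results.

```python
from statistics import mean

# Hypothetical resolution times in hours for comparable tickets.
hours_before = [30, 26, 28, 36]  # handled manually
hours_after = [6, 4, 5, 5]       # routed by the model

avg_before = mean(hours_before)
avg_after = mean(hours_after)
# Relative reduction: the share of the original time that was saved.
reduction = (avg_before - avg_after) / avg_before

print(f"Average before: {avg_before:.1f} h, after: {avg_after:.1f} h")
print(f"Reduction: {reduction:.0%}")
```

In practice you would want the two samples to cover similar ticket types and time periods so the comparison is fair.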

It’s also helpful for the first solution to be novel or innovative in some way, to grab even more attention. If non-AI teams get excited about what the AI team can do, the whole organization will start coming up with problems to solve and will support the AI team that works on them.

AI is a Marathon, Not a Sprint

Building AI into your business doesn’t have to mean leveraging machine learning to solve every problem all at once. In fact, it shouldn’t. Adopting AI is a marathon, not a sprint, so you want to pace yourself.

It’s more important to pick the single right problem to start with and build momentum with its solution. Choosing the Goldilocks problem that hits the sweet spot of scale and impact with a manageable machine learning component is the most important thing you can do to set yourself up for success.

For more advice on picking your Goldilocks problem for AI, you can find Real World AI on Amazon.

Alyssa Rochwerger is a customer-driven product leader dedicated to building products that solve hard problems for real people. She delights in bringing products to market that make a positive impact for customers. Her experience in scaling products from concept to large-scale ROI has been proven at both startups and large enterprises alike. She has held numerous product leadership roles for machine learning organizations. She served as VP of product for Figure Eight (acquired by Appen), VP of AI and data at Appen, and director of product at IBM Watson. She recently left the space to pursue her dream of using technology to improve healthcare. Currently, she serves as director of product at Blue Shield of California, where she is happily surrounded by lots of data, many hard problems, and nothing but opportunities to make a positive impact. She is thrilled to pursue the mission of providing access to high-quality, affordable healthcare that is worthy of our families and friends. Alyssa was born and raised in San Francisco, California, and holds a BA in American studies from Trinity College. When she is not geeking out on data and technology, she can be found hiking, cooking, and dining at “off the beaten path” restaurants with her family.

Wilson Pang joined Appen in November 2018 as CTO and is responsible for the company’s products and technology. Wilson has over nineteen years’ experience in software engineering and data science. Prior to joining Appen, Wilson was chief data officer of Ctrip in China, the second-largest online travel agency company in the world, where he led data engineers, analysts, data product managers, and scientists to improve user experience and increase operational efficiency that grew the business. Before that, he was senior director of engineering at eBay in California and provided leadership in various domains, including data service and solutions, search science, marketing technology, and billing systems. He worked as an architect at IBM prior to eBay, building technology solutions for various clients. Wilson obtained his master’s and bachelor’s degrees in electrical engineering from Zhejiang University in China.
