The following is adapted from Real World AI.
When I worked in IBM’s Watson division, my team’s first demo quickly turned into a disaster due to scaling issues.
We were creating a visual-recognition application programming interface (API). Our first demo was a simple website that allowed users to drag-and-drop images, press a button, and be presented with a list of tags describing what was in the image.
The website was intended as a small-scale sales tool, to show what the API could do. We launched the demo on a Wednesday and sent the link to a few interested salespeople.
By Saturday morning, my phone started blowing up. Somehow, our little demo had made its way to the front page of Reddit. Suddenly, our webpage, built by an intern, was receiving thousands of visitors every minute, instead of the few we’d imagined.
We hadn’t set it up for scale, so the system attempted to automatically scale up to handle the extra traffic—and didn’t handle it particularly well. This somehow exposed some underlying bugs in the supporting system architecture the demo was built on top of, which ended up bringing down an entire IBM data center in the southern US for about 20 minutes.
Fortunately, IBM has lots and lots of backup procedures in place, so we didn’t end up doing any actual damage to anyone, but the experience still taught me a valuable lesson: when building AI, you absolutely need to be prepared to scale.
You hopefully won’t face unexpected scaling like this, but even so, the purpose of any pilot AI model isn’t just to create a successful pilot and stop there. The long-term goal is to implement the AI solution in production, which means you must keep scaling in mind.
As you build your AI pilot, you must constantly plan ahead. Every decision you make must work not only for the pilot, but for the future model you’ll implement at scale in production.
Try to get clarity on what your production pain points will be. These can affect the framework, libraries, or language you’ll use in production; how and where you’ll deploy your model; and the ways you’ll need to monitor it. Becoming aware of production pain points can help you avoid making choices that would prohibit scaling up.
One major e-commerce company embarked on a natural language processing (NLP) project to perform sentiment analysis on their chatbot logs. Their goal was to follow up personally with customers who had a negative experience. To get started, they prototyped everything in Python (a programming language commonly used by data scientists), which has a host of NLP libraries available.
But when it came time to deploy the model to production, they discovered they’d have to port the entire model to Scala (a programming language used by software engineers to build highly scalable software) so that it could run in the Java environment they had available.
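To make the trade-off concrete, here is a deliberately minimal sketch of the kind of quick Python prototype a team might start with. Everything in it is invented for illustration: real sentiment analysis would use a trained model and a proper library, and it is exactly this kind of convenient-in-Python code that may not port cleanly to a JVM production stack.

```python
import re

# Tiny hand-built lexicons -- placeholders for a real sentiment model.
POSITIVE = {"great", "helpful", "thanks", "resolved", "happy"}
NEGATIVE = {"broken", "useless", "angry", "refund", "terrible"}

def score_sentiment(message: str) -> int:
    """Return >0 for positive, <0 for negative, 0 for neutral."""
    words = re.findall(r"[a-z']+", message.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def flag_for_followup(chat_log: list[str]) -> bool:
    """Flag a conversation for a personal follow-up if it skews negative."""
    return sum(score_sentiment(m) for m in chat_log) < 0

log = ["My order arrived broken", "This is useless, I want a refund"]
print(flag_for_followup(log))  # True -- a human should reach out
```

A prototype like this takes minutes to write in Python; the hard question the pilot must answer is whether the equivalent logic (or the library replacing it) can run in the production environment you actually have.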
For every decision you make in the pilot, ask: Will I also be able to do this at production scale? Will I be able to integrate this into a production environment just as well? If your pilot takes shortcuts that rely on some unique feature of the smaller scope that would be cost-prohibitive at scale, that need data that doesn’t exist for the entire production scope, or that simply aren’t technically possible in production, then your project will fail even if your pilot succeeds.
Consider the Costs of Scaling
Hopefully, if you’ve planned ahead well, all you’ll have to do to scale up is spend more on resources. However, the cost of scaling may be more than you expect. People often assume that once a model is built, it’ll scale out in production with only marginal increases in costs. Usually, these people are disappointed.
AI solutions aren’t like a SaaS (software as a service) business, where resource consumption increases only marginally as new customers are brought on. AI models have to be supplied with new data continuously in order to work and adapt to inevitable changes in the real world. Depending on the problem you’re solving, your model may need to be retrained frequently or need to be fed data for every new customer. It’s even possible, although not common, that your AI costs will scale closer to linearly with usage.
Even if your pilot’s costs are low, the pilot should still help you predict what your production costs will be. That lets you budget effectively and maximize the efficiency of your platform spend on AWS, Google Cloud, or Azure—if you reserve GPUs for a full year, they’re much cheaper than allocating them on demand. The pilot process, if correctly structured, will let you capitalize on those savings.
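A back-of-the-envelope version of that reserved-vs-on-demand comparison can be sketched as follows. The rates and discount below are invented placeholders, not real cloud prices; the point is the structure of the estimate, which a well-run pilot gives you the usage numbers to fill in.

```python
HOURS_PER_YEAR = 24 * 365

def on_demand_cost(hourly_rate: float, hours_used: float) -> float:
    """Pay only for the GPU-hours you actually consume."""
    return hourly_rate * hours_used

def reserved_cost(hourly_rate: float, discount: float) -> float:
    """Reserved capacity bills for the whole year at a discounted rate."""
    return hourly_rate * (1 - discount) * HOURS_PER_YEAR

rate = 3.00       # hypothetical $/GPU-hour on demand
discount = 0.40   # hypothetical one-year commitment discount
usage = 7000      # GPU-hours per year your pilot predicts you'll need

print(f"on-demand: ${on_demand_cost(rate, usage):,.0f}")   # $21,000
print(f"reserved:  ${reserved_cost(rate, discount):,.0f}")  # $15,768
```

Reserving only wins if predicted usage is high enough, which is exactly the kind of estimate the pilot should produce before you commit.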
In the end, if done right, the costs should be worth it. Think about it: without the AI model, you’d have to scale all your business operations manually to do the same things you’re automating. So even with that extra investment, as long as you keep the scaling costs in mind as you build your pilot, the AI solution will be cheaper and will get you better results than doing the tasks manually.
Expect to Pivot
No matter how hard you try to keep your pilot consistent with your expectations for the production system, it will always be different. Sometimes you can predict what will be different, and sometimes you can’t. At a minimum, the scale in production will be greater, and the data different; both of these will materially change your outcomes in unanticipated ways.
So plan ahead as much as possible, avoiding decisions that would prohibit scaling while considering the costs of scaling, but also be prepared to pivot as you go.
For more advice on scaling AI solutions, you can find Real World AI on Amazon.
Alyssa Rochwerger is a customer-driven product leader dedicated to building products that solve hard problems for real people. She delights in bringing products to market that make a positive impact for customers. Her experience in scaling products from concept to large-scale ROI has been proven at both startups and large enterprises alike. She has held numerous product leadership roles for machine learning organizations. She served as VP of product for Figure Eight (acquired by Appen), VP of AI and data at Appen, and director of product at IBM Watson. She recently left the space to pursue her dream of using technology to improve healthcare. Currently, she serves as director of product at Blue Shield of California, where she is happily surrounded by lots of data, many hard problems, and nothing but opportunities to make a positive impact. She is thrilled to pursue the mission of providing access to high-quality, affordable healthcare that is worthy of our families and friends. Alyssa was born and raised in San Francisco, California, and holds a BA in American studies from Trinity College. When she is not geeking out on data and technology, she can be found hiking, cooking, and dining at “off the beaten path” restaurants with her family.
Wilson Pang joined Appen in November 2018 as CTO and is responsible for the company’s products and technology. Wilson has over nineteen years’ experience in software engineering and data science. Prior to joining Appen, Wilson was chief data officer of Ctrip in China, the second-largest online travel agency company in the world, where he led data engineers, analysts, data product managers, and scientists to improve user experience and increase operational efficiency that grew the business. Before that, he was senior director of engineering at eBay in California and provided leadership in various domains, including data service and solutions, search science, marketing technology, and billing systems. He worked as an architect at IBM prior to eBay, building technology solutions for various clients. Wilson obtained his master’s and bachelor’s degrees in electrical engineering from Zhejiang University in China.