“Appen embodies the artisan spirit in China. We pay attention to every little detail.”
Danny Duan, VP of China Business Development, joined Appen in January 2017 as we opened our office in Beijing. We recently talked to Danny about the growing AI market in China, the reasons for expanding into the Chinese market, and the value we’re delivering to local clients.
Q: What is the AI and machine learning market like in China?
China is very active in terms of AI and machine learning. I’d say AI is probably one of the handful of technology domains where China is not lagging behind Western countries.
There are traditional speech and computer vision technology players like iFlytek and Baidu. But there are also some companies not directly related to AI, such as Alibaba, Tencent, and DiDi, the Chinese version of Uber, that are pushing AI forward. A lot of companies are investing money to develop AI applications and technology. Some are doing this to gain a competitive edge over other technology leaders. Some want to integrate AI into their own products for specific domains. Some want to have a share of new market sectors created by the rise of AI. All of this, coupled with the vast supply of venture capital, leads to a very active AI market.
This AI market has fostered a large number of startups in software and mobile applications. So from the giants to the small niche providers in a specific vertical domain, there are a huge number of industry players. On the other hand, technology is hardly a competitive advantage in today’s AI market anymore. Most algorithms and models at the core of AI technology are now open-sourced. The true differentiation lies in data – large amounts of high-quality, domain-specific data. That creates a lot of opportunities for data service providers.
Q: What were the key reasons for Appen expanding into China?
To be competitive in the Chinese market, I think it’s necessary to build a presence and develop local capability. The Chinese technology market – Chinese vendors doing Chinese languages for Chinese end users – is huge and growing. We want to bring our strengths to the local market to help these companies develop AI products and technology.
As China becomes the second largest economy in the world, Chinese companies are increasingly positioning their products and services for the global market. This is where Appen can play a huge role. With coverage in 130+ countries and 180+ languages, we can provide data and linguistic support to all the major markets worldwide, a claim few companies can make.
Having an office based in China will help Appen to be more competitive globally, too. China is a Tier 1 or 2 opportunity in terms of market importance for Appen’s clientele based in the US and Europe. It means we can better serve our existing clients by providing stronger support for Chinese languages and dialects as well as lowering overhead costs by leveraging China’s cost-effective labor pool.
Q: Can you give an example of value we’ve delivered to a Chinese client?
We’ve been working with a startup in China. They’ve developed their own companion robot. A lot of AI startups lack the technology or funding or expertise, so they choose to integrate existing speech technology packages into their products. A smaller number of companies have the determination to develop their own speech technology package. It’s better to develop your own so you can create something unique – but it’s not easy.
So our client developed a Chinese synthetic voice engine – but they know the voice isn’t perfect. You’d imagine there would be plenty of Chinese service providers that could help them improve the quality. The reality is that other suppliers can point out the issues – for example, the tone is unnatural or the word boundary is incorrect – but they can’t identify the root causes or provide solutions to correct the issues. They lack the linguistic knowledge and experience to go this deep.
That’s where Appen has stepped in. Because of our deep linguistic expertise and experience helping various clients develop synthetic voice engines, we are able to analyze the issue and determine whether the client lacks training data, the training data has improper coverage, the data is of poor quality, or a combination of these factors applies. We then make improvements through targeted augmentation and/or cleansing of the training data set.
Q: What advice do you give to clients when you consult with them about their data needs?
My advice is that in a perfect world, the logic is straightforward: more data is always better, and higher-quality data is ideal. In the real world, though, the trick is to balance quality, quantity and cost to deliver the best value within the client’s finite budget.