With the proliferation of mobile devices and increasing consumer expectations to stay connected regardless of their environment, automotive makers are under growing pressure to deliver on the promise of the connected car. However, according to J.D. Power, voice recognition in cars continues to be the leading complaint among new vehicle owners.¹
Diane O’Neill, Director at UXIP, says, “There is a need for improvement by OEMs to ensure factors such as accuracy, natural language capabilities and unique user profiling are implemented to encourage both consumer use and satisfaction of in-vehicle speech recognition systems in the future.”²
Global automotive manufacturers face the added complexity of localizing their in-car systems for multiple languages. This requires large data collections, which are very difficult for in-house engineering teams to manage. Speech needs to be collected in a range of driving environments so that the in-car system can recognize language across driving scenarios that account for weather conditions, types of roads, and so on. Because their engineers don’t have the language background that is critical to speech data collection, OEMs may outsource this effort to skilled firms with deep linguistic expertise.
¹ 2018 J.D. Power Multimedia Quality and Satisfaction Study
As an experienced solution provider for the automotive industry, Appen offers a full-service approach for localization, data collection, in-car testing and validation, and linguistic consulting. Our project managers, who have years of experience working in the automotive industry, work directly with the OEM’s engineering team to develop and implement a program that provides high-quality training data for all target languages. They define the program; localize commands, prompts, and display text; recruit native speakers according to the OEM’s target demographics; record utterances in various environments; manage the transcription of the data; and review it for accuracy. The audio and metadata are then packaged according to the OEM’s requirements. This data is used to ensure that the in-car system understands what the driver is saying. Appen also provides text-to-speech (TTS) evaluation services, in which reports based on native speaker feedback help improve the accuracy and naturalness of the synthesized voice that responds to the driver’s commands.
An important aspect of Appen’s solution is its testing facility in Detroit. This facility allows our team to work closely with our clients to test various in-car setups. Our professional recording booth also allows us to collect utterances in simulated driving environments, which is critical to developing a high-quality voice recognition system.
Another key component of Appen’s engagement with the OEM is linguistic consulting. Because the Appen team has deep linguistic experience, it not only ensures that translations are consistent but also advises engineers on nuances of language that other firms may not be equipped to address. And with its knowledge of automotive terminology and translations specific to automotive use cases, the team provides unique value to the client in support of its large-scale localization efforts.
Working with Appen allows the OEM’s engineers to focus on the core work needed to develop leading in-car systems. Our years of experience in the automotive industry have allowed us to engage quickly with the OEM’s team and develop robust data collection and validation programs that consistently meet its needs, leading to a long-standing relationship. With Appen’s comprehensive solution offering, from project development to speaker recruitment to data validation, the OEM has expanded into new markets with the confidence of knowing that its voice recognition system has been trained on high-quality data backed by extensive linguistic capabilities.
A leading global OEM has worked with Appen for over 10 years to develop the data needed to train its voice recognition systems for more than 20 languages.