Dialpad Creates Data That Powers ML Models for Human Conversation at Scale

Dialpad uses our platform to transcribe and categorize audio, and to verify the internal transcriptions and outputs of its models.

The Company

Dialpad improves conversations with data. They collect telephonic audio, transcribe those dialogs with in-house speech recognition models, and use natural language processing algorithms to comprehend every conversation. They use this universe of one-on-one conversations to identify what each rep, and the company at large, is doing well and where they fall short, all with the goal of making every call a success.


The Challenge

Every company is different. Each has its own vernacular, its own target market, and its own goals for the conversations it has with its customers. That means every one of Dialpad’s clients needs a robust set of unique training data for the solution to work as well as it possibly can, and each training set informs a model that makes sense of that specific company’s conversations.

Creating these individual sets is a core part of Dialpad’s offering.

After all, these datasets are the fuel that drives the accuracy of the solution. Dialpad had worked with a competitor of Figure Eight for six months but was having trouble reaching the accuracy threshold needed to make its models a success. Put simply, the labels they received weren’t good: accuracy topped out at about 70%.

Dialpad needed a change, so they turned to Appen.


The Solution

Humans-in-the-loop for intelligent conversational data

Since Dialpad’s solution is built on their recognition, transcription, and comprehension models, the accuracy of those models is incredibly important. After all, almost every engineer knows the old adage, “garbage in, garbage out.” Less colloquially, it boils down to this: bad labels mean bad training data and bad training data means bad models.

Dialpad runs Appen jobs to create the data that drives these models. Primarily, they power their transcription models through the platform: transcribing audio, categorizing audio into key moments and other important data classifications, and verifying the internal transcriptions and outputs of their models. They even use our geolocation tools to make sure British contributors label idiomatic speech from the U.K.


The Result

It took just a couple of weeks for the change to bear fruit: Dialpad was creating the transcription and NLP training data it needed to make its models a success.

“When we changed to Appen, within a few weeks, we saw that labeler accuracy go up to 88% and it stayed in the high 80s and 90s for us ever since, even across a large diversity of models. That’s been a really, really big win.”

– Etienne Manderscheid,
Head of Data Science; Co-Founder, TalkIQ (acquired by Dialpad)

TalkIQ has been acquired by Dialpad, but they continue to scale their operation with custom training data. They plan to label training datasets for hundreds of customer organizations, and to generate and validate paraphrases in addition to the key-moment labeling at which they already excel. Our platform will scale with them, maintaining the accuracy they’ve come to expect, every step of the way.