
How Customer Service Voice Assistants Handle Different Languages and Accents

Michael Chen
7 Jan 2021 - 5 minute read

Let’s face it: understanding your customers isn’t always easy. They tell stories, go off track, and often struggle to find a good way to explain an issue. And those are just the customers who speak the same language as you.

Even for humans, it can be really difficult to understand people with strong or unfamiliar accents. Call center workers receive accent training to understand 35 different variations of English alone.

And what about communicating in different languages? For multinational organisations, speaking to customers in different languages means going the extra mile to hire multilingual staff, and making sure that calls actually go through to the agent who can speak the caller’s language. 

The same phrase spoken in different languages

Organisations in English-speaking countries have an inherent advantage. English has long been the dominant language on the internet, which means more training data, which in turn means more accurate predictions by software. All this means that machines can ‘understand’ English better than other languages.

But even in English, accents remain a challenge. Speech recognition has advanced by leaps and bounds, but it’s not perfect. The exact same conversation with a New Yorker, an Australian, and a Serbian speaker of US English will each leave a slightly different text transcript, and that variation is what trips up most voice assistants today. The result is the dreaded “sorry, can you repeat?” loop, even for native English speakers.

For organisations in non-English-speaking countries or those that serve multicultural customers, the challenge is two-fold. Firstly, speech recognition solutions are not as accurate in other languages, immediately making things more difficult. Secondly, less training data in a particular language acts as a barrier to better accuracy of conversational models in that language. This is why it is much harder to build great conversational experiences for customers in say, Italian, Latvian, or Singlish than it is in English.

But it’s not hopeless. Now might be a good time to mention that PolyAI stands for Polyglot AI (Polyglot: speaking or using several different languages). From day one, our aspiration has been to help enterprises create compelling new voice self-service experiences, regardless of accent or language. We are purpose-built for multilingual voice applications, which means we do a few things differently…

Speech recognition optimisation

Some speech recognition solutions are better at understanding particular languages and accents than others. For example, the best solution for Polish may be different to the best solution for Thai. 

Flexibility here is key as speech recognition solutions continue to improve, which is why we test different speech recognition providers to find the best one for each project. On top of this, we use machine learning to add an additional layer of optimisation to each stage of a conversation. This ensures the most accurate transcriptions regardless of call quality or accent. 

Think of this as the technology equivalent of accent training, applied phrase by phrase.
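To make the provider-selection idea concrete, here is a minimal sketch of how one might compare speech recognition providers on a test utterance using word error rate (WER). The provider names and transcripts are hypothetical, and this is a simplification of any real benchmarking pipeline.

```python
# Hypothetical sketch: picking an ASR provider by word error rate (WER).
# Provider names and transcripts below are illustrative only.

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

def best_provider(reference: str, transcripts: dict) -> str:
    """Pick the provider whose transcript has the lowest WER on a test utterance."""
    return min(transcripts, key=lambda p: wer(reference, transcripts[p]))

reference = "i would like to book a table for two"
transcripts = {
    "provider_a": "i would like to book a table for two",
    "provider_b": "i would like to look a table for too",
}
print(best_provider(reference, transcripts))  # provider_a
```

In practice a benchmark would average WER over many utterances per language and accent, but the selection logic is the same: measure each provider against reference transcripts and keep the most accurate one per project.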

Pre-trained speech encoder in multiple languages

Our state-of-the-art machine learning model is pre-trained on over a billion English conversations, which gives us a world-class foundation for natural language understanding. 

We’re constantly building upon this foundation by pre-training our model in over 15 other European and non-European languages. Instead of building new models for each language, we incorporate new languages on demand in a matter of weeks. This enables us to achieve the fastest time to market for voice assistants that are truly capable of conversing with real customers in multiple languages.
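The key idea behind a multilingual encoder is that utterances in different languages map into one shared vector space, so intent examples written in English can match callers speaking other languages. The toy sketch below illustrates that with hard-coded vectors; `toy_encode` is a stand-in for a real pre-trained multilingual sentence encoder, and the example phrases and intents are invented for illustration.

```python
# Illustrative sketch of cross-lingual intent matching in a shared embedding
# space. TOY_VECTORS stands in for a real multilingual encoder: utterances
# with the same meaning land near each other regardless of language.

import math

TOY_VECTORS = {
    "book a table": [0.9, 0.1, 0.0],
    "einen tisch reservieren": [0.88, 0.12, 0.02],  # German: "reserve a table"
    "cancel my order": [0.05, 0.1, 0.95],
}

def toy_encode(utterance: str) -> list:
    """Stand-in for a pre-trained multilingual sentence encoder."""
    return TOY_VECTORS[utterance]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

INTENT_EXAMPLES = {  # intent examples written in English only
    "make_reservation": "book a table",
    "cancel_order": "cancel my order",
}

def classify(utterance: str) -> str:
    """Return the intent whose English example is closest in embedding space."""
    vec = toy_encode(utterance)
    return max(INTENT_EXAMPLES,
               key=lambda i: cosine(vec, toy_encode(INTENT_EXAMPLES[i])))

print(classify("einen tisch reservieren"))  # make_reservation
```

Because the German utterance sits close to its English counterpart in the shared space, the English-only intent examples still resolve it correctly; this is why adding a language to a shared encoder is faster than building a separate model per language.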

PolyAI voice assistants are pre-trained in multiple languages

A multilingual value extractor

Our proprietary value extractor, ConVEx, is also trained in multiple languages.

This means that PolyAI voice assistants are able to accurately take down valuable information, such as names and addresses, in any language. 

Our research has shown that our multilingual approach outperforms monolingual models. For example, by teaching our model German on top of its English foundation, our voice assistants identify information given by callers more accurately than both a model trained only on English and a model trained only on German. This mirrors similar findings by other leading tech companies like Facebook.

All this is to say, our voice assistants are better at collecting information from callers in different languages, making them uniquely suited for self-service experiences.
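Value extraction means finding the span of a caller’s utterance that fills a slot, such as a name or address. The sketch below is a much-simplified illustration using hand-written trigger phrases in two languages; a trained extractor like ConVEx learns these spans from data rather than relying on patterns, and the phrases here are invented for illustration.

```python
# Much-simplified sketch of multilingual span extraction for a "name" slot.
# A real trained extractor learns spans from data; the trigger patterns here
# are illustrative only.

import re

NAME_TRIGGERS = {
    "en": r"my name is\s+(?P<value>[\w\s'-]+)",
    "de": r"ich heiße\s+(?P<value>[\w\s'-]+)",  # German: "my name is"
}

def extract_name(utterance: str):
    """Return the name span from the utterance, or None if no trigger matches."""
    for pattern in NAME_TRIGGERS.values():
        match = re.search(pattern, utterance, flags=re.IGNORECASE)
        if match:
            return match.group("value").strip()
    return None

print(extract_name("hi, my name is Maria Lopez"))  # Maria Lopez
print(extract_name("hallo, ich heiße Jonas"))      # Jonas
```

The hard part a learned model solves, and this sketch does not, is handling the countless natural and informal ways callers actually phrase things (“it’s Maria, Maria Lopez”), which is exactly where span extractors trained on real conversations earn their keep.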

The future of multilingual customer service

Whether you’re an organisation trying to boost self-service for English-speaking or non-English-speaking customers, we see proofs of concept fall flat with real customers due to common issues that all relate back to three elements: 

  1. The ability of speech recognition to deal with accents and imperfect signals; 
  2. The robustness of speech encoders in each language; and 
  3. The accuracy of value extraction when dealing with natural and informal ways of speaking.

These barriers are not insurmountable. The technology to build multilingual self-service experiences is available today, but it requires an extra level of craftsmanship. Look for companies like PolyAI that have both the proprietary technology and the research expertise to help you create new self-service experiences for your customers in any language.

