The Truth About End-to-End and Unsupervised Learning

There’s a common perception that creating an AI agent can be a simple process of end-to-end learning, where labelled data is fed to a model and a conversational agent comes out neatly at the other end.

In conversational AI, end-to-end learning typically involves feeding a large number of labelled conversations into a neural network model so that it can try to learn how to mimic the behaviour it sees in those conversations.

There is also a perception that unsupervised learning – where a model tries to find insights in a large amount of unlabelled data – is a viable approach for building commercial conversational AI solutions.

In this post, we will explore the issues with end-to-end and unsupervised learning, and explain why these are not recommended for the development of AI agents that will interact with customers on behalf of a company.

The problem with end-to-end learning

While end-to-end learning can be used to create AI agents, this approach is not recommended for commercial development. This is because most (if not all) corporations are unable to provide the quantity and quality of data needed to yield a commercially viable AI agent.

In order for end-to-end learning to really work, the data that the model is trained on needs to accurately reflect all of the informational inputs that an agent sees in real life, as well as everything the agent does (such as checking other systems).

In other words, the transcripts of conversations would need to cover every single input that was, or could be, used to complete each and every transaction. This is not just about the customer’s responses, but also the options they have chosen on an IVR and every single action an agent has taken during the conversation (e.g. looking up an order number or searching a database for a product). Unsurprisingly, collecting this sort of data can be incredibly difficult.
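To make this concrete, here is a rough sketch of what a single "contextually complete" training record might have to hold. The class and field names (AgentAction, Turn, TrainingConversation, ivr_selections and so on) are hypothetical and chosen purely for illustration; they are not a real schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AgentAction:
    """One thing the human agent did outside the spoken conversation."""
    kind: str      # hypothetical action name, e.g. "lookup_order"
    argument: str  # e.g. the order number that was looked up
    result: str    # what the back-end system returned


@dataclass
class Turn:
    speaker: str  # "customer" or "agent"
    text: str
    # Actions the agent took before speaking this turn (empty for customers).
    actions: List[AgentAction] = field(default_factory=list)


@dataclass
class TrainingConversation:
    ivr_selections: List[str]  # options chosen on the IVR before the call
    turns: List[Turn]

    def is_contextually_complete(self) -> bool:
        """Crude check: the IVR path is known and at least one agent turn
        records the system actions behind it."""
        has_actions = any(t.actions for t in self.turns if t.speaker == "agent")
        return bool(self.ivr_selections) and has_actions


# What most companies actually have: the words, and nothing else.
transcript_only = TrainingConversation(
    ivr_selections=[],
    turns=[
        Turn("customer", "Where is my order 1234?"),
        Turn("agent", "It ships tomorrow."),
    ],
)

# What end-to-end training would need: the words plus everything behind them.
complete = TrainingConversation(
    ivr_selections=["orders", "track_delivery"],
    turns=[
        Turn("customer", "Where is my order 1234?"),
        Turn(
            "agent",
            "It ships tomorrow.",
            actions=[AgentAction("lookup_order", "1234", "ships tomorrow")],
        ),
    ],
)
```

In the transcript-only record, nothing explains where "It ships tomorrow" came from, so a model trained on it can only guess.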

End-to-end learning in the wild

End-to-end learning is a very popular area of research, but these approaches are not controlled enough to train AI agents that will speak to customers on behalf of a corporation.

If we were to employ end-to-end learning with a client’s data, we would not be able to guarantee reasonable or even appropriate responses from the resulting AI agent. We would have no final control over how the system behaves, and no way of correcting behaviour we don’t want it to exhibit.

For example, let’s say that, in your transcripts, whenever a customer asks for the agent’s name, the agent gives their own name, e.g. “I’m Alice” or “I’m Bob”.

How might a system trained end-to-end on these transcripts behave? The most likely outcome is that it comes up with a different name every time, or produces just plain gibberish.
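The name problem can be imitated with a deliberately oversimplified toy. The TranscriptMimic class below is hypothetical and is not a neural network: it simply memorises which agent replies followed each customer utterance and samples one of them, which is roughly the behaviour the transcripts reward.

```python
import random
from collections import defaultdict


class TranscriptMimic:
    """Toy stand-in for an end-to-end model: it memorises which agent replies
    followed each customer utterance and samples one of them at random."""

    def __init__(self, seed=None):
        self._replies = defaultdict(list)
        self._rng = random.Random(seed)

    def train(self, transcripts):
        for customer_text, agent_text in transcripts:
            self._replies[customer_text.lower()].append(agent_text)

    def respond(self, customer_text):
        candidates = self._replies.get(customer_text.lower())
        if not candidates:
            return None  # nothing like this was seen in training
        return self._rng.choice(candidates)


transcripts = [
    ("What's your name?", "I'm Alice"),
    ("What's your name?", "I'm Bob"),
]

mimic = TranscriptMimic(seed=0)
mimic.train(transcripts)

# Ask the same question many times and collect the distinct answers.
answers = {mimic.respond("What's your name?") for _ in range(20)}
```

Every answer is one of the names found in the data, and nothing in the training signal tells the system which single name it should settle on.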

Here’s another example: imagine that the transcripts were collected during a period when there was a special 2-for-1 offer on a product the client is selling. The system will not know when this offer has ended, and will continue telling customers (incorrectly) that they are eligible for it.

Worse, there would be no single reliable way of correcting the model, because the only way it can learn is through collecting more data or editing the data you already have.

Training on the behaviour of human agents poses another problem: you want your conversational AI to act differently from your human agents. There are some behaviours – such as being able to hand off the call to a human if the AI can’t support a certain intent, or becomes confused – that historical data does not contain.

Unsupervised learning

Unsupervised learning is the process whereby the model tries to learn something about data that doesn’t have any explicit labels applied to it.

Instead of labels, the model will assume certain properties about the data based on correlations. For example, “sentences that have lots of words in common are more likely to mean the same thing than those that don’t.” In this case, the model would use this assumption to try to group sentences by their meaning, and thus find the different intents a user might express.

Not only are these methods (known as cluster analysis) very temperamental, they usually tell you something you already know about the data: the percentage of conversations about each topic (e.g. 25% of conversations are about refunds, 30% about cancelling subscriptions, etc.).
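As a minimal sketch of the word-overlap assumption described above, the snippet below groups sentences by Jaccard similarity (shared words divided by all distinct words). The greedy single-pass strategy, the 0.4 threshold and the sample utterances are all assumptions made for illustration; real cluster analysis would typically use richer representations.

```python
def tokens(sentence):
    """Lower-case a sentence and return its set of words."""
    return {w.strip(".,!?") for w in sentence.lower().split()} - {""}


def jaccard(a, b):
    """Word overlap: shared words divided by all distinct words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(sentences, threshold=0.4):
    """Greedy single-pass clustering: put each sentence into the first
    cluster whose first member is similar enough, else start a new one."""
    clusters = []
    for s in sentences:
        for c in clusters:
            if jaccard(tokens(s), tokens(c[0])) >= threshold:
                c.append(s)
                break
        else:
            clusters.append([s])
    return clusters


utterances = [
    "I want a refund",
    "I want a refund please",
    "cancel my subscription",
    "please cancel my subscription",
    "can I get my money back",
]
groups = cluster(utterances)
```

With these five utterances, groups contains three clusters: the two "refund" sentences, the two "cancel" sentences, and "can I get my money back" on its own. The last one is a refund request too, but it shares almost no words with the others, which is a small taste of how temperamental this kind of grouping can be.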

Unsupervised learning is a very immature area of machine learning research and is still a long way off being accurate enough to be reliably used in real-life applications.

Pre-training, fine-tuning and domain-specific design

At PolyAI, we take a more careful approach to training, because our clients need to be in control of their brands. It’s okay if a Google experiment goes awry because it’s seen as research. Most corporations don’t have that luxury, and even if they did, they would not gamble their customer relationships on it.

We don’t need to train our AI agents on vast client data. Our Encoder Model has already been end-to-end trained on billions of conversational examples, so it has a really good understanding of how language works.

We build AI agents on top of the Encoder Model, so we only need small amounts of client-specific data to fine-tune the model through transfer learning to get accurate and reliable results.
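One simple form of transfer learning is to leave a pre-trained encoder untouched and fit only a small layer on top of it with a handful of client examples. The sketch below illustrates that idea only; it is not PolyAI's implementation. The toy_encode function is a hypothetical bag-of-words stand-in for a real neural encoder, and IntentHead and the example intents are likewise assumptions.

```python
import math
from collections import Counter, defaultdict


def toy_encode(text):
    """Stand-in for a frozen, pre-trained encoder: a normalised sparse
    bag-of-words vector. A real encoder would be a neural network."""
    counts = Counter(w.strip(".,!?") for w in text.lower().split())
    counts.pop("", None)
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {w: c / norm for w, c in counts.items()}


def dot(u, v):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(x * v.get(w, 0.0) for w, x in u.items())


class IntentHead:
    """Small client-specific layer: one averaged vector (centroid) per intent.
    Only this part is fitted; the encoder is left untouched."""

    def __init__(self, encode):
        self.encode = encode
        self.centroids = {}

    def fit(self, examples):
        sums = defaultdict(lambda: defaultdict(float))
        counts = Counter()
        for text, intent in examples:
            for w, x in self.encode(text).items():
                sums[intent][w] += x
            counts[intent] += 1
        self.centroids = {
            intent: {w: x / counts[intent] for w, x in vec.items()}
            for intent, vec in sums.items()
        }

    def predict(self, text):
        """Return the intent whose centroid has the highest dot product
        with the encoded query."""
        query = self.encode(text)
        return max(self.centroids, key=lambda i: dot(query, self.centroids[i]))


# A handful of client-specific examples, instead of thousands of transcripts.
client_examples = [
    ("I would like a refund", "refund"),
    ("please refund my order", "refund"),
    ("cancel my subscription", "cancel"),
    ("I want to cancel my plan", "cancel"),
]
head = IntentHead(toy_encode)
head.fit(client_examples)
```

Here head.predict("can I have a refund") returns "refund" and head.predict("cancel my plan please") returns "cancel". With a genuinely pre-trained encoder in place of the toy one, the same pattern could also cope with phrasings that share no words with the examples.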

Rather than requesting hundreds of thousands of historic call transcriptions, we use FAQs, training manuals, process books and knowledge bases to give our AI agents the answers to the questions customers might ask. We are then able to program the agent’s behaviour explicitly, giving us complete control.
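As a rough illustration of what explicitly programmed behaviour can look like, the sketch below answers from a small knowledge base and applies hand-written rules for the agent's name, an expired offer and the handoff to a human. Everything in it (the intent names, FAQ_ANSWERS, OFFER_END, the 0.7 confidence threshold) is hypothetical, and the intent and confidence are assumed to come from an upstream classifier that is outside the sketch.

```python
from datetime import date

# Hypothetical knowledge base built from FAQs rather than call transcripts.
FAQ_ANSWERS = {
    "opening_hours": "We are open 9am to 5pm, Monday to Friday.",
    "refund_policy": "Refunds are available within 30 days of purchase.",
}

# Hypothetical promotion with an explicit end date, editable without retraining.
OFFER_END = date(2024, 6, 30)

HANDOFF = "Let me transfer you to a colleague who can help."


def respond(intent, confidence, today, min_confidence=0.7):
    """Pick a reply from explicit rules.

    `intent` and `confidence` are assumed to come from an upstream
    intent classifier, which is outside this sketch.
    """
    if confidence < min_confidence:
        return HANDOFF  # the agent is unsure, so hand off to a human
    if intent == "agent_name":
        return "I'm the virtual assistant."  # one fixed, chosen answer
    if intent == "two_for_one_offer":
        if today <= OFFER_END:
            return "Yes, the 2-for-1 offer is available."
        return "Sorry, that offer has ended."
    # Known FAQ intents get their answer; unsupported intents go to a human.
    return FAQ_ANSWERS.get(intent, HANDOFF)
```

Changing the offer's end date or the agent's name is a one-line edit here, whereas an end-to-end model would need new or edited training data to change the same behaviour.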

Conclusion

End-to-end learning is not impossible, but without vast, contextually complete data, this approach cannot be relied on by corporations looking to implement conversational AI agents to assist their customers.

While unsupervised learning can yield interesting results from some data, it is unlikely to provide anything other than what is already known within a contact center or customer service team.

Companies looking to create domain-specific AI customer service agents will benefit most from a blended development approach where models are trained manually on small amounts of context-rich data, and artificial intelligence is used to develop smart agent behaviour.