The DNA of an Intelligent Virtual Agent
Intelligent virtual agents are zeros and ones arranged to simulate humanity as closely as possible. That’s the simplest way to describe them to a time traveler from before 1956 who lands in 2021 and is trying to place the accent on the other end of the weird tiny computer held up to their ear as they call customer service about their order of Heelys (gasp, shoes AND skates! At the same time?!).
But anyone from this time knows that definition doesn’t do IVAs justice. The details matter, as they usually do when explaining the intricacies of modern technology. While artificial intelligence isn’t human, it has a DNA of its own, and each piece of it contributes to a superb customer experience in a contact center solution. So, let’s plunge into the specifics of the DNA that makes up an intelligent virtual agent.
Recognition and Cognition
Conversational AI begins like any human-to-human conversation. There’s hearing, and then there’s understanding: two different functions that work together to create the experience. In AI, we call them recognition and cognition. Recognition is ASR (automatic speech recognition), which converts sound waves to words. Then you have to make sense of those words. Using machine learning, an NLU (natural language understanding) engine pieces the words together to extract the intent. Strong NLU techniques use pattern matching, comparing the word combinations heard against the intents expected in a given scenario, to determine user intent accurately.
Automatic Speech Recognition
ASR has only one purpose: to take the human voice as input and convert it to text. It breaks the incoming waveform of each utterance into small units of sound, then patches them together to form words by running statistical models against the words in its dictionary. But that’s not enough. Speech-to-text is never 100% accurate. What elevates the conversational AI experience is the NLU engine.
Natural Language Understanding Engine
Most voice providers augment their conversational AI with contextual NLU only. Contextual NLU looks at a word in the context of the sentence to determine whether it is, for example, the number 4 or the word f-o-r. You use this technology every day when you use speech-to-text on your phone. In real time, you see a word change based on the context around it.
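To make the 4-versus-f-o-r example concrete, here is a deliberately tiny sketch in Python. It is not how a production engine works (those rely on statistical language models rather than hand-written rules), and the cue lists and the function name resolve_homophone are made up for illustration. It only shows the idea of letting neighboring words decide how an ambiguous sound gets written down.

```python
# Toy illustration only: real engines use statistical language models,
# not hand-written word lists. Both cue sets below are hypothetical.
NUMBER_CUES = {"press", "option", "number", "size", "gate"}   # words that often precede a digit
UNIT_WORDS = {"pair", "pairs", "items", "boxes", "days"}      # words that often follow a digit


def resolve_homophone(tokens: list[str]) -> list[str]:
    """Rewrite an ambiguous 'for' as '4' when its neighbors suggest a number."""
    resolved = []
    for i, token in enumerate(tokens):
        prev_word = tokens[i - 1] if i > 0 else ""
        next_word = tokens[i + 1] if i + 1 < len(tokens) else ""
        if token == "for" and (prev_word in NUMBER_CUES or next_word in UNIT_WORDS):
            resolved.append("4")
        else:
            resolved.append(token)
    return resolved


heard = "i ordered for pairs for my kids".split()
print(" ".join(resolve_homophone(heard)))  # i ordered 4 pairs for my kids
```

The first "for" sits in front of "pairs" and becomes a digit, while the second is left alone because nothing around it suggests a number.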
But more advanced NLU techniques take the customer experience further. As mentioned before, pattern matching goes beyond the context of a sentence: it anticipates the intents a caller is likely to express. An NLU engine tuned to listen only for intents specific to an industry, and customized for a client, makes all the difference in the world. If a speech-to-text engine delivers output that doesn’t match an intent, this NLU approach can kick in to run hypotheses and confidence scores against the acoustic models of the intents it’s expecting to hear.
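A rough sketch of that idea, under loud assumptions: the intents and phrases below are invented, and plain text similarity from Python’s standard difflib module stands in for the acoustic-model scoring described above, which a real engine would perform on the audio itself. What it does show is the shape of the technique: score a shaky transcript against a short list of expected intents and accept the best one only if its confidence clears a threshold.

```python
from difflib import SequenceMatcher

# Hypothetical intents a shoe retailer's IVA might be tuned to expect.
EXPECTED_INTENTS = {
    "order_status": ["where is my order", "track my order", "order status"],
    "return_item": ["return my shoes", "i want to return an item"],
    "speak_to_agent": ["talk to a person", "speak to an agent"],
}


def score_intents(transcript: str) -> dict[str, float]:
    """Give each expected intent a 0-1 score: its best text similarity to the transcript."""
    heard = transcript.lower()
    return {
        intent: max(SequenceMatcher(None, heard, phrase).ratio() for phrase in phrases)
        for intent, phrases in EXPECTED_INTENTS.items()
    }


def match_intent(transcript: str, threshold: float = 0.6):
    """Return (intent, confidence), or (None, confidence) when nothing scores high enough."""
    scores = score_intents(transcript)
    best = max(scores, key=scores.get)
    if scores[best] >= threshold:
        return best, scores[best]
    return None, scores[best]


# A mis-transcription ("wear" for "where") still lands on the expected intent.
print(match_intent("wear is my order"))
```

Because the engine is only listening for a handful of intents, a near miss from speech-to-text can still resolve to the right one. The 0.6 threshold is an arbitrary placeholder; in practice it would be tuned against real call data.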
Team of Experts
Having the technology available to you isn’t enough. Humans are required. Just as living beings require care and feeding, intelligent virtual agents require the same consideration to keep operating with high accuracy as the world and its people evolve. Greater containment of calls is only possible if the work is put in over time. Much like we must put in the time to get better at playing guitar or painting, IVAs must be tuned and optimized to become better, too.
With a team of experts on hand, conversational AI gets constant care in the form of enhancements, upgrades, training, and tuning. All of that feeds reports and analytics that tell the experts how and when to optimize. It takes a village to raise a child, and this mentality applies to conversational AI in the same way.
Human-Centered Design
A few months back, I wrote about human-centered design being the heart of an intelligent virtual agent. That remains true, as artificial intelligence is designed to do as humans do. That means building in human elements: greeting callers by name to build a connection right off the bat, using a natural-sounding voice rather than a robotic one, and asking the right questions, all with empathy at the forefront of the design. This brings a customer down from the initial point of tension so that any issue can be resolved quickly.
Guardrails
A big part of designing productive IVAs is using data to establish handling rules for when a call should stay with a virtual agent and when it should be transferred to a live agent. Not all calls are meant to be automated. Conversational AI isn’t here to replace humans; it’s here to aid them. Guardrails are business rules, drawn from your data, that identify the exclusions: the cases where a call goes outside the automated path and should be transferred. By using them, you ensure that calls are deflected only in the cases you’ve established.
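Guardrails like these can be pictured as a short list of checks that run before the virtual agent keeps going. The sketch below is only an illustration: the field names, the set of automated intents, and the thresholds are all assumptions, and real rules would come out of your own call data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallContext:
    """What the IVA knows about the call so far (hypothetical fields)."""
    intent: Optional[str]          # intent the NLU engine settled on, if any
    confidence: float              # how sure the engine is, from 0 to 1
    failed_attempts: int           # times the caller has had to repeat themselves
    caller_requested_agent: bool   # the caller asked for a person


# Hypothetical list of call types the business has chosen to automate.
AUTOMATED_INTENTS = {"order_status", "store_hours"}


def route_call(call: CallContext, min_confidence: float = 0.6, max_failed_attempts: int = 2) -> str:
    """Return 'virtual_agent' or 'live_agent' by checking each guardrail in turn."""
    if call.caller_requested_agent:
        return "live_agent"        # never trap someone who asks for a human
    if call.intent not in AUTOMATED_INTENTS:
        return "live_agent"        # outside the automated path
    if call.confidence < min_confidence:
        return "live_agent"        # the engine isn't sure what it heard
    if call.failed_attempts >= max_failed_attempts:
        return "live_agent"        # the caller is struggling; hand off
    return "virtual_agent"


print(route_call(CallContext("order_status", 0.92, 0, False)))  # virtual_agent
print(route_call(CallContext("billing_dispute", 0.95, 0, False)))  # live_agent
```

Keeping each rule as its own line makes the exclusions easy to review with the business and easy to adjust as the data changes.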
Developing conversational AI and, ultimately, an intelligent virtual agent that automates many calls, repetitive or otherwise, is a complicated but doable process, so long as all the parts that make a successful IVA are in place. So, when a time traveler asks you, “Where is this accent from?” you can now explain in intricate, excruciating detail just who exactly they’re talking to.