Architecture of a Conversational AI system
Exploring the 5 essential building blocks of a Conversational AI system
Chatbots are everywhere now! By chatbots, I mean all conversational AI bots — be it actions/skills on smart speakers, voice bots on the phone, chatbots on messaging apps, or assistants in web chat.
All of them have the same underlying purpose — to do as a human agent would do and allow users to self-serve using a natural and intuitive interface — natural language conversation.
If you break down the design of a conversational AI experience into parts, you will see at least five: User Interface, AI technology, Conversation design, Backend integration, and Analytics.
If you are a big organisation, you may have separate teams for each of these areas. However, these components need to be in sync and work with a singular purpose in mind in order to create a great conversational experience.
“The whole is greater than the sum of its parts.” — Aristotle
User Interface is the portal to the conversational experience. There are many now. Chat widgets you see on websites, messaging apps like Facebook Messenger and Slack, smart speakers like Google Home and Amazon Echo, and traditional channels like telephony and SMS are all conversational channels, each offering its own unique user interface.
UI design is the process of designing the actual interface: how the web chat widget should look, what functionalities it needs to have, CSS styling, the look and feel of speech bubbles, other elements like cards and carousels, designing new cards tailored to your domain, and so on.
Design UI elements for an overall intuitive experience. There are many principles we can use to design and deliver a great UI — Gestalt principles for visual elements, Shneiderman’s Golden Rules for functional UI design, and Hick’s law for better UX.
Being able to design the UI gives you more control over the overall experience, but it also brings more responsibility. If human agents act as a backup team, your UI must be robust enough to handle traffic to both human agents and the bot. In the case of voice UIs, such as telephony, UI design involves choosing the voice of the agent (male or female, accent, etc.), turn-taking rules (push to talk, always open, etc.), barge-in rules, channel noise, and more.
It may be the case that UI already exists and the rules of the game have just been handed over to you. For instance, building an action for Google Home means the assistant you build simply needs to adhere to the standards of Action design. You don’t get to do any UI design.
That’s ok. But do pay attention to the nuances of the new UI. What does it afford? How different is it from, say, telephony, which also supports natural human-to-human speech? Understanding the UI design and its limitations helps you design the other components of the conversational experience.
AI tech is the central component in the design of a Conversational AI solution. The UI module usually connects to the AI module in the backend. This covers everything from making the system recognise the user’s input utterances and understand their intent in the given context, to taking action and responding appropriately.
This also includes the technology required to maintain conversational context so that if the conversation derails into an unhappy path, the AI assistant, the user, or both can repair it and bring it back on track.
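As a minimal sketch of what such context maintenance might look like, here is a toy dialogue state that tracks collected entities and escalates after repeated misunderstandings. All names, thresholds, and the escalation policy are illustrative assumptions, not taken from any specific framework:

```python
# Toy conversational context with a simple repair strategy:
# after repeated low-confidence turns, the assistant re-prompts
# and eventually hands over to a human agent.

class DialogueContext:
    def __init__(self, max_retries=2):
        self.slots = {}              # collected entities, e.g. {"city": "Paris"}
        self.retries = 0             # consecutive misunderstood turns
        self.max_retries = max_retries

    def update(self, intent, confidence, entities):
        if confidence < 0.5:         # low confidence: conversation may be derailing
            self.retries += 1
            if self.retries > self.max_retries:
                return "escalate_to_human"
            return "ask_rephrase"    # repair: prompt the user to rephrase
        self.retries = 0             # back on the happy path
        self.slots.update(entities)
        return intent
```

In a real system, this state would also carry the dialogue history and be managed by the platform's dialogue manager rather than hand-rolled.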
There are many out-of-the-box solutions for conversational AI. These use machine learning to map user utterances to intents and a rule-based approach for dialogue management (e.g. Dialogflow, Watson, LUIS, Lex, Rasa, etc.). In addition, the understanding power of the assistant can be enhanced with other NLP methods and machine learning models.
For instance, the context of the conversation can be enriched by using sentiment/emotion analysis models to recognise the emotional state of the user during the conversation. Deep learning approaches like transformers can be used to fine-tune pre-trained models to enhance contextual understanding. Response models are also evolving. I have explored these in a separate post — 5 Models for Conversational AI.
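To make the utterance-to-intent mapping concrete, here is a toy classifier based on simple word overlap. Real platforms such as Dialogflow or Rasa use trained machine learning models, so treat this purely as an illustration of the mapping idea; the intents and example utterances are invented:

```python
# Toy illustration of mapping user utterances to intents via
# bag-of-words overlap with example phrases per intent.

from collections import Counter

TRAINING = {
    "check_balance": ["what is my balance", "show my account balance"],
    "transfer_money": ["send money to a friend", "transfer funds"],
}

def score(utterance, examples):
    # best word overlap between the utterance and any training example
    words = Counter(utterance.lower().split())
    best = 0
    for ex in examples:
        overlap = sum((words & Counter(ex.split())).values())
        best = max(best, overlap)
    return best

def classify(utterance):
    # pick the intent whose examples share the most words with the input
    return max(TRAINING, key=lambda intent: score(utterance, TRAINING[intent]))
```

A production NLU model would also return a confidence score and extracted entities alongside the intent.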
Designing solutions that use these models, orchestrate between them optimally, and manage the interaction with the user is the job of the AI designer/architect. In addition, these solutions also need to be scalable, robust, resilient, and secure.
If AI designers design the engine, conversation designers design and develop the fuel that will run the engine. Conversation design deals with the actual conversational journey between the user and the chatbot. How does the conversation flow? What patterns will they follow? What will happen if the conversation breaks down?
Designing these patterns, exception rules, and elements of interaction is part of script design. Conversation designers also design the elements of understanding — intents, entities, and other elements of the domain ontology and conversational framework that the AI modules require to drive the conversation. In bigger teams, the understanding and management parts will be split between data scientists and conversation designers respectively.
Conversation design is a very creative process. User experience design is an established field of study that can provide us with great insights for developing a great experience.
Understanding customer behaviour can also help us build a better UX. Michelle Parayil has neatly summed up the different roles conversation designers play in delivering a great conversational experience. The Conversation Design Institute (formerly Robocopy) has identified a codified process one can follow to deliver an engaging conversational script.
Conversation designers could use a number of tools to support their process. Conversation Driven Development, Wizard-of-Oz, Chatbot Design Canvas are some of the tools that can help. Mockup tools like BotMock and BotSociety can be used to build quick mockups of new conversational journeys.
Tools like Botium and QBox.ai can be used to test trained models for accuracy and coverage. If custom models are used to build enhanced understanding of context, the user’s goals, emotions, etc., appropriate ModelOps processes need to be followed. At the end of the day, the aim here is to deliver an experience that transcends the duality of dialogue into what I call the Conversational Singularity.
A Conversational AI assistant is not of much use to a business if it cannot connect and interact with existing IT systems. Depending on the conversational journeys supported, the assistant will need to integrate with backend systems. For instance, if the conversational journeys support marketing of products/services, the assistant may need to integrate with CRM systems (e.g. Salesforce, HubSpot, etc.).
If the journeys are about after-sales support, then it needs to integrate with customer support systems to create and query support tickets and CMS to get appropriate content to help the user.
In addition to these, it is almost a necessity to create a support team — a team of human agents — to take over conversations that are too complex for the AI assistant to handle. Such an arrangement requires backend integration with livechat platforms too. Making sure that the systems return informative feedback can help the assistant be more helpful.
For instance, if the backend system returns an error message, it would be helpful if the assistant can translate it into a suggestion for an alternative action the user can take. In summary, well-designed backend integrations make the AI assistant more knowledgeable and capable.
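As a hypothetical example of such error translation, a thin mapping layer might sit between the backend and the response generator. The error codes and wording below are invented purely for illustration:

```python
# Illustrative translation of raw backend error codes into helpful
# responses that suggest a next step, rather than exposing the
# error to the user verbatim.

FRIENDLY = {
    "ACCOUNT_LOCKED": "Your account appears to be locked. "
                      "Would you like me to start the unlock process?",
    "PAYMENT_DECLINED": "That payment didn't go through. "
                        "Shall we try a different card?",
}

def translate_error(error_code):
    # unknown errors fall back to a graceful hand-off to a human agent
    return FRIENDLY.get(
        error_code,
        "Something went wrong on our side. Let me connect you to an agent.",
    )
```

The fallback branch is also where the live-chat hand-off mentioned above would be triggered.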
Finally, the last part of the design puzzle is the analytics solution. This includes designing solutions to log conversations, extracting insights, visualising the results, monitoring models, resampling for retraining, etc. Conversational AI solutions are relatively new. And for that reason they require constant monitoring and calibration.
How would you know if your AI assistant is doing its job? Designing an analytics solution becomes essential to create a feedback loop that makes your AI-powered assistant a learning system. Many out-of-the-box solutions are available — BotAnalytics, Dashbot.io, Chatbase, etc. Or you could build one using tools like Tableau, IBM Cognos, etc.
These solutions provide invaluable insights into the performance of the assistant. What sorts of utterances does the assistant understand, and which does it not? Where do users drop off and why? How long and effortful were the conversations? Was failure handled gracefully? Were users satisfied? Did they have an experience good enough to share with their friends?
These are some of the questions that the analytics module needs to answer. These metrics will serve as feedback for the team to improve and optimise the assistant’s performance. Remember that when using machine learning, the models will be susceptible to model drift, which is the phenomenon of models getting outdated over time as users move on to different conversation topics and behaviours. This means the models need to be retrained periodically based on the insights generated by the analytics module.
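One simple way such a feedback loop can flag drift is to compare the intent distribution in recent conversations against a historical baseline. The metric and threshold below are illustrative assumptions, not a prescribed method:

```python
# Illustrative drift check: if the distribution of recognised intents
# in recent logs diverges far from the baseline, the NLU model may be
# going stale and need retraining.

from collections import Counter

def intent_distribution(intents):
    counts = Counter(intents)
    total = sum(counts.values())
    return {intent: n / total for intent, n in counts.items()}

def total_variation(baseline, recent):
    # total variation distance between two distributions (0 to 1)
    keys = set(baseline) | set(recent)
    return 0.5 * sum(abs(baseline.get(k, 0) - recent.get(k, 0)) for k in keys)

def needs_retraining(baseline_logs, recent_logs, threshold=0.2):
    return total_variation(
        intent_distribution(baseline_logs),
        intent_distribution(recent_logs),
    ) > threshold
```

In practice you would also track fallback rates and per-intent confidence, not just distribution shift.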
There you go! The 5 essential building blocks of a great conversational assistant — User Interface, AI tech, Conversation design, Backend integrations, and Analytics. You may not build them all yourself, as most of these can be picked off the shelf these days. But we need to understand them well and make sure all these blocks work in synergy to deliver a conversational experience that is useful, delightful, and memorable.
Hope you enjoyed this write-up. This article was originally published in Analytics Vidhya. You can find more of my articles on Medium. Please do read and share your comments. Have a great day!