
Using PhoneMyBot to add voice to a chatbot
> BEST PRACTICES <
PhoneMyBot is Interactive Media’s service that allows chatbots – conversational AI applications that use text or web channels to communicate with users – to add the voice channel and have voice conversations with people quickly and easily. This could be over the phone or other voice networks. PhoneMyBot takes care of converting voice into text and text into voice, with a number of features to make this process easier and more pleasant for users, but chatbots should still be mindful of the ways voice conversations differ from text-based ones and adjust their side for a better conversational voice experience.
This document provides a number of tips for refining a chatbot’s dialog so that it adapts better to voice.

1. START BEING SURE OF YOURSELF
One of the reasons people deploy voicebots is to lighten the load on human agents who receive telephone calls from customers. There is a good chance that the voicebot will be able to handle the most common tasks. A call should be forwarded to a human agent only if the voicebot cannot service it. So, don’t start by telling users that they can talk with a human by saying “human agent” at any time: only do that when it’s clear that the voicebot will not be able to help. Otherwise, that’s exactly what most people will do: say “human agent” immediately.


2. BE AS BRIEF AS POSSIBLE
The average reader can read about 300 written words per minute, but talking is slower. Podcast narrators normally speak 150-160 words per minute, as this is the optimal speed for listeners. This means that a text the chatbot would put out on a text-based channel takes about twice as long to be spoken as to be read! It makes sense, then, to be as brief as possible on the voice channel, avoiding preambles and, if necessary, splitting a text into two or more parts. This is also good practice on text channels, by the way; it’s just easier to get away with longer texts there.
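To make the arithmetic concrete, here is a minimal Python sketch that estimates how long a prompt takes to read and to speak, and splits a long prompt at sentence boundaries. The function names, the 155 words-per-minute midpoint, and the 10-second limit are illustrative assumptions, not PhoneMyBot settings.

```python
import re

READING_WPM = 300   # silent reading speed quoted in the text
SPEAKING_WPM = 155  # assumed midpoint of the 150-160 range quoted in the text


def count_words(text):
    """Count whitespace-separated words in a prompt."""
    return len(text.split())


def seconds_to_read(text):
    """Estimated time for a user to read the prompt silently."""
    return count_words(text) / READING_WPM * 60


def seconds_to_speak(text):
    """Estimated time for a TTS voice to speak the prompt."""
    return count_words(text) / SPEAKING_WPM * 60


def split_prompt(text, max_seconds=10.0):
    """Group sentences into parts that each take at most max_seconds to speak.

    A single sentence that is longer than max_seconds is kept whole.
    The 10-second default is an arbitrary, hypothetical threshold.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    parts, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and seconds_to_speak(candidate) > max_seconds:
            parts.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        parts.append(current)
    return parts
```

With these assumed rates, any prompt takes about 1.9 times longer to speak than to read, so a prompt that feels short on screen can easily exceed the limit when spoken.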


3. AVOID TEXT-ONLY EXPRESSIONS
In a text interaction, a chatbot could say something like “Please see below for the list of options”. Of course, this does not work for voice, besides being a bit clunky for text too. It is much better to say something like “Please choose one of:”. Be on the lookout for this sort of thing.
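One way to stay on the lookout is to scan the chatbot’s prompts automatically. The sketch below is only an illustration: the phrase list and the function name are hypothetical and should be extended from a review of your own prompts.

```python
import re

# Hypothetical list of phrases that only make sense on a screen.
TEXT_ONLY_PATTERNS = [
    r"\bsee (?:below|above)\b",
    r"\bclick\b",
    r"\bas shown\b",
    r"\bthe following link\b",
]


def find_text_only_expressions(prompt):
    """Return the text-only phrases found in a prompt (one per pattern at most)."""
    found = []
    for pattern in TEXT_ONLY_PATTERNS:
        match = re.search(pattern, prompt, flags=re.IGNORECASE)
        if match:
            found.append(match.group(0))
    return found
```

Running this over every prompt in the chatbot’s knowledge base gives a quick list of candidates to reword before they reach the voice channel.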


4. AVOID WEB ELEMENTS
Chatbots act on text channels, but also on web pages. An interaction on a web page can be richer than mere text: users can be presented with buttons to click, forms, pictures, even videos. Needless to say, these elements don’t translate well into voice. In this sense, an interaction through PhoneMyBot is similar to one that uses a more text-oriented channel like WhatsApp, but even more constrained. So, the recommendation is to avoid all visual elements if possible. Note that PhoneMyBot has settings that allow substituting a string coming from the chatbot with another pre-defined string. For instance, these can be used to look for and substitute a particular URL string (in the form https://some-URL) with a sentence like: “please see our website”. This can be useful, but the best approach is to write prompts that need no substitutions at all.
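The same idea can also be applied on the chatbot side before a reply is sent to the voice channel. The following Python sketch is an assumption-laden illustration, not PhoneMyBot’s settings syntax: the pattern, the replacement sentence, and the function name are all hypothetical.

```python
import re

# Matches an http(s) URL but leaves trailing sentence punctuation in place.
URL_PATTERN = re.compile(r"https?://\S+?(?=[.,;:!?]?(?:\s|$))")

# Replacement sentence taken from the example in the text.
SPOKEN_REPLACEMENT = "please see our website"


def make_voice_friendly(text):
    """Replace every URL in a chatbot reply with a sentence that can be spoken."""
    return URL_PATTERN.sub(SPOKEN_REPLACEMENT, text)
```

For example, `make_voice_friendly("For details, https://example.com/help.")` returns `"For details, please see our website."`.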


5. EXPAND THE EXPECTED UTTERANCES
It is easier to guide users to type something than to say something. People use many more turns of phrase when speaking than when writing, so the conversational AI engine must be prepared for more variety. This can be done incrementally: review past interactions, focusing on sentences that were not understood, and see if the conversation can be improved by adding synonyms, turns of phrase, and idiomatic expressions to the chatbot’s knowledge base.
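A review like this can start from a simple frequency count of the utterances that were not understood. The sketch below assumes a hypothetical log format (a list of dictionaries with `utterance` and `understood` keys); adapt it to whatever your conversational AI platform actually exports.

```python
from collections import Counter


def most_common_misses(interactions, top_n=3):
    """Return the most frequent utterances that the chatbot did not understand.

    `interactions` is assumed to be a list of dicts such as
    {"utterance": "gimme my balance", "understood": False}.
    """
    misses = Counter(
        item["utterance"].strip().lower()
        for item in interactions
        if not item["understood"]
    )
    return misses.most_common(top_n)
```

The utterances at the top of the list are the best candidates for new synonyms or training phrases.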


6. USE PHONEMYBOT TAGS TO MAKE THE CONVERSATION MORE LIVELY
Text-to-speech services provide ways to enrich their voice output so that it mimics the way people speak, using emphasis, volume, pronunciation, etc. PhoneMyBot supports a simplified subset of SSML (Speech Synthesis Markup Language) and takes care of implementing these directions with the TTS services it uses. Chatbots can sound more natural to users by adding tags to their text, which are then interpreted and managed by PhoneMyBot.
Please see https://wiki.phonemybot.com/en/markup-languages/SSML-support for details on the supported tags and how to use them.
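As a rough illustration of tagging a reply, the Python sketch below builds a string with two tags from the standard SSML specification, `<emphasis>` and `<break>`. Whether PhoneMyBot’s simplified subset includes these exact tags and attributes is an assumption here, so check the wiki page above before relying on them. The helper names are hypothetical.

```python
from xml.sax.saxutils import escape


def emphasize(text, level="strong"):
    """Wrap text in a standard SSML emphasis tag (assumed to be supported)."""
    return f'<emphasis level="{level}">{escape(text)}</emphasis>'


def pause(milliseconds):
    """Insert a standard SSML pause of the given length (assumed to be supported)."""
    return f'<break time="{milliseconds}ms"/>'


# Plain text is escaped so that characters like "&" or "<" do not break the markup.
reply = (
    escape("Your balance is ")
    + emphasize("42 dollars")
    + pause(400)
    + escape("Anything else?")
)
```

Escaping the plain text matters because user names, company names, and amounts coming from back-end systems can contain characters that would otherwise be read as markup.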


7. USE PHONEMYBOT’S FEATURES TO THE MAX
PhoneMyBot provides a set of powerful features to help chatbots transition from chat to voice. They are explained in the PhoneMyBot Wiki (https://wiki.phonemybot.com/). They include features to make the conversation more fluid:
▶ a stock message that PhoneMyBot will say on its own after some time of silence,
▶ a message that PhoneMyBot will say if the speech-to-text engine does not recognize what the user says,
▶ a message said when the chatbot is late sending a reply,
▶ an offer to repeat a prompt if it’s long or confusing,
▶ a message that PhoneMyBot will say if the user says “no” to an offer to repeat the prompt.

Please look at https://wiki.phonemybot.com/key-concepts/Chatbots-service-config to learn how to configure these messages.
In addition, PhoneMyBot can perform actions triggered by the chatbot through a regular expression. In essence, when the chatbot sends a string that matches a regular expression configured in PhoneMyBot, PhoneMyBot executes the associated action. The available actions are:
▶ Speak a prompt and hang up the call
▶ Transfer the call to a telephone number associated with a queue or a person
▶ Use one of the context-aware speech-to-text recognition functions for the user’s next utterance. This is very useful to improve the recognition rate of numeric or alphanumeric strings, like social security numbers, license plates, etc. See a list of the available contexts here: https://wiki.phonemybot.com/etc/context-list.
▶ Set up special functions during a prompt, like enabling or disabling barge-in (the ability to detect what the user says while the prompt is playing, stopping the prompt and continuing the conversation), or offering to repeat the prompt for the user
▶ Substitute one text for another, for instance to quickly change a text-oriented prompt into a voice-oriented one.
See https://wiki.phonemybot.com/key-concepts/Chatbot-actions-regex to learn how to configure these functions.
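To picture how regex-triggered actions work, here is a conceptual Python sketch of the matching step. It is not PhoneMyBot’s implementation or configuration format: the trigger strings (such as `[[TRANSFER:...]]`), the action names, and the function are all hypothetical.

```python
import re

# Hypothetical trigger strings and action names, for illustration only.
ACTION_RULES = [
    (re.compile(r"^\[\[HANGUP\]\]\s*(?P<prompt>.*)$"), "speak_and_hang_up"),
    (re.compile(r"^\[\[TRANSFER:(?P<number>\+?\d+)\]\]$"), "transfer_call"),
    (re.compile(r"^\[\[CONTEXT:(?P<context>\w+)\]\]\s*(?P<prompt>.*)$"), "use_stt_context"),
]


def match_action(chatbot_message):
    """Return (action_name, parameters) for the first matching rule.

    Messages that match no rule return (None, {}) and would simply be spoken.
    """
    for pattern, action in ACTION_RULES:
        match = pattern.match(chatbot_message)
        if match:
            return action, match.groupdict()
    return None, {}
```

Under these assumptions, a chatbot that wants to hand the call to a queue would send a single message like `[[TRANSFER:+15550100]]`, while ordinary replies pass through untouched. Choosing trigger strings that can never appear in normal conversation keeps actions from firing by accident.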