Using PhoneMyBot to add voice to a chatbot

> BEST PRACTICES <

PhoneMyBot is Interactive Media’s service that allows chatbots – conversational AI applications that use text or web channels to communicate with users – to add the voice channel and have voice conversations with people quickly and easily. This could be over the phone or other voice networks. PhoneMyBot takes care of converting voice into text and text into voice, with a number of features to make this process easier and more pleasant for users, but chatbots should still be mindful of the ways voice conversations differ from text-based ones and adjust their side for a better conversational voice experience.

This document provides a number of tips for refining a chatbot’s dialog so that it adapts better to voice.

 1.   START BEING SURE OF YOURSELF

One of the reasons people install voicebots is to lighten the load on human agents who receive telephone calls from customers. There is a good chance that the voicebot will be able to handle the most common tasks, so a call should be forwarded to a human agent only if the voicebot cannot service it. So, don’t start by telling users that they can talk with a human by saying “human agent” at any time: only do that when it’s clear that the voicebot will not be able to help. Otherwise, that’s exactly what most people will do: say “human agent” immediately.

 2.  BE AS BRIEF AS POSSIBLE

The average reader can read about 300 written words per minute, but talking is slower. Podcast hosts normally speak 150-160 words per minute, as this is the optimal speed for listeners. This means that a text the chatbot would put out on a text-based channel takes roughly twice as long to be spoken as to be read! It makes sense, then, to be as brief as possible on the voice channel, avoiding preambles and, if necessary, splitting a text into two or more parts. This is also good practice on text channels, by the way; it’s just easier to get away with longer texts there.
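The arithmetic above can be turned into a quick check on the chatbot side. The following is a minimal sketch, not part of PhoneMyBot: the function names, the 155 words-per-minute midpoint and the 10-second limit are all assumptions chosen for illustration. It estimates how long a reply takes to speak and groups whole sentences into shorter parts.

```python
import re

READING_WPM = 300   # silent reading pace quoted in the text
SPEAKING_WPM = 155  # assumed midpoint of the 150-160 range quoted in the text


def seconds_to_deliver(text: str, words_per_minute: float) -> float:
    """Estimate how many seconds a text takes at the given pace."""
    word_count = len(text.split())
    return word_count / words_per_minute * 60


def split_for_voice(text: str, max_seconds: float = 10.0) -> list[str]:
    """Group whole sentences into parts that each take at most
    max_seconds to speak. A single sentence longer than the limit
    is kept whole rather than cut mid-sentence."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    parts, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and seconds_to_deliver(candidate, SPEAKING_WPM) > max_seconds:
            parts.append(current)   # close the current part
            current = sentence      # start a new one with this sentence
        else:
            current = candidate
    if current:
        parts.append(current)
    return parts
```

A chatbot could run its longest prompts through `seconds_to_deliver` during review and rewrite or split any that exceed the limit it considers acceptable on the phone.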

 3.  AVOID TEXT-ONLY EXPRESSIONS

In a text interaction, a chatbot could say something like: “please see below for the list of options”. Of course, this does not work for voice, besides being a bit clunky for text too. It is much better to say something like “Please choose one of:”. Be on the lookout for this sort of thing.

 4. AVOID WEB ELEMENTS

Chatbots act on text channels, but also on web pages. An interaction on a web page can be richer than mere text: users can be presented with buttons to click, forms, pictures, even videos. Needless to say, these elements don’t translate well into voice. In this sense an interaction through PhoneMyBot is similar to one that uses a more text-oriented channel like WhatsApp, but even more constrained. So, the recommendation is to avoid all visual elements if possible. Note that PhoneMyBot has settings that allow substituting a string coming from the chatbot with another pre-defined string. For instance, these can be used to look for a particular URL string (in the form https://some-URL) and substitute it with a phrase like: “please see our website”. This can be useful, but the best approach is to write voice prompts that need no substitutions at all.
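To illustrate the effect of such a substitution, here is a minimal sketch in Python. It is not PhoneMyBot’s configuration format (substitutions are set up in PhoneMyBot’s own settings); the pattern, the function name and the default phrase are assumptions. The same idea could also be applied on the chatbot side before a message is sent to the voice channel.

```python
import re

# Matches http:// or https:// URLs, leaving a trailing punctuation mark
# (such as the full stop that ends a sentence) outside the match.
URL_PATTERN = re.compile(r"https?://\S+?(?=[.,;:!?]?(?:\s|$))")


def replace_urls_for_voice(text: str, spoken: str = "please see our website") -> str:
    """Replace every URL in the text with a phrase that can be spoken."""
    return URL_PATTERN.sub(spoken, text)
```

A URL read aloud character by character is of no use to a caller, which is why a fixed spoken phrase is a reasonable fallback when the chatbot’s text cannot be changed.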

 5.  EXPAND THE EXPECTED UTTERANCES

It is easier to guide users to type something than to say something. People use many more turns of phrase when speaking than when writing, so the conversational AI engine must be prepared for more variety. This can be done incrementally: conduct reviews of past interactions, focusing on sentences that were not understood, and see if the conversation can be improved by adding synonyms, turns of phrase and idiomatic expressions to the chatbot’s knowledge base.

 6. USE PHONEMYBOT TAGS TO MAKE THE CONVERSATION MORE LIVELY

Text-to-speech services provide ways to enrich their voice output so that it mimics the way people speak, using emphasis, volume, pronunciation etc. PhoneMyBot supports a simplified subset of SSML (Speech Synthesis Markup Language) and takes care of implementing these directions with the TTS services it uses. Chatbots can sound more natural to users by adding tags to their text, which are then interpreted and managed by PhoneMyBot.
Please see https://wiki.phonemybot.com/en/markup-languages/SSML-support for details on the supported tags and how to use them.
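As an illustration of adding tags to a reply, the sketch below builds a tagged string in Python. It uses the `emphasis` and `break` elements from the standard SSML specification; whether PhoneMyBot’s simplified subset accepts these exact tags and attributes is an assumption, so check the wiki page above before relying on them. The helper names are hypothetical.

```python
from xml.sax.saxutils import escape


def emphasize(text: str, level: str = "strong") -> str:
    """Wrap text in a standard SSML emphasis element.
    Assumption: the tag is part of the subset PhoneMyBot supports."""
    return f'<emphasis level="{level}">{escape(text)}</emphasis>'


def pause(milliseconds: int) -> str:
    """Return a standard SSML break element of the given length.
    Assumption: the tag is part of the subset PhoneMyBot supports."""
    return f'<break time="{milliseconds}ms"/>'


# A reply that stresses the amount and pauses before the follow-up question.
reply = (
    "Your balance is " + emphasize("42 euros") + "."
    + pause(400)
    + " Is there anything else I can help you with?"
)
```

Escaping the text with `escape` keeps characters such as `&` and `<` in the chatbot’s own wording from being mistaken for markup.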

7. USE PHONEMYBOT’S FEATURES TO THE MAX

PhoneMyBot provides a set of powerful features to help chatbots transition from chat to voice. They are explained in the PhoneMyBot Wiki (https://wiki.phonemybot.com/). They include features to make the conversation more fluid:

a stock message that PhoneMyBot will say on its own after some time of silence,

a message that PhoneMyBot will say if the speech-to-text engine does not recognize what the user says,

a message said when the chatbot is late sending a reply,

an offer to repeat a prompt if it’s long or confusing,

a message that PhoneMyBot will say if the user says “no” to an offer to repeat the prompt

Please look at https://wiki.phonemybot.com/key-concepts/Chatbots-service-config to learn how to configure these messages.

In addition, PhoneMyBot can perform actions triggered by the chatbot through a regular expression. In essence, when the chatbot sends a string that matches a regular expression configured in PhoneMyBot, PhoneMyBot executes the associated action. The available actions are:

Speak a prompt and hang up the call

Transfer the call to a telephone number associated with a queue or a person

Use one of the context-aware speech-to-text recognition functions for the next utterance that the user says. This is very useful to improve the recognition percentage of numeric or alphanumeric strings, like social security numbers, license plates, etc. See a list of the available contexts here: https://wiki.phonemybot.com/etc/context-list.

Set up special functions during a prompt, like enabling or disabling barge-in (the ability to detect what the user says while the prompt is playing, stopping the prompt and continuing the conversation), or offering to repeat the prompt for the user

Substitute a text with another one, for instance to quickly change a text-oriented prompt into a voice-oriented one.

See https://wiki.phonemybot.com/key-concepts/Chatbot-actions-regex to learn how to configure these functions.
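The mechanism described above, where a message from the chatbot is matched against configured regular expressions and the first match selects an action, can be sketched as follows. This is a conceptual illustration only: the marker strings, the action labels and the rule list are all hypothetical, and the real rules are configured in PhoneMyBot as described on the wiki page above.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActionRule:
    pattern: re.Pattern  # regular expression tested against the chatbot's message
    action: str          # hypothetical label for the action to execute


# Hypothetical rules: the markers and labels are invented for illustration.
RULES = [
    ActionRule(re.compile(r"^\[\[HANGUP\]\]"), "speak_and_hang_up"),
    ActionRule(re.compile(r"^\[\[TRANSFER\]\]"), "transfer_call"),
    ActionRule(re.compile(r"license plate", re.IGNORECASE), "use_alphanumeric_context"),
]


def match_action(chatbot_message: str) -> Optional[str]:
    """Return the action of the first rule whose pattern matches,
    or None when the message is an ordinary prompt."""
    for rule in RULES:
        if rule.pattern.search(chatbot_message):
            return rule.action
    return None
```

With rules of this kind the chatbot needs no separate control channel: a conventional marker or phrase inside its normal text output is enough to trigger a transfer or a change of recognition context.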
