Written by Livio Pugliese
February 14, 2022

WhatsApp lets people record and send voice messages. What does it mean for the chatbot customer experience?
Like most Europeans – well, I should say most people in the world – I am a WhatsApp user. WhatsApp has more than 2 billion users worldwide, about a quarter of all humans. And although WhatsApp’s penetration in the United States is lower than in most places, if you are a foreign-born US resident who wants to keep in touch with friends and family back home, like me, WhatsApp is THE app to use.
WhatsApp offers chats, voice calls, and video calls, either one-on-one or among ad-hoc or organized groups. It also has a business offering that lets companies be messaged or called on WhatsApp, so they can be where their customers are.
This feature was introduced in 2018 and is being used more and more: people appreciate using the same app to communicate with individuals and companies, and many telecommunications vendors resell WhatsApp business numbers and the services that come with them.
While I am a member of a couple of organized groups, I mostly use the app to message my friends or call them directly, rarely involving more than one person at a time. But I noticed a funny thing: some of my friends have stopped sending chat messages altogether. Instead, they use another feature of the app that lets you record a voice message and send it in a conversation. I prefer to type and let the autocompletion feature on my smartphone work its magic, especially since listening to a voice message is certainly less immediate than reading a short text. But I can see several reasons for preferring to send a voice recording.
For instance, you may be on the go, without the time and place to type. Or you may have trouble seeing the phone keyboard, either because of light conditions or because you can’t see very well (I certainly have problems typing without my reading glasses; I am at that stage of life). You may want to be more expressive using your tone of voice: spoken communication is much better than text at conveying feelings. Or you may not be comfortable writing in general – or the person on the other side may have problems reading. For all these reasons, and possibly others that I can’t think of, sending voice messages instead of typing is on the rise.
And this is fine, as long as you communicate with a human who speaks the same language as you. But there is a special use case that is completely destroyed by this habit: communicating with a chatbot. You see, businesses that use WhatsApp to communicate with their customers via text messages often employ chatbots, automatic “conversational AI” attendants that use natural language capabilities to converse with people, understand the reason for the interaction and help them in a more efficient and cheaper way than having a human customer representative on the line the whole time. Except that chatbots can understand WRITTEN communication, and not voice recordings.
Instead, more and more chatbots that connect with WhatsApp receive recorded voice messages. In this case there are two possibilities: either the chatbot recognizes that it cannot access the message and drops the session, or it transfers the session to a human agent who listens to the message, researches the answer, and writes back. The first case of course leads to an awful customer experience, the second to a substantial increase in costs, as the human agent is doing the job that the chatbot could do and has to listen to sometimes long and rambling messages to extract their meaning.
What is there to do? Interactive Media, the company where I work, has launched PhoneMyBot, a service that provides an alternative, cheaper and far more elegant solution to the problem. PhoneMyBot was born to expand the channels available to chatbots to include voice channels. It provides a telephone network interface, along with other voice integrations, transcribing the users’ utterances and sending them to the chatbot, and receiving text in return from the chatbot, transforming it into speech, and sending it back to the user over the voice network. PhoneMyBot is completely cloud-based, and also integrates with a number of contact center suites to transfer the call to a human agent if necessary.
In addition, PhoneMyBot integrates with WhatsApp to receive a recorded voice message in a set language from a chatbot, transcribe it, and send it back to the chatbot as text. All the chatbot has to do is communicate with PhoneMyBot’s WhatsApp number to set the language, send the voice file, and receive the transcription. PhoneMyBot also exposes a standard HTTPS-based API for that, which the chatbot can use with a small development effort.
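To make the flow concrete, here is a minimal sketch of how a chatbot integration might route typed and recorded messages. It is an illustration under stated assumptions, not PhoneMyBot's actual interface: `IncomingMessage`, `handle_incoming`, and the `transcribe` callable are hypothetical names, and the transcription step is replaced by a stand-in so the example runs without any network access.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class IncomingMessage:
    """A simplified WhatsApp message as a chatbot integration might see it."""
    sender: str
    text: Optional[str] = None      # set for typed messages
    audio: Optional[bytes] = None   # set for recorded voice messages


def handle_incoming(
    message: IncomingMessage,
    chatbot_reply: Callable[[str], str],
    transcribe: Callable[[bytes, str], str],
    language: str = "en-US",
) -> str:
    """Return the chatbot's text answer for a typed or a recorded message."""
    if message.text is not None:
        user_text = message.text
    elif message.audio is not None:
        # Hypothetical call-out: in the setup described above, this step
        # would be delegated to a transcription service such as PhoneMyBot.
        user_text = transcribe(message.audio, language)
    else:
        raise ValueError("message has neither text nor audio")
    return chatbot_reply(user_text)


# Stand-ins so the sketch runs offline; a real integration would call
# the transcription service and the chatbot platform here.
def fake_transcribe(audio: bytes, language: str) -> str:
    return audio.decode("utf-8")


def echo_bot(text: str) -> str:
    return f"You said: {text}"


typed = IncomingMessage(sender="user-1", text="Where is my order?")
spoken = IncomingMessage(sender="user-1", audio=b"Where is my order?")

print(handle_incoming(typed, echo_bot, fake_transcribe))
print(handle_incoming(spoken, echo_bot, fake_transcribe))
```

The point of the design is that the chatbot itself never changes: a voice message is turned into text before it reaches the chatbot, so both kinds of message follow the same path from there on.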
It may be that the primary reason some people use WhatsApp’s recorded voice messages feature is that they have difficulties reading and writing. You may think this is a problem of the past, overcome now everywhere. But not so fast. The latest figures for United States residents put the non-literacy rate at about 1%. The US is in the middle of the pack here: China (3%), Brazil (7%), India (25%) fare a lot worse. (See https://www.macrotrends.net/countries/ranking/literacy-rate for a complete list.) The figure for people who have basic literacy but are uncomfortable reading and writing is likely much higher. So, this is a real possibility.
In addition, PhoneMyBot can convert the text received from the chatbot to speech (with a choice of voices) and send it back to the chatbot to attach to the WhatsApp response message. This way, users who would like to conduct the complete conversation with recorded messages can receive the chatbot’s answer on their preferred channel.
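A chatbot could use this to mirror the user's choice of channel: answer voice with voice and text with text. The sketch below shows one way to do it, assuming a hypothetical `synthesize` callable in place of the real text-to-speech request. `Reply`, `build_reply`, and the voice name are illustrative and not part of any documented API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Reply:
    """What the chatbot sends back over WhatsApp."""
    text: str
    audio: Optional[bytes] = None   # attached only for voice conversations


def build_reply(
    answer_text: str,
    user_sent_voice: bool,
    synthesize: Callable[[str, str], bytes],
    voice: str = "default",
) -> Reply:
    """Attach synthesized speech when the user wrote in with a voice message."""
    if user_sent_voice:
        # Hypothetical call-out to a text-to-speech service.
        return Reply(text=answer_text, audio=synthesize(answer_text, voice))
    return Reply(text=answer_text)


# Stand-in synthesizer so the sketch runs offline: it returns tagged bytes
# instead of real audio.
def fake_synthesize(text: str, voice: str) -> bytes:
    return f"[{voice}] {text}".encode("utf-8")


voice_reply = build_reply("Your order ships today.", True, fake_synthesize)
text_reply = build_reply("Your order ships today.", False, fake_synthesize)

print(voice_reply.audio is not None)
print(text_reply.audio is None)
```

Keeping the text in the reply even when audio is attached is a deliberate choice in this sketch: it leaves a readable record of the conversation for users who can read it.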
Sometimes useful features in products and services have unintended consequences. I am sure that when WhatsApp introduced its voice message feature, the company was thinking of human-to-human communications only, and for this use case it is a great alternative. But it breaks other use cases, like human-to-machine interactions. Fortunately, PhoneMyBot is there to fix it.
You can try PhoneMyBot’s WhatsApp message transcription right now. To get started, scan the code below, open WhatsApp on your phone, and send the word “start” as your first message. If you type “help”, PhoneMyBot sends you details on how to use the service.

Other Articles
Multimodal interactions: are they breaking through?
Last week I watched a webinar and demo by a company providing tools and solutions for conversational customer service. Interactive Media, where I work, is in the same sector and I wanted to scoop out a competitor, see what they have and how they are presenting their...
PhoneMyBot and ChatGPT: giving voice to AI
Talking with ChatGPT over the phone is cool. But can we make it useful? Everybody is talking about Open AI’s ChatGPT, at least among the tech people worldwide. It’s the first large language model chatbot to make a splash, and what a splash! It landed with the energy...
The future of intelligent voice
As the market for smart speakers falters, what are the Big Three (Amazon, Apple, Google) going to do? Alexa, should I bring an umbrella out tomorrow? This is a question that owners of smart speakers have been asking since 2013, the year when Amazon released its first...