Written by Interactive Media
August 13, 2021

Chatbots converse with people in natural language, and they have proliferated extraordinarily in the past few years. They started as little windows on websites where users could type what they were looking for and get the information directly, instead of navigating the whole site to find it.
This is certainly a worthwhile mission, but chatbots have expanded beyond it: they now mine databases to present personalized results and perform mission-critical activities like booking and confirming appointments.
But the past few years have also seen the explosion of mobile messaging services, which are now an integral part of (almost) everyone’s life. They range from simple one-on-one text messages (SMS), to multimedia messages sent to multiple recipients, to platforms that straddle the divide between messaging and social networks, like WhatsApp, Viber, Telegram, and Facebook Messenger. The advantages of these services are clear: they are software-only and reach users on a device that’s always with them, they are free or almost free, they offer multimedia capabilities, and writing texts is faster and more flexible than calling. Even though it is less common in the USA, WhatsApp (owned by Facebook) is currently the biggest mobile messaging app in the world, with about 2 billion users and about 100 billion messages sent per day.
And so, people send and enjoy messages at an ever-increasing rate. Of course, chatbots are also in the mix, following their audience to the channels they use. This way, people can get services from chatbots on their favorite messaging app, just as if they were messaging friends.
Chatbots work on text, and all messaging applications are based on text. These applications also support pictures and videos, which are transferred as text-based links that the app follows to retrieve the content or attachments. A chatbot can connect to any messaging platform whose API allows it, either by simulating a mobile device or by implementing a business endpoint.
All good then? Not completely. Some messaging platforms let users record a voice message and send it instead of (or together with) a text message. This is becoming more and more common: people on the move may not want to stop and type, while recording a brief message is fast and easy. It is also more personal: you can say a lot more with your tone of voice than with text and emojis. People also appreciate hearing their friends’ voices more than just reading what they write.
But not chatbots. For them, a recorded voice message in a text exchange means the end of the conversation: they are not (in general) equipped to receive a voice file and transcribe it into text to feed to the conversational AI engine that drives the conversation. The alternative, which can be used in high-value conversations like sales or customer support, is to hand the interaction over to a human agent who listens to the voice message and replies, taking over the exchange with the user. But this is expensive, as it requires the organization to staff enough humans to pick up failed bot conversations in addition to conducting their normal business.
Even worse would be for human agents to simply listen to the message and transcribe it so it can be passed back to the chatbot: this would be an impossibly dull and menial job, likely to lead to massive turnover.
What is needed is a service that transcribes voice recordings and returns the text to chatbots accurately and quickly.
PhoneMyBot from Interactive Media provides such a service. PhoneMyBot is dedicated to expanding the chatbots’ realm to voice, be it from the telephone network or any other channel. For the telephone channel, PhoneMyBot must transform live voice from a user into text and text from the chatbot into voice, in several languages and with a selection of the best speech-to-text service for the job. The same capability enables PhoneMyBot to spot-transcribe recorded messages.
A crucial point is to make it very easy for chatbots to submit a recorded voice message for transcription. PhoneMyBot exposes a RESTful API for this, supporting numerous encodings and formats for the voice file. Considering that most users are on WhatsApp, and so chatbots use this channel too, PhoneMyBot also provides a WhatsApp-enabled number for access. Chatbots can send a message with the voice file to PhoneMyBot and receive the transcription as the response.
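To make the flow concrete, here is a minimal Python sketch of how a chatbot might hand a voice message to a transcription service over REST and then continue the conversation as usual. It is only an illustration under stated assumptions: the endpoint URL, the `language` query parameter, the `text` field in the JSON response, and the shape of the incoming `message` dictionary are all hypothetical placeholders, not PhoneMyBot's documented API.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable

# Hypothetical endpoint: a placeholder, not the real PhoneMyBot URL.
TRANSCRIBE_URL = "https://transcription.example.invalid/v1/transcriptions"


def build_transcription_request(
    audio: bytes, mime_type: str, language: str = "en"
) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the raw voice file."""
    query = urllib.parse.urlencode({"language": language})  # assumed parameter
    return urllib.request.Request(
        f"{TRANSCRIBE_URL}?{query}",
        data=audio,
        headers={"Content-Type": mime_type},
        method="POST",
    )


def transcribe_over_http(audio: bytes, mime_type: str, language: str = "en") -> str:
    """Send the voice file and return the transcription text."""
    request = build_transcription_request(audio, mime_type, language)
    with urllib.request.urlopen(request, timeout=30) as response:
        # Assumed response shape: {"text": "..."}
        return json.load(response)["text"]


def handle_message(
    message: dict,
    bot_reply: Callable[[str], str],
    transcribe: Callable[[bytes, str], str],
) -> str:
    """Route a message to the bot, transcribing it first if it is a voice note."""
    if message["type"] == "voice":
        text = transcribe(message["audio"], message["mime_type"])
    else:
        text = message["text"]
    return bot_reply(text)
```

Passing the transcriber in as a parameter keeps the conversational engine unchanged: the bot only ever sees text, and `transcribe_over_http` can be swapped for a stand-in function during testing.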
With this feature, we at PhoneMyBot believe we have given a definitive answer to the recorded voice message dilemma.