Voice, gender and character: how virtual assistants get a personality

Modern voice assistants increasingly resemble people. How are virtual assistants given character traits, and how does this help business?

About the expert: Igor Kalinin, founder of TWIN.

Digital Trust

More than 3 billion people regularly use voice assistants, and 500 million use Google Assistant alone. Siri, Alexa, Bixby, Alice, and thousands of other assistants and service bots have come a long way in the last 5-10 years. Synthetic-sounding voices and rigid, monotonous scripts are a thing of the past; live dialogue, branching scripts, and more advanced speech synthesis and recognition algorithms are coming to the fore.

IT giants are investing billions of dollars in voice assistants, and companies are ready to spend millions every month developing and improving smart systems for communicating with customers, because basic settings no longer satisfy them. A modern voice assistant, whether in the B2B or B2C segment, is as realistic a simulation of a human interlocutor as possible, with genuine intonation, variability, and, most importantly, a set of unique traits.

In recent years, even service bots that solve business problems have begun to receive personal characteristics. Of course, they do not try to tell jokes or stories as Siri does, but they can show empathy or, conversely, firmness. Psychologists, linguists, storytellers, and professional screenwriters are brought in to work on the scripts, and assistants are increasingly given a name, a recognizable voice, and sometimes a visual image.

But why do brands need this?

  • Personalized service helps a company stand out from competitors and gain an advantage in the market;
  • Most companies use bots as a tool for selling other goods and services, so it is important to create a product that users will want to return to again and again;
  • The most important factor is trust. A bot with a name is more trustworthy than a generic “Virtual Assistant”. If it can also joke and has its own preferences (even imaginary ones), this helps build a friendly relationship between the client and the company. Research shows that users who see human traits in a bot are usually more satisfied with the service;
  • Voice assistants are one of the most visible and accessible forms of AI that users interact with. A robot that realistically imitates human speech wins the user over. We have seen this in our own experience of building bots for business. The best development strategy turned out to be using ready-made scenarios, for example successful dialogues between a real operator and a customer. Our task is to annotate the reference dialogue and turn it into a dataset for training the neural network. This method makes it possible to create believable service bots for any industry, from banking to logistics.

Bot Personality Elements

  • Name

It is difficult for people to communicate with impersonal interlocutors: customers want to picture who exactly they are talking to, especially over the phone or in a messenger. A bot's name is remembered better than an abstract number or code. The name becomes synonymous with the brand and at the same time creates the illusion of friendly communication, which makes people more receptive. People often hang up when they realize they are talking to an impersonal bot. An assistant that introduces itself is more likely to be heard.

For a business, an AI is an employee like any other, only virtual, so it must have a name. For example, we developed an assistant for the digital office of IC URALSIB Insurance. Its task is to remind clients about payments and act as an interactive advisor for the company's clients and partners. In this case, it was important to give the bot not only a first name but also a surname, and this is how Oksana Sokolova appeared. When the concept of the digital office was approved, the assistant was renamed: to soften the image, the surname Sokolova was replaced with a gentler one, Solovyova. The company also developed a prototype avatar for the virtual advisor.

Oksana Sokolova (Photo: vc.ru)

Naming voice services is a science in itself. Most development companies rely on simple, clear names that are easy to pronounce and type. It is also important for a business that the bot's name is memorable but does not evoke negative associations. For this reason, brands most often choose neutral names such as Anya, Oleg or Oksana.

However, not everyone uses names. Google, for example, deliberately left Google Assistant without a personal name: the company wanted the virtual assistant to be associated with the Google product line while remaining as neutral as possible.

  • Voice and gender

At launch, most voice assistants had only a female voice, for which their developers were often criticized. UNESCO experts noted that this reinforces gender prejudices and casts women in a service role.

The problem also needs to be looked at from the developer's point of view. Studies show that people of either sex recognize female voices better. In addition, most text-to-speech systems are trained on recordings of female speech, so creating a female assistant is easier than creating a male one.

In some cases, services are given celebrity voices, again for marketing purposes. Moreover, you do not need to invite a star to the studio – all the work is done by algorithms. For example, WaveNet neural network models helped Google create an imitation of the voice of singer John Legend.

Interestingly, the range of voices has recently been growing, and it is no longer limited to female and male. Google, for example, labels its voice options not by gender but by color, and the company has more than 11 different voices in English alone. Gender-neutral bots are also emerging, such as Q, which speaks in a frequency range that makes it difficult to identify gender.

For bots, a lot depends on the industry in which they are used. If a bank needs to remind a client about a debt, it may use a male voice. If a courier service is offering the customer a discount, the virtual operator will traditionally be given a female voice. Tinkoff Bank, for example, claims that people are more willing to discuss financial topics with men than with women. The team therefore created an assistant named Oleg, a man aged 25-40.

  • Character

Character is perhaps the most complex element of any voice assistant. Developers try to find a balance between facelessness and individuality, so that the assistant can maintain a lively dialogue but does not express its own opinion on controversial issues and remains neutral.

To do this, a whole pool of specialists is involved in working on scripts. For example, psychologists work on wording in difficult situations, especially if the interlocutor shares gloomy thoughts with the bot. Conversation designers teach the bot to maintain dialogue and seem natural, while comedians create a semblance of a sense of humor.

However, such characteristics are usually needed only by virtual assistants whose interaction scenarios are constantly changing. When developing a service AI for a company, it is important to consider which tasks the company solves with the bot and which process it wants to optimize.

Most often, a business wants to make as many calls as possible in a short period of time, so the priorities are savings and efficiency. For service bots, jokes are more likely to get in the way, because they complicate the dialogue and slow down call handling. Cutting a dialogue from 1 minute to 50 seconds can save a lot of resources. Empathy is not always appropriate either: for a bank assistant it is not so important to show sympathy, while a medical bot cannot do without it.
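To make the savings concrete, here is a rough back-of-the-envelope sketch. Only the 60-second and 50-second durations come from the text; the volume of 100,000 calls per month is a made-up figure used purely for illustration.

```python
def hours_saved(calls: int, old_seconds: int, new_seconds: int) -> float:
    """Total line time saved, in hours, when every call gets shorter."""
    return calls * (old_seconds - new_seconds) / 3600


# Hypothetical volume: 100,000 calls per month (assumed for illustration).
saved = hours_saved(calls=100_000, old_seconds=60, new_seconds=50)
print(f"{saved:.1f} hours of line time saved per month")
```

Under these assumed numbers, ten seconds per call adds up to roughly 278 hours of line time a month.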

Using ready-made scenarios, or reference dialogues, usually simplifies the task and helps to quickly create targeted solutions tailored to each company's needs. Although it is now common in the market to build bots from scratch, this is not the most effective strategy. Practice shows that it is better to record a hundred real dialogues, choose the two best scenarios among them, and then reproduce those with a neural network.

At TWIN, we always rely on references: we transcribe the conversations with the highest conversion rate, annotate the data, and then use the resulting base to train the neural network and synthesize speech. This is how realistic service bots are created. In the future, this process could be automated: just upload an exemplary reference dialogue to the system and click “Convert”.
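The reference-based workflow described above might look roughly like this in code. This is only a sketch under assumptions: the `Dialogue` structure, the `conversion_rate` field, the speaker labels, and the sample calls are all hypothetical, not TWIN's actual data model, and the training and speech synthesis steps are left out.

```python
from dataclasses import dataclass


@dataclass
class Dialogue:
    dialogue_id: str
    conversion_rate: float         # hypothetical score attached to each recorded call
    turns: list[tuple[str, str]]   # (speaker, text); speaker is "operator" or "client"


def pick_references(dialogues: list[Dialogue], top_n: int = 2) -> list[Dialogue]:
    """Keep the top_n dialogues with the highest conversion rate."""
    return sorted(dialogues, key=lambda d: d.conversion_rate, reverse=True)[:top_n]


def to_training_pairs(dialogue: Dialogue) -> list[dict[str, str]]:
    """Pair each client line with the operator reply that directly follows it."""
    pairs = []
    for (prev_speaker, prev_text), (speaker, text) in zip(dialogue.turns, dialogue.turns[1:]):
        if prev_speaker == "client" and speaker == "operator":
            pairs.append({"source": dialogue.dialogue_id, "client": prev_text, "operator": text})
    return pairs


# Made-up transcripts, for illustration only.
calls = [
    Dialogue("a", 0.12, [("operator", "Hello, this is Oksana."), ("client", "Hi."),
                         ("operator", "Your payment is due on Friday.")]),
    Dialogue("b", 0.31, [("operator", "Good afternoon!"), ("client", "Who is this?"),
                         ("operator", "Oksana, your insurance advisor.")]),
    Dialogue("c", 0.05, [("operator", "Hello."), ("client", "Not interested.")]),
]

references = pick_references(calls, top_n=2)
dataset = [pair for d in references for pair in to_training_pairs(d)]
```

Keeping the source dialogue ID on every pair makes it possible to trace a bot reply back to the real conversation it was modeled on.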

A service bot usually does not need a well-developed personality either. It is more important to add emotional coloring. Dialogue designers can use different techniques for this: for example, they can draw on Carl Jung's theory of archetypes or give the bot a particular character accentuation. Much also depends on the interlocutor, which is why we, for example, read the client's mood. The bot or operator changes tactics depending on whether the person is positive, negative or neutral.
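One simple way to picture the mood-dependent switch is a lookup from detected mood to response style. The sketch below assumes the mood label has already been produced by some upstream classifier (not shown); the tactic names are invented for illustration and are not taken from any real product.

```python
from enum import Enum


class Mood(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


# Hypothetical tactics; a real project would define these with dialogue designers.
TACTICS = {
    Mood.POSITIVE: "friendly",    # keep the upbeat tone
    Mood.NEGATIVE: "empathetic",  # acknowledge the problem before moving on
    Mood.NEUTRAL: "concise",      # get to the point quickly
}


def choose_tactic(mood: Mood) -> str:
    """Return the response style for the detected client mood."""
    return TACTICS[mood]
```

For example, `choose_tactic(Mood.NEGATIVE)` selects the "empathetic" style. Using an enum rather than free-form strings means a misspelled mood label fails immediately instead of silently falling through.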

How the client perceives the assistant depends on many small factors that may seem insignificant from the outside, such as the correct stress in the client's first name and surname when addressing them. This complicates speech synthesis, because the data has to be checked against reference databases and the stress verified for each part of the full name. But such details play an important role: one mistake is enough for the client to instantly lose confidence in the bot.
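The stress check can be pictured as a dictionary lookup per name component, with unknown names flagged for review instead of guessed. The tiny dictionary, the "+" marker placed before the stressed vowel, and the function name are assumptions made for this sketch; real speech synthesis engines each use their own markup.

```python
# Hypothetical stress dictionary: "+" precedes the stressed vowel (notation assumed here).
STRESS_DICT = {
    "oksana": "Oks+ana",
    "sokolova": "Sokol+ova",
}


def mark_stress(full_name: str) -> tuple[str, list[str]]:
    """Return the name with stress marks plus any parts missing from the dictionary."""
    marked, unknown = [], []
    for part in full_name.split():
        entry = STRESS_DICT.get(part.lower())
        if entry is None:
            unknown.append(part)  # do not guess: flag for manual review
            marked.append(part)
        else:
            marked.append(entry)
    return " ".join(marked), unknown
```

Returning the list of unknown parts lets the calling code decide what to do with them, for instance falling back to a greeting that avoids the name entirely.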

The Future of the Digital Personality

Despite all the experiments, bots are still far from perfect: they still cannot work autonomously, performing the tasks of a secretary and a personal assistant. Parity between artificial intelligence and a person at the conversational level has not yet been achieved, and creating a virtual interlocutor based on a real one is still a difficult task that many companies are trying to solve.

In the future, datasets will clearly expand, and it will become easier to train neural networks. A couple of years ago, Facebook researchers released Persona-Chat, a dataset of more than 160,000 conversational snippets from real people, which the company plans to use to train its models.

So far, the primary task of assistants and bots is to help people. To do this, they must understand not only speech but also the emotions and intentions of the interlocutor, and solve the problems of the client or of the business itself. Personal characteristics that improve performance will continue to evolve. But creating realistic digital copies of people for business purposes does not make sense yet. A bot exists to perform a specific function, and this is its fundamental difference from a person.

