Voice-based authentication is the biometric identifier of the future

Speech can also be used for biometric identification

Dr. Péter Rónay • Jun 8, 2021

In recent years, communicating with voice interfaces has become a normal part of our lives.
In fact, we already use numerous voice-activated services, from identity verification in customer service channels and mobile banking to asking questions of digital assistants like Alexa or Google Assistant.

Biometric identification is one of the most important security solutions of the 21st century, with implications reaching far beyond the world of technology, especially for mobile banking or online authentication.

As mobile banking has become as much a part of everyday life as online commerce, a very simple question is becoming more and more important:

“How do we prove that we are us?”

Although the answer seems to have been given long ago by various social and economic sectors, the question is different when asked by an IT professional than when asked by a bank employee or even a legislator.

And when we look at biometrics, a specific area of authentication, the question is modified in this way:

“What physical characteristics can we use to prove that we are us?”

The question may seem innocent at first glance, but in fact it poses serious dilemmas. It raises IT, financial, legal and even philosophical problems, which is why even the largest companies are devoting considerable resources to it.

Think of the number of cases every day in which customer service staff have to use methods of dubious reliability to identify a customer – because anyone can obtain the data they are asked for.

Fintech companies and the banking industry in general also work daily to increase accuracy and prevent misuse, among other things. After all, what happens if a biometric identification fails? What secondary authentication solutions can guarantee security?

More and more companies are emerging in the market. Leading companies include Nuance, Verint, Pindrop and Aware Biometrics in the US, NICE in Israel and Spitch in Switzerland. Of these, Nuance is perhaps the most famous, as voice biometrics is “just” one of the many technological directions in which they have considerable experience.

Spitch, headquartered in Switzerland, has been a global provider of B2B and B2C conversational AI solutions since 2014, helping businesses with natural language processing (NLP), artificial intelligence (AI) and machine learning.

Aware Biometrics is another major pioneer in voice-based identification technology. Its identity verification and management solutions support financial services, corporate security, healthcare, human resources, people identification, border management, law enforcement, defence and intelligence.

In the following, we will focus on a specific sub-area of biometric identification, voice-based identification, and briefly examine whether voice can also be used for banking-level identity verification.

What is biometrics? What is biometric identification?

It’s time to define exactly what we consider biometrics to be: human physical or behavioural characteristics that can be used to digitally identify a person to gain access to systems, devices or data.

Examples of biometric identifiers include fingerprints, facial patterns, voice patterns or even gait. Each of these identifiers is considered unique and can be used in combination to ensure greater accuracy of identification.

Why is biometric proof of identity necessary?

The question is a valid one, as humanity has invented many ways of proving that someone is who they claim to be. Authentication has been relevant since the beginning of history: for as long as human communities have had dealings with one another, it has mattered how to prove that one belongs to a particular group, or that people are who they say they are.

Tribal body painting and tattooing can be a good way to do this, but it is easy to see the weaknesses of this solution in the modern age, when anyone can access any kind of paint and pattern.

There are efforts to implant identification chips under the skin, but for reasons of privacy, they are more commonly used in veterinary medicine.

For roughly three thousand years, certificates, seals and signatures have provided a steady livelihood for clever forgers, and caused serious headaches for historians trying to verify the authenticity of each document.

Identification can be used as proof of identity as well as for material transactions

It is no coincidence that modern identity documents, banknotes and official documents are increasingly protected against forgery. In essence, the same process is at work in all cases.

Individuals, authorities and banks want to make sure that the customer is who they say they are and that the money is genuine, while criminals try to make the counterfeit look real.

The online world has brought new problems and solutions

While in the physical world it is relatively straightforward (if cumbersome) to prove one’s identity and competence with a certified ID card, a magnetic stripe card, a card with a chip, etc., in the online space other solutions have had to be sought.

Of course, let’s not forget that printed IDs can also be forged and there is a lot of misuse of lost or stolen cards and documents!

Using passwords was a good idea decades ago

Using passwords seemed like a good idea at first. After all, when computing capacity was low and there were few places to enter passwords, it was one of the most cost-effective solutions.

But even twenty years ago, one or two passwords were not enough. So three typical problems quickly emerged:

The more places you need passwords, the more people tend to use the same one or two
The more complex a password is, the harder it is to remember, so many people use simple passwords
A lot of passwords have been leaked over the decades due to data breaches

In other words, many people’s data has been accessed because many users use weak passwords, and many of them are the same. What’s more, they are either easily guessed or obtained by unauthorised persons through data leaks.

And the average user is right in his own way: one or two passwords are memorable, but one hundred and fifty-six?

Use passwords – but that’s not enough

Despite all this, passwords are important, just not enough. Moreover, the old idea of filling them with all sorts of complicated symbols and meaningless strings has been a dead end for a long time. It’s really only good for criminals, because it encourages users to write down and store complicated passwords – or use the very easy-to-guess kind.

Logging in with your account at a major service provider (Google, Facebook, etc.) can be a workaround, but if an attacker gains access to that account, you’re in trouble. Two-factor authentication (an SMS or app-generated code) is therefore definitely recommended, but even this may not be enough when it comes to financial transactions or sensitive data.

This is how you get to the point where you can use the one thing you literally always have at hand.

Biometric identification in the present and the near future

There is a more effective – and legally highly problematic – identification solution than ever before: biometrics.

The use of biometrics can make authentication considerably faster, easier and more secure, but companies and agencies need to be even more careful with the biometric data they collect than with traditional passwords.

Special care is recommended for biometric data

Why? It’s very simple: a password or a signature is easy to change, and you can use a different password everywhere if you want. Changing DNA or fingerprints is “slightly” more difficult.

So if a company were to get our unencrypted biometric parameters, it would be like giving them open access to everything. That is why these identification systems store the data in encrypted form, so that not even the operator can deliberately access the raw data. But let’s not get ahead of ourselves.

Biometric identifiers are linked to a person’s intrinsic characteristics

They broadly fall into two main categories: physical identifiers and behavioural identifiers. Physical identifiers are mostly immutable, device-independent, and are classified into the following:

Fingerprint
Palm print
Finger vein
Palmar vein
Hand shape
Face
Iris/retina
Ear
Voice
DNA

Behavioural characteristics have been brought to the forefront by a newer type of biometric approach and are usually used in conjunction with another method due to their lower reliability.

However, as technology develops, these behavioural identifiers may become more prominent

Unlike physical identifiers, which are limited to a specific, fixed “set” of human characteristics, the number of behavioural identifiers is essentially growing as newer and newer devices appear on the market.

Consider the subtle, yet distinctive, differences in typing, mouse movement, mobile phone photography, driving a car, holding a glass, etc. Google’s No CAPTCHA reCAPTCHA can now guess whether a human visitor has arrived at a website based on the behaviour of the mouse pointer alone.

What are the benefits of biometric identification?

Basically, it has two advantages at once: it is unique and secure. If fingerprints or voice are sufficient for identification, passwords become unnecessary (though they may be retained as an extra layer of security in two-factor authentication), so there is no password left to be guessed or stolen.

With proper automation, a lot of extra services can be built on top of biometric parameters. For example, voice authentication can eliminate the need for an otherwise vulnerable procedure in any call centre when personal data is requested.

It has to be mentioned that this kind of soft identification or soft authentication will almost certainly be robotised in the near future.

Likewise, an automated office building based on facial recognition will no longer need an ID card or a fingerprint reader, and all the possibilities for abuse that can be linked to cards, for example, will be eliminated at a stroke.

How reliable is biometric authentication?

The use of fingerprints, the vascular network of the hand, the retina, facial features or, for example, unique DNA in authentication and identification processes is significantly more secure than a code received on a phone or a plastic card in the hand that stores a user access profile.

Biometric authentication can also provide real-time protection: behavioural analytics technology linked to biometric identification can detect automated threats and attacks (e.g. testing of stolen credentials, user account theft) as they happen and block them with over 99 percent accuracy.

Are there privacy risks associated with biometric authentication?

Some users – understandably – do not want various large companies to have access to information such as when and exactly where their phone is used. And this information is just the tip of the data iceberg for identification.

Biometric data is much more sensitive in nature, as it can also carry additional information about the data subjects that is not otherwise publicly available, such as their origin, gender, disease(s), genetic characteristics, etc.

If biometric technologies (e.g. DNA testing) become widespread, they could raise a whole range of privacy issues, including those related to current health status or actual genetic relatedness.

Biometrics can make users too vulnerable

If this information falls into the wrong hands, it can be used for nefarious purposes – just think of tabloid journalists in the case of celebrities, or phishing.

Moreover, the information obtained can be misused by government systems, and even by foreign powers to influence public opinion at will.

And, of course, there are those who simply do not want their family members or bosses to know their every move, or even who their children are.

With proper data protection, there is nothing to fear

At Ergomania, we understand the importance of personal data and that’s why we do everything humanly and technologically possible – as a design company, we don’t deal with biometrics, and our customers also take data protection seriously.

However, voice biometrics has an increasingly appreciated property that is much less pronounced in other types: it can be used without touch, in the dark or in poor visibility.

So it’s time to find out why the human voice has a good chance of becoming one of the primary identifiers of the twenty-first century!

What is voice biometrics and why does it work?

Voice biometrics refers to the technology or process that extracts biometric characteristics from speech, which is itself personal data, in order to confirm, verify or uniquely establish a person’s identity by their voice.

Voice as a unique identifier

A voice is individual in many respects: its rhythm, its pronunciation, the way it is formed (e.g. articulation, accent) and the acoustic pattern of the sound it produces are as distinctive as the physiological characteristics and current state of the speech organs that produce it.

Taken together, these factors allow natural persons to be identified individually with relatively high reliability. Some biometric identification systems can now perform the necessary authentication with an error rate of less than 2 percent.

How does voice become biometric?

This is possible because each person’s vocal cords are unique. Physical characteristics, both phonological and morphological, are specific to each individual, which makes a voice very difficult to counterfeit.

More than 70 body parts – each with a unique size and shape – contribute to how a person will speak. Just think how different someone’s voice will sound if they lose some teeth, or bite their tongue, or lean forward when speaking.
The four most individual characteristics of a voice are duration, intensity, dynamics and pitch.

In addition to these, there are two main approaches to voice authentication: text-independent, where authentication is performed on any spoken phrase or other speech content, and text-dependent, where the same phrase is used during registration and verification. In the latter case, the speaker cannot say whatever comes to mind, but has to say a predefined sentence.
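
The difference between the two approaches can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular vendor implements it: the voiceprint is assumed to be a plain list of numbers already extracted from the audio, the spoken text is assumed to come from a separate speech recogniser, and the names Enrollment and verify, the 0.8 threshold and the sample passphrase are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class Enrollment:
    # Hypothetical feature vector stored at registration.
    voiceprint: Sequence[float]
    # Set only for text-dependent enrollment; None means text-independent.
    passphrase: Optional[str] = None


def verify(enrolled: Enrollment,
           sample_voiceprint: Sequence[float],
           spoken_text: str,
           threshold: float = 0.8) -> bool:
    """Accept the speaker only if the voice (and, if required, the phrase) matches."""
    if enrolled.passphrase is not None:
        # Text-dependent: the predefined sentence must be spoken.
        if spoken_text.strip().lower() != enrolled.passphrase.strip().lower():
            return False
    # Both modes: the voice itself must be close enough to the enrolled one.
    return cosine_similarity(enrolled.voiceprint, sample_voiceprint) >= threshold


# Illustrative, made-up values.
text_dependent = Enrollment([0.9, 0.1, 0.4], "my voice is my password")
text_independent = Enrollment([0.9, 0.1, 0.4])
same_speaker = [0.88, 0.12, 0.41]
other_speaker = [0.1, 0.9, 0.2]

print(verify(text_dependent, same_speaker, "My voice is my password"))
print(verify(text_dependent, same_speaker, "something else entirely"))
print(verify(text_independent, same_speaker, "something else entirely"))
print(verify(text_independent, other_speaker, "something else entirely"))
```

In this sketch the text-dependent check adds a second condition on top of the voice match, which is why the same speaker is rejected when the wrong sentence is spoken.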

Advantages of voice-based identification

Voice is a unique characteristic of each person, and in most cases it is also available more quickly when personal identification is needed. We can say a single word such as “apacuka” faster than we can scan our retinas, find a fingerprint reader and put our finger on it, or look into the camera and let the facial recognition algorithm do its job.

When Ergomania started working on the voice user interface (VUI) back in 2013, it quickly became apparent to us that we were about to see, and hear about, huge changes.

In times of epidemics, when touching common interfaces (buttons, touchscreens, etc.) should be avoided (or at least involves constant hand disinfection), voice-based control is a distinct advantage.

There are, however, situations where it is worth combining it with other contactless identification methods – but in such cases the additional solution used (e.g. password entry, fingerprint verification, etc.) should not be at the expense of contactlessness!

But what if someone is hoarse/cold/tired/ill?

During the design of the VUI, it became clear that one of the critical points of the system is the accuracy of voice recognition: i.e. the tolerance with which it can identify someone by their voice.

This is generally a thorny issue for biometric identifiers: beyond the obvious abuses, there are also cases where the authentication result is a false positive or a false negative.

(A false positive or false negative is an error made by the system: a false negative incorrectly rejects a legitimate user, while a false positive incorrectly accepts someone who should have been rejected.)
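
The two error types trade off against each other, which a small Python sketch can make concrete. It assumes the system produces a similarity score for each attempt and accepts the attempt when the score reaches a threshold; the function name error_rates and all scores below are made-up illustration values, not measurements from any real system.

```python
from typing import Sequence, Tuple


def error_rates(genuine_scores: Sequence[float],
                impostor_scores: Sequence[float],
                threshold: float) -> Tuple[float, float]:
    """Return (false_accept_rate, false_reject_rate) at the given threshold.

    An attempt is accepted when its score is >= threshold.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    # False positive: an impostor attempt that is accepted.
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    # False negative: a genuine attempt that is rejected.
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))


# Hypothetical scores: attempts by the real user vs. attempts by other people.
genuine = [0.91, 0.85, 0.78, 0.95, 0.66]
impostor = [0.30, 0.55, 0.72, 0.41, 0.20]

for threshold in (0.5, 0.7, 0.9):
    far, frr = error_rates(genuine, impostor, threshold)
    print(f"threshold={threshold}: false accepts={far:.0%}, false rejects={frr:.0%}")
```

With these sample values, a lenient threshold lets more impostors through, while a strict one rejects more genuine attempts (for instance from a hoarse or tired user). Choosing the threshold is therefore a design decision rather than a purely technical one.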

What is the reason for this?

For example, a facial recognition system does not always recognise a user wearing heavy make-up or glasses that obscure the face at the identification points, or someone who is sick or tired.

Another example is the mandatory wearing of masks in the wake of the epidemic, which has made a large part of people’s faces unrecognisable. In an epidemic situation, the ideal two-factor identification therefore combines voice-based identification with another biometric method that can also be made contactless, such as retinal scanning.

There can be just as much variation in a voice depending on how the person is feeling, their health, where they are, or whether they were smoking before they were identified.

People have a different tone of voice when they are not fully awake, when they are trying to make a phone call against a noisy background, or when they are happy, angry, impatient, drunk, etc.

However, each person’s voice is so different that even when tired or hoarse, it carries these unique identifiable characteristics – for example, even if someone distorts their voice with a machine, analytical software will still recognise it.

Recognition systems alone are therefore not infallible

All it takes is, for example, a pre-recorded voice or a copy of a fingerprint, and the system can be circumvented. In other words, on its own it cannot establish identity with certainty.

However, unambiguous identification is particularly important when a fingerprint matches during an identification process but the face assigned to it does not, or when an account is accessed from an unusual location, from an unknown device or at an unusual time, especially in the case of financial transactions or password changes.

For these very reasons, experts suggest that companies add a second communication channel or secondary verification, or offer their customers and employees a complex authentication process that uses multiple authentication methods at the same time.
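
One way to picture such a layered process is a simple decision rule that asks for a second factor whenever the circumstances look unusual. The Python sketch below is only an illustration under assumptions: the LoginContext fields mirror the warning signs mentioned above, while the function name decide, the three outcomes and the tolerance of one unusual signal for non-sensitive actions are hypothetical choices, not an established policy.

```python
from dataclasses import dataclass


@dataclass
class LoginContext:
    voice_match: bool       # result of the voice biometric check
    known_device: bool      # the device has been seen before
    usual_location: bool    # the location fits the user's history
    usual_time: bool        # the time of day fits the user's history
    sensitive_action: bool  # e.g. a financial transaction or password change


def decide(ctx: LoginContext) -> str:
    """Return 'deny', 'step_up' (ask for a second factor) or 'allow'."""
    if not ctx.voice_match:
        return "deny"
    # Count how many circumstances look unusual.
    risk_signals = sum([
        not ctx.known_device,
        not ctx.usual_location,
        not ctx.usual_time,
    ])
    # Assumption: sensitive actions tolerate no unusual signal, others tolerate one.
    tolerated = 0 if ctx.sensitive_action else 1
    if risk_signals > tolerated:
        return "step_up"
    return "allow"


# Same unknown device, different outcome depending on what the user wants to do.
browsing = LoginContext(True, False, True, True, sensitive_action=False)
payment = LoginContext(True, False, True, True, sensitive_action=True)
print(decide(browsing))
print(decide(payment))
```

The point of the design is that the voice match alone never has to carry the whole decision: the riskier the context, the sooner a second method is requested.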

The voice-based authentication landscape

At Ergomania, we have been working on voice-based projects and designing solutions for years, most recently, for example, through a comprehensive series of interviews with blind and visually impaired people to design more effective VUIs.

Voice-based biometrics is intrinsically linked to voice-based user interfaces, because the ideal situation would be to be able to operate any device or system in a truly contactless way, using only voice – and voice-based identification is essential for this.

But achieving this requires serious privacy protection solutions. Of course, security technology companies have recognised this for some time: ID R&D, for example, specialises in part in detecting voice-based biometric fraud.

Thanks to these developments, misuse is slowly becoming impossible without government or global multinational company-level resources.

The accuracy of speech recognition is also improving: a joint project between HSE University and Nizhny Novgorod State Linguistic University (LUNN) has developed a new solution.

The method involves an algorithm that is resistant to noise of 10 dB or more, which can operate in real time and can have a significant impact on speech recognition, reducing the error rate to around 2%.

For example, Nuance has successfully deployed voice-based identification at several leading banks: Banco Santander has already successfully used voice biometrics to replace PINs, passwords and security questions in their automated telephone system.

And at Virginia Credit Union, voice biometrics has improved both the depositor experience and their own staff satisfaction.

What are the benefits of voice biometric identification?

It is fair to ask to what extent (if at all) voice-based identification pays for itself, i.e. what return on investment (ROI) can be expected.

At Ergomania, we have seen in our projects over the last few years that there is a medium-term return on investment, both in terms of increased efficiency and reduced losses, and we have identified seven benefits of voice biometrics:

Enhances the customer experience through fast and accurate authentication
Improves security and minimizes breaches resulting from compromised passwords, phishing, etc.
Reduces security threats by identifying known fraudsters
Allows personalization of interactions
Employees spend less time checking users and resetting passwords
Enables biometric logins through voice-based digital channels, including chatbots and virtual assistants
Can also be used as part of a two-factor authentication process to increase security

In summary, biometric identification, and within it voice biometrics, will become more and more important in the near future, especially in Fintech and banking, and in all areas where customer interaction is present and where there is a need to enhance data security.

You could say that voice biometrics speaks for itself.

About the authors

Dr. Péter Rónay

Senior blog writer

Former university lecturer, copywriter. He writes mainly human and science articles. He is comfortable in the world of smart technologies, renewable resources and green technologies.
