After targeting politicians in the UK, Slovakia and other parts of the world, audio scammers have finally made their splash in the US. They cloned President Joe Biden’s voice and turned it into a robocall, a campaign tactic that goes back to the 1970s.

Welcome to 2024, where the political voice you hear on the phone might have been conjured on the internet.

The automated phone message alarmed election experts when it went out over the weekend, playing a voice edited to sound just like Biden, and telling New Hampshire residents not to vote in Tuesday’s Democratic primary. “Save your vote for the November election,” it said, before tacking on a Biden catchphrase: “What a bunch of malarkey.”

Misinformation researchers are rightly worried about so-called audio deepfakes emerging at the start of a big election year, when roughly half the world will be casting a ballot. While fake videos and pictures are eye-catching and dramatic, fake audio clips are more dangerous. Think of them as the mosquito of misinformation. They’re small and easy to produce, tough to spot and almost impossible to track. And they can spread false information to disastrous effect. Last year, for instance, a Slovakian political party may well have lost a national election because an audio deepfake of its leader went viral two days before the vote.

Governments are well aware of the problem. Biden himself signed an executive order late last year that tries to steer how AI is developed without putting the public at risk. But the genie is already out of the bottle. There are dozens of companies offering tools to clone any voice, including your own or someone else’s, with some more strict about fakes than others.

The British AI company Synthesia, for instance, sells software for making voice and video clones of real people, often for developing corporate training videos, and forbids customers from generating political or news content. To ensure that the rules aren’t broken, a team of content moderators reviews the videos customers try to make of their clones before they’re fully generated and sent to users.

But other companies don’t police what their customers are making. Another tool called HeyGen went viral last week when someone used it to alter a Davos speech by Spanish-speaking Argentine President Javier Milei. It showed him using fluent English in his own voice while his lips matched the translated words. HeyGen, however, relies on customers to get permission to clone the voices of others, including politicians.

The video of Milei drew an appreciative audience, but it could have gone the other way had his words been misconstrued. By putting the onus on customers to be responsible, HeyGen’s technology seems more vulnerable at this point to misuse than Synthesia’s.

Some AI companies have found themselves playing a game of whack-a-mole to stop people from misusing their systems. ElevenLabs, a popular AI voice-generating service for translating audiobooks or podcasts, tightened its enforcement efforts last year after people from the web forum ‘4chan’ used it to make deepfake voices of Emma Watson, Joe Rogan and other celebrities saying racist things.

And even if all the AI companies strictly policed what audio deepfakes were made, bad actors could still turn to open-source alternatives that offer far more freedom. There are plenty of them. One of the most recent projects is a voice-cloning tool called OpenVoice from researchers at the Massachusetts Institute of Technology (MIT), Tsinghua University in Beijing, and members of AI startup MyShell. The tool allows anyone to clone voices “with unparalleled precision… using just a small audio clip,” as its creators said on X.

Little wonder that the US Federal Trade Commission recently promised a $25,000 reward for anyone who can come up with a viable solution to the problem of AI voice cloning. So far, there’s no technical fix: misinformation experts say that new software designed to distinguish cloned voices is still unreliable.

The Biden robocall spotlighted a sobering reality that was already apparent to misinformation experts, as well as to other countries and political leaders who’ve been targeted with AI-generated voices. With varying policies, remarkably lax enforcement rules from platforms like Facebook and a growing pool of free tools that scammers can use with impunity, voice cloning will grow and our institutions will have to grapple with the chaos.

For now and possibly for a long time to come, the onus will once again be on us to be more adversarial and cautious towards what we hear—even when it’s on our very own phones. ©bloomberg