In the ever-escalating cybersecurity war between hackers and the rest of us, the bad guys have a scary new tool: If they can get even a few sentences of recorded speech from someone, they can simulate that person's voice realistically and have it say anything they want.
At the moment, these voice fakes are primarily being used to fleece elderly people by generating anguished calls supposedly from children or grandchildren who are in a predicament and need several thousand dollars wired to them immediately. And it's easy to see the potential for broader abuse.
Email scams that simulate messages from senior executives routinely con employees in finance departments into wiring hundreds of thousands or even millions of dollars to a "supplier" or "customer" that is actually a hacker's bank account. That scam has happened often enough that many finance employees have learned to be careful, but now imagine if the hacker could follow up on that urgent email with a distressed voicemail message that sounds like the CFO demanding to know why the employee is being so slow.
An article in the Washington Post explains:
"Powered by AI, a slew of cheap online tools can translate an audio file into a replica of a voice, allowing a swindler to make it 'speak' whatever they type. Experts say federal regulators, law enforcement and the courts are ill-equipped to rein in the burgeoning scam. Most victims have few leads to identify the perpetrator, and it’s difficult for the police to trace calls and funds from scammers operating across the world. And there’s little legal precedent for courts to hold the companies that make the tools accountable for their use.
"'It’s terrifying,' said Hany Farid, a professor of digital forensics at the University of California at Berkeley. 'It’s sort of the perfect storm … [with] all the ingredients you need to create chaos.'"
Even a year ago, the article says, a lot of audio was needed to clone a person's voice. Now, just recording a TikTok where you talk for 30 seconds is enough to let someone clone your voice using a tool that costs as little as $5 a month.
Working from so little audio, the tool can't replicate a speaker's mannerisms or the language, such as nicknames, that the person being impersonated would use if actually making the call. But the AI-generated voice sounds so much like the real person that people can be fooled, especially in a stressful situation where speed is demanded.
Authorities say that, as usual, the best recourse for potential victims is caution: Be very suspicious of any urgent request, and always verify with the person supposedly making the request that it's really from them, whether in a personal or business setting.
Eventually, tools will be developed to help test whether a voice is being generated by AI, just as tools can now help test whether text or deepfake images come from AI... but the bad guys will probably be on to the next scam by then.
Insurers have done an increasingly good job helping cyber insurance customers be more secure in the face of unrelenting attacks by hackers -- a topic we explore in detail in this month's ITL Focus, on cyber -- but they will obviously have to stay vigilant. There's no rest for the weary.
Cheers,
Paul
P.S. The latest scam comes in the context of continual improvement in what's known as generative AI. Last week, OpenAI released GPT-4, a new version of the AI model behind ChatGPT, less than four months after it released ChatGPT and captured the world's imagination. The Atlantic does a good job of explaining what's new in an article from March 14.
The gist:
"Rumors and hype about this program have circulated for more than a year: Pundits have said that it would be unfathomably powerful, writing 60,000-word books from single prompts and producing videos out of whole cloth. Today’s announcement suggests that GPT-4’s abilities, while impressive, are more modest: It performs better than the previous model on standardized tests and other benchmarks, works across dozens of languages, and can take images as input—meaning that it’s able, for instance, to describe the contents of a photo or a chart....
"From what we know, relative to other programs, GPT-4 appears to have added 150 points to its SAT score, now a 1410 out of 1600, and jumped from the bottom to the top 10 percent of performers on a simulated bar exam."