Deepfake Scams: AI Voices and Videos Cost Millions

By Tobias Massow · March 8, 2026

In February 2024, a finance executive in Hong Kong transferred $25 million after a video call with what he believed to be his CFO and several colleagues. All participants were deepfakes. The case was not an isolated incident: Deloitte estimates that global losses from deepfake fraud will exceed $200 million in 2025 and double every 12 months.

Key Takeaways

$25 million in a single attack: Deepfake video calls with AI-generated voices and faces deceive even experienced employees (Hong Kong case, 2024).
Voice cloning in seconds: With just 3 seconds of audio, AI can clone a voice convincingly – CEO speeches, podcast appearances, and LinkedIn videos provide the material.
Procedural safeguards are the best defense: The four-eyes principle, callback verification, and out-of-band confirmation stop 95% of deepfake attacks.

How Deepfake Attacks on Businesses Work

Deepfake-enabled fraud follows a clear pattern: the attacker researches the target organization, identifies decision-makers (CEO, CFO, other executives), and gathers publicly available audio and video material. LinkedIn posts, podcasts, YouTube videos, and conference presentations provide enough material for convincing clones.

Voice cloning has the lowest barrier to entry. Tools like ElevenLabs, Resemble.AI, and open-source alternatives (TortoiseTTS, XTTS) can clone a voice from as little as 3 to 10 seconds of audio, and on a phone call the clone is practically indistinguishable from the original. The most common scenario is a call from the “CEO” requesting an “urgent transfer.”

Video deepfakes are technically more demanding, but as of 2026 they are possible in real time. Tools like DeepFaceLive enable face-swapping in video calls. The Hong Kong case demonstrated that even a video conference with multiple participants can be fully faked.

The combination of voice cloning and email spoofing is particularly dangerous: An email “from the CFO” announces a call, which comes with the cloned voice – there is no obvious reason for the recipient to doubt the authenticity.

3 sec: audio needed for voice cloning

$200 million: estimated losses in 2025

x2 per year: incident doubling

Technical Detection: What Works – and What Doesn’t

Deepfake detection is an arms race. Detection tools analyze artifacts in audio and video – unnatural lip movements, inconsistent lighting, frequency anomalies in the voice. Providers like Pindrop (audio), Reality Defender, and Intel FakeCatcher offer enterprise solutions.

The problem: Detection lags behind generation. Current deepfake models produce outputs that are indistinguishable from real recordings to the human eye and ear. Automated detection tools achieve recognition rates of 85 to 95 percent – which sounds good, but means that 5 to 15 percent of fakes slip through. This is enough for a targeted attack on a single organization.
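To make that arithmetic concrete, the following sketch computes how likely it is that at least one fake passes a detector when an attacker simply tries several times. It assumes that every attempt is caught independently with the same probability, which is a simplification; the function name and the attempt counts are illustrative and do not come from any vendor.

```python
def slip_through_probability(detection_rate: float, attempts: int) -> float:
    """Probability that at least one of `attempts` fakes evades detection.

    Assumes each attempt is caught independently with probability
    `detection_rate` (a simplifying assumption, not a measured property).
    """
    if not 0.0 <= detection_rate <= 1.0:
        raise ValueError("detection_rate must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    # All attempts are caught with probability detection_rate ** attempts;
    # the complement is the chance that at least one gets through.
    return 1.0 - detection_rate ** attempts


if __name__ == "__main__":
    for rate in (0.85, 0.95):
        for attempts in (1, 5, 10):
            p = slip_through_probability(rate, attempts)
            print(f"detection {rate:.0%}, {attempts:>2} attempts: "
                  f"{p:.1%} chance that a fake gets through")
```

Under this independence assumption, even a detector with a 95 percent recognition rate lets at least one of ten attempts through about 40 percent of the time.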

Therefore, technical detection is a layer, not the solution. The real defense lies in processes and organizational culture.

“Deepfake detection will never reach 100 percent. Organizations must design their processes so that a single deepfake – no matter how convincing – cannot cause harm.”
– Vijay Balasubramaniyan, CEO Pindrop (CES 2025)

Protective Measures: Processes That Neutralize Deepfakes

The most effective countermeasures are not technical but organizational:

Four-Eyes Principle in Financial Transactions: No transfer over 10,000 euros without approval by at least two authorized persons. No single phone or video call should trigger a payment, regardless of who it appears to come from.

Out-of-Band Verification: If a call comes from the “CEO,” it is verified through a separate channel: a callback to the known mobile number, a Signal message, or a personal visit to the office. The attacker can spoof one channel, but not all at once.

Code Words for Crisis Situations: A pre-agreed code word that must be mentioned in every urgent payment instruction. Simple but surprisingly effective – the attacker does not know the code word.

Training with Real Examples: Staff in financial departments must know that deepfake calls exist and how convincing they can be. Live demos with cloned voices of leaders (with their consent) are the most effective awareness approach.

Technical Supplement: Email authentication (DMARC, DKIM, SPF) prevents spoofing of sender addresses. Deepfake detection tools in phone systems and video conferencing solutions provide an additional layer. Reducing the publicly available audio and video material of leaders also makes voice cloning harder.
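The procedural rules above can also be enforced in a payment workflow rather than left to memory. The following is a minimal sketch of such a release gate, not a reference implementation: the class and field names are hypothetical, the 10,000 euro threshold is taken from the text, and two choices are assumptions made for illustration (the requester never counts as their own approver, and every request needs out-of-band confirmation). It requires Python 3.9 or later.

```python
from dataclasses import dataclass, field

# Threshold from the article: transfers above this need two approvers.
FOUR_EYES_THRESHOLD_EUR = 10_000


@dataclass
class PaymentRequest:
    """Hypothetical record of a payment instruction received by call or email."""
    amount_eur: float
    requested_by: str                      # who the caller claims to be
    approvers: set[str] = field(default_factory=set)
    callback_confirmed: bool = False       # confirmed via a separately known channel
    code_word_ok: bool = False             # pre-agreed code word was given
    urgent: bool = False


def release_blockers(req: PaymentRequest) -> list[str]:
    """Return the reasons a payment must not be released yet (empty list = OK)."""
    blockers = []
    # Four-eyes principle. Assumption: the requester cannot approve their own request.
    independent = req.approvers - {req.requested_by}
    if req.amount_eur > FOUR_EYES_THRESHOLD_EUR and len(independent) < 2:
        blockers.append("needs two independent approvers")
    # Out-of-band verification. Assumption: required for every request.
    if not req.callback_confirmed:
        blockers.append("needs out-of-band confirmation")
    # Code word for urgent payment instructions.
    if req.urgent and not req.code_word_ok:
        blockers.append("urgent request without code word")
    return blockers


if __name__ == "__main__":
    request = PaymentRequest(amount_eur=250_000, requested_by="ceo", urgent=True)
    for reason in release_blockers(request):
        print("BLOCKED:", reason)
```

The design point is that the gate only checks facts recorded outside the call itself. However convincing the voice or video is, it cannot supply a second approver, a callback, or the code word.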

Frequently Asked Questions

How do I detect a deepfake call?

Look for: unusual urgency, deviations from normal conversational style, background noises that do not match the alleged location, and the request to bypass established approval processes. If in doubt: hang up and call back on a known number.

Can anyone clone a voice?

Yes. The tools are freely available, sometimes as open source (TortoiseTTS, XTTS), sometimes as commercial services (ElevenLabs for $5/month). 3-10 seconds of audio material is enough for a convincing clone. The entry barrier is minimal.

Are deepfake attacks criminal?

Yes. In Germany, deepfake-based fraud attempts fall under § 263 StGB (fraud) and possibly § 269 StGB (forgery of data of probative value). At EU level, the AI Act adds transparency obligations for deepfakes. Prosecution is difficult, however, as attackers often operate from abroad.

What does deepfake detection cost for businesses?

Enterprise solutions like Pindrop (audio deepfake detection) start at 50,000 euros per year. Reality Defender offers API-based detection from 20,000 euros per year. For most businesses, procedural measures (four-eyes principle, callback verification) are both cheaper and more effective.

How do I protect my own audio/video material?

Completely preventing cloning is impossible if public appearances exist. However: reduce unnecessary material (does every keynote need to be on YouTube?). Use audio watermarking services. And most importantly: accept that cloning is possible and secure your processes accordingly.

Which industries are particularly at risk?

The financial sector (high transaction values), the real estate sector (large individual payments), law firms (trust accounts), and internationally operating companies with distributed teams (personal verification is difficult). In general: any organization where a single person can release large payments.

Source Title Image: Pexels / Markus Winkler
