Deepfake Voice
AI-generated audio that mimics a specific person's voice, used to impersonate them in fraud calls or authorise fraudulent transactions.
Also known as: voice cloning fraud, AI voice fraud, synthetic voice scam
Last reviewed: 1 June 2026
Deepfake voice (also called voice cloning or AI voice fraud) uses machine learning models trained on samples of a target individual's speech to synthesise new audio that sounds convincingly like them. Publicly available recordings from social media, podcasts, interviews, or earnings calls can give modern tools enough training data to produce realistic imitations.
In fraud contexts, deepfake voices are used to impersonate executives in authorisation calls (a variant of CEO fraud), to deceive family members in grandparent or emergency scams, or to defeat voice biometric authentication systems. A notable attack vector involves a synthesised voice instructing a finance team member to approve an urgent wire transfer during a supposed video or phone call.
As synthesis quality improves and generation costs drop, voice deepfakes are becoming accessible to low-sophistication attackers. Defences include establishing verbal code words for high-value authorisations, requiring confirmation over a second, independent channel (not just the original call) for large transfers, and implementing liveness and anti-spoofing checks in voice authentication systems.
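The code-word and multi-channel defences can be expressed as a simple release rule. The following is a minimal sketch in Python, not a reference implementation: the names `TransferRequest`, `may_release`, and `HIGH_VALUE_THRESHOLD`, the threshold value, and the channel labels are all hypothetical, and the sketch assumes that transfers below the threshold need no extra checks.

```python
import hmac
from dataclasses import dataclass, field
from typing import Optional, Set

# Hypothetical threshold; a real policy would set this per organisation.
HIGH_VALUE_THRESHOLD = 10_000


@dataclass
class TransferRequest:
    amount: float
    request_channel: str  # channel the instruction arrived on, e.g. "phone"
    # Channels on which the requester has confirmed the instruction.
    confirmed_channels: Set[str] = field(default_factory=set)
    # Code word given by the requester, if any.
    spoken_code_word: Optional[str] = None


def may_release(request: TransferRequest, expected_code_word: str,
                threshold: float = HIGH_VALUE_THRESHOLD) -> bool:
    """Return True only if the transfer passes the illustrative policy."""
    # Assumption: low-value transfers follow the normal process.
    if request.amount < threshold:
        return True

    # A confirmation on the same channel as the request proves nothing,
    # because a cloned voice controls that channel. Require another one.
    independent = request.confirmed_channels - {request.request_channel}
    if not independent:
        return False

    # Require the pre-agreed code word, compared in constant time.
    if request.spoken_code_word is None:
        return False
    return hmac.compare_digest(request.spoken_code_word.encode(),
                               expected_code_word.encode())
```

Under this rule, a convincing call alone never releases a high-value transfer: a request for 50,000 that arrives by phone is refused until the requester is also confirmed on a different channel, such as a call back to a number from the company directory, and gives the agreed code word.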
Examples
- A financial controller receives a call from what sounds exactly like the CFO, requesting an urgent overseas transfer; post-transfer investigation reveals it was a synthesised voice clone created from publicly available interview recordings.