Treasury News Network

AI-driven voice fraud poses a multimillion-dollar risk to organizations

A cloned voice. A convincing phone call. A seemingly routine—or urgent—funds transfer request from a CFO.

In today’s fraud landscape, that may be all it takes for an organization to lose millions of dollars.

Voice-based fraud—once viewed as a niche cybersecurity concern—is rapidly emerging as one of the most sophisticated threats facing corporations. Advances in artificial intelligence (AI) now allow malicious actors to replicate voices, impersonate executives, and manipulate employees in real time. What once required technical expertise and costly equipment can now be executed with inexpensive AI tools and only a few seconds of audio.

For corporate treasury and finance teams, where payment approvals, supplier verification, and executive communications may occur over the phone, the risks are becoming harder to ignore.

Voice fraud is no longer an emerging threat—it is already a real and growing risk

Evidence suggests voice-based fraud has moved from an emerging threat to a widespread operational risk.

The State of Voice-Based Fraud 2026: How Finance and Retail Leaders Are Fighting Back, a recent survey conducted by Modulate and Banking Dive among fraud, risk, IT, and customer experience (CX) leaders across financial services, retail, and insurance, found that 84% of organizations experienced moderately to highly sophisticated voice attacks in the past year.

These attacks come in many forms. According to the same report:

  • 87% of organizations reported fraudulent calls impersonating customers.
  • 84% encountered calls impersonating company leaders or employees.
  • 77% faced voice phishing (vishing) attacks targeting staff.
  • 74% experienced deepfake or voice-cloning incidents.

What makes these attacks particularly dangerous is their realism. AI-generated voices are becoming increasingly indistinguishable from authentic human voices, making traditional verification methods far less reliable.

The scale of the problem is also evident beyond corporate environments. According to Hiya’s State of the Call 2026 report, one in four Americans say they have received a deepfake voice call in the past 12 months, while another 24% are unsure whether they could distinguish such a call from a real one.

In other words, nearly half the population has either encountered AI voice fraud or cannot confidently identify it.

If individuals struggle to detect such calls, employees responsible for financial transactions may face an even greater challenge.

When the phone becomes a gateway to payments fraud

For corporates, voice fraud is not merely an annoyance—it is increasingly linked to payments fraud.

Attackers frequently use voice impersonation to pressure employees into making urgent financial decisions. In some cases, threat actors mimic the voice of a CEO, CFO, or senior executive and request immediate payment transfers or confidential financial information.

The Modulate and Banking Dive report highlights that voice interactions often serve as a gateway into broader financial systems. Criminals increasingly use voice calls to socially engineer access to digital accounts, bypass identity-verification controls, and manipulate payment processes.

In effect, the phone channel is becoming a proxy payment channel, as observed by Modulate’s experts.

This development reflects a broader shift in fraud tactics. Payments fraud is no longer confined to a single channel but has evolved into a coordinated, cross-channel threat that exploits human behaviour, digital vulnerabilities, and operational blind spots across the finance function.

For corporate treasury teams, that shift means fraud prevention must extend beyond systems and into communication channels as well.

The financial and operational costs are rising

The financial consequences of voice-based fraud are becoming increasingly visible.

According to the Modulate and Banking Dive survey:

  • 53% of organizations estimate the average cost of a voice fraud incident to be between US $5,001 and $25,000.
  • 18% say losses exceed $25,000 per incident.

Yet the direct financial loss is only part of the story.

Enterprises also face significant operational costs when responding to voice fraud. Nearly eight in ten firms spend at least 51 hours per year investigating voice fraud cases, while one in five spends between 201 and 500 hours annually on investigations.

“For large institutions with multiple fraud analysts across regions, those hours balloon into millions of dollars spent on investigation alone,” the Modulate and Banking Dive survey report notes.
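To see how investigation hours compound at scale, consider a back-of-envelope calculation. The hour range comes from the survey; the analyst headcount and fully loaded hourly rate below are illustrative assumptions, not figures from the report:

```python
# Back-of-envelope illustration of how annual investigation hours scale
# into cost across a large fraud team. The hourly rate and headcount are
# assumptions for the example only.

hours_low, hours_high = 201, 500  # annual hours range reported by one in five firms
hourly_rate = 75                  # assumed fully loaded analyst cost (USD/hour)
analysts = 40                     # assumed analysts across regions

annual_cost_low = hours_low * hourly_rate * analysts
annual_cost_high = hours_high * hourly_rate * analysts

print(f"${annual_cost_low:,} - ${annual_cost_high:,} per year")
# prints "$603,000 - $1,500,000 per year"
```

Under these assumptions, investigation time alone approaches the seven-figure range, before any direct fraud losses are counted.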

Voice fraud also places strain on customer-facing operations. The report found that:

  • 44% of organizations report customer complaints about verification processes.
  • 39% experience higher call-centre volumes linked to fraud concerns.
  • 38% report damage to customer trust and reputation, as well as increased training costs.

In sectors such as banking and financial services, where trust underpins every transaction, these reputational and operational impacts can be just as damaging as the financial losses themselves.

Strengthening defences against AI-driven voice fraud

Organizations are investing more heavily in voice fraud prevention, but many still struggle to keep pace with evolving threats.

The Modulate and Banking Dive survey identifies the most common controls currently in use. These include:

  • Employee training on social engineering (58%)
  • Multi-factor authentication for phone transactions (49%)
  • Real-time, AI-driven voice analysis and fraud detection tools (49%)
  • Verification callbacks to known numbers (49%)
  • Real-time supervisor monitoring of calls (47%)
  • Call-recording and post-incident analysis (45%)
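The verification-callback control above is simple to reason about in code. The sketch below is a minimal, hypothetical illustration (the directory contents and identifiers are invented for the example): a payment request received by phone is approved only after a callback to a number held in a pre-registered directory, never to a number the caller supplies:

```python
# Hypothetical sketch of a "callback to known numbers" control: approve a
# phone-initiated payment request only after calling back a number taken
# from a trusted, out-of-band directory (e.g. ERP master data).

TRUSTED_DIRECTORY = {  # maintained out-of-band; example entries only
    "cfo@example.com": "+1-555-0100",
    "supplier-042": "+1-555-0142",
}

def approve_by_callback(requester_id: str, caller_supplied_number: str,
                        callback_confirmed: bool) -> bool:
    """Return True only if the request may be approved."""
    trusted_number = TRUSTED_DIRECTORY.get(requester_id)
    if trusted_number is None:
        return False  # unknown requester: escalate, never approve
    if caller_supplied_number != trusted_number:
        # Caller asked to be reached on a different number -- a classic
        # social-engineering pattern, so treat the request as high risk.
        return False
    return callback_confirmed  # approve only after the callback succeeds
```

The key design point is that the callback number comes from the organization’s own records, so a cloned voice supplying an attacker-controlled number fails the check regardless of how convincing the call sounds.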

However, these authentication methods alone may no longer be sufficient. Bad actors are becoming more sophisticated and increasingly share tactics and information across criminal networks at speeds far faster than most organizations can detect or neutralize. AI-powered tools combined with social-engineering techniques are enabling attackers to bypass traditional security checks with greater ease.

Modulate experts argue that effective defence now requires multi-layered detection strategies that combine voice intelligence, behavioural monitoring, network forensics, device validation, and real-time fraud alerts.

In other words, fraud detection must move beyond verifying who is speaking to understanding the context of the conversation itself.
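One way to picture a multi-layered approach is as several independent signals feeding a single risk decision. The sketch below is illustrative only, not a product API: the signal names, weights, and threshold are arbitrary placeholders standing in for the voice-intelligence, behavioural, network, and device layers described above:

```python
# Illustrative sketch of multi-layered voice-fraud detection: several
# independent signals each contribute to a risk score, and the call is
# escalated when the combined score crosses a threshold. All weights and
# the threshold are placeholder assumptions.

from dataclasses import dataclass

@dataclass
class CallSignals:
    synthetic_voice_score: float   # 0..1 from a voice-analysis model
    unusual_request_pattern: bool  # behavioural: e.g. urgent off-hours transfer
    number_reputation_bad: bool    # network forensics / carrier reputation data
    device_mismatch: bool          # device validation failed

def risk_score(s: CallSignals) -> float:
    """Combine the layered signals into a single 0..1 risk score."""
    score = 0.5 * s.synthetic_voice_score
    score += 0.2 if s.unusual_request_pattern else 0.0
    score += 0.15 if s.number_reputation_bad else 0.0
    score += 0.15 if s.device_mismatch else 0.0
    return score

def should_escalate(s: CallSignals, threshold: float = 0.5) -> bool:
    """Flag the call for human review when the combined score is high."""
    return risk_score(s) >= threshold
```

The point of layering is that no single check is decisive: a convincing cloned voice alone may slip past the voice model, but it is far less likely to also match the expected device, network profile, and behavioural pattern at the same time.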

Conclusion

Voice-based fraud illustrates how rapidly the financial crime landscape is evolving.

AI has lowered the barriers for criminals while dramatically increasing the technical sophistication of their attacks. What once required specialized expertise can now be carried out at scale with readily available tools.

For corporates, the implications are clear. The voice channel—long considered one of the most trusted forms of communication—can no longer be assumed secure.

Organizations that strengthen verification processes, invest in advanced fraud detection technologies, and train employees to recognize voice-based manipulation will be better positioned to protect their payments infrastructure.

In the age of AI-generated voices, the next multimillion-dollar fraud may not begin with a hacked system—but with a cloned phone call.
