After years of pilots and proofs of concept, the boardroom conversation around AI has shifted. Executives are no longer asking whether AI can transform banking. They're asking when it will start delivering returns.
The mandate is clear: show ROI, or risk falling behind.
The pressure is mounting: 61% of CEOs say they face more pressure to demonstrate returns on AI investments than they did a year ago. Yet the gap between expectation and reality remains stark: only 14% of CFOs report measurable ROI from AI initiatives.
Meanwhile, most banks are stuck somewhere between the promise and the proof: only one in four banks worldwide is actively using AI to gain competitive advantage. More than half have piloted agentic AI, yet just 16% have moved use cases into production. The rest - 84% of the industry - are still running experiments that never translate into value.
The pilot trap: why banks get stuck in AI pilot mode
Pilots feel like progress. They generate excitement, showcase innovation, and create the appearance of momentum. Marketing experiments with content generation. Operations tests chatbots. Risk explores fraud detection models.
Each initiative shows promise in isolation, but without central oversight, they create an ungovernable patchwork - tools that can't talk to each other, capabilities stuck in silos, and a fragile foundation that collapses under the weight of real-world complexity.
Mistake #1: the efficiency-only portfolio
Most institutions approach AI primarily through an efficiency lens. Cost reduction is tangible, measurable, and easy to model. So banks load up on use cases designed to cut headcount, reduce call center volume, and automate routine tasks.
One U.S. bank was running fourteen chatbot pilots simultaneously, all focused on reducing costs. Not a single one addressed onboarding, fraud prevention, or revenue generation.1
This creates an unbalanced portfolio. Efficiency gains matter, but they represent only one lever.
The banks seeing the greatest returns have built balanced portfolios spanning three categories: efficiency, growth, and resilience. Each category reinforces the others, creating compound returns that efficiency-only strategies can't match. Contact center copilots reduce average handle time. Personalized nudges increase funded accounts and deposits per customer. Fraud agents cut losses and strengthen compliance.
Banks don't need fifty pilots. They need the right five, executed well and scaled deliberately. Studies from McKinsey and Accenture consistently show that 75% of AI value comes from just 10 to 15 high-impact use cases.1
The impact of getting this balance right is measurable: 15 to 20% efficiency gains, 20% increases in revenue per client, and 10% reductions in fraud losses. Combined, that represents a 45% swing in P&L terms.1
Mistake #2: building on broken foundations
AI needs modern foundations. Without them, pilots stay pilots. No amount of sophisticated machine learning can compensate for infrastructure that was designed for a different era entirely.
Banks find themselves trying to build the future while simultaneously keeping decades-old systems operational. Some 90 to 95% of IT spending remains locked in legacy maintenance rather than innovation.1 Each AI initiative becomes harder to deploy than the last because it must navigate legacy architecture, fragmented data flows, and systems that were never designed to work together.
The banks moving fastest have recognized a fundamental truth: you can't scale AI on foundations built for batch processing and siloed data. They're modernizing the platform first, then deploying use cases on top of it.
Organizations with proper AI platforms report reusing 50 to 60% of their work when building subsequent use cases.1 Deployment timelines shrink from months to weeks. Costs come down as capabilities are spread across use cases. Scale becomes possible - not because the technology improved, but because the foundation finally supports it.
Mistake #3: metrics that don't matter to the CFO
The path from pilot to profit runs through the CFO's office. Yet many AI initiatives can't connect to metrics that matter at the board level.
The use cases that drive P&L impact fall into three categories: revenue uplift, operational efficiency, and risk resilience.
The banks that scale AI successfully don't treat these categories as separate initiatives. They build integrated strategies where efficiency gains free up capacity for growth investments, while resilience improvements protect the value being created.
More importantly, they force discipline by connecting every use case to a specific metric the CFO tracks: nudges for deposits and engagement connect to funded accounts and deposits per customer; wealth copilots tie to assets under management per adviser; contact center automation maps to average handle time and cost to serve; fraud agents demonstrate reductions in losses and compliance costs.1
This approach transforms the conversation. It's not enough to show that an AI model works in a controlled environment. The question is whether it moves a metric that executives track and care about.
The pattern that works: what leading banks do differently
Leading banks are proving value in 90 days. Not through shortcuts or superficial implementations, but through focused execution on high-impact use cases built on modern foundations. They select the right pilots, connect them to the right metrics, and scale them through the right operating models.
JPMorgan Chase's AI ROI is heading toward $2 billion. Its generative AI platform has been rolled out to more than 200,000 employees - 125,000 of whom use it daily. Morgan Stanley's coding agent has saved developers 280,000 hours.
These aren't experimental results from controlled pilots. They're production-scale deployments generating measurable business impact.
The institutions pulling ahead share four characteristics that separate execution from experimentation:
First, they've moved beyond pilots to transform critical business areas. They're deploying use cases that move the needle on revenue, efficiency, and risk. The focus is narrow. The execution is disciplined. The impact is measurable.
Second, they've adopted centralized operating models that enable coordination and scale. Without a platform to build on and a coordinated strategy to follow, individual pilots remain exactly that - individual, isolated, and ultimately inconsequential. Research shows that 70% of banks with highly centralized AI operating models have progressed to putting use cases into production. Only 30% of those with fully decentralized approaches have achieved the same.
Third, they've built platforms rather than point solutions. Foundational capabilities get built once and reused across multiple use cases, enabling 50 to 60% reuse rates and faster deployment cycles. This architectural discipline creates compounding advantages as each new use case becomes easier to deploy than the last.
Fourth, they've connected every initiative to metrics that matter at the board level. Every use case maps to a CFO-tracked metric. No fuzzy ROI. No ambiguous value propositions. Just clear lines of sight from AI investment to business impact.
1 Backbase Value Consulting, AI ROI Blueprint Webinar, 2025.