AI Implementation Challenges for Banks Explained

Why AI initiatives keep stalling inside banks

Most banks have run AI pilots. Most of those pilots haven't scaled. The common diagnosis points to data quality or talent gaps, but those explanations miss the structural problem sitting underneath. Fragmented frontline infrastructure means AI agents operate without a unified view of the customer. They follow inconsistent rules across systems and write results back to different places. The result is the same disorder banks already had, now executing faster and with less oversight.

The fragmentation problem is hard to see because it hides in the whitespace between systems - handoffs, exceptions, and manual coordination that no single system owns. Around 50% of frontline banking work lives there. This is precisely where AI opportunities are highest, and also where operational risk concentrates. Dropping AI agents into that environment without a coordination layer doesn't reduce that exposure - it multiplies it.

Every challenge covered in this post connects back to that structural reality. Banks have the ambition. What they're missing is the coordination infrastructure agents need to act on it - a control plane that gives agents unified customer context, consistent policy authority, and a shared source of truth. Until that exists, AI at scale stays out of reach.

Challenge 1: Fragmented frontline infrastructure turns AI agents into chaos engines

Most banks deploy AI agents on top of existing infrastructure without changing the infrastructure itself. That decision has a predictable result. Each agent pulls customer data from a different source, applies rules that don't align with what a parallel agent is doing, and writes outcomes back to separate systems. The result is the same disorder banks already had, running faster and with less visibility into what went wrong.

The problem is structural. An AI agent can only act well when it has unified customer context, a shared source of truth, and clear authority to make decisions. Across the 120+ bank implementations we've worked through, those three conditions are consistently absent, because fragmented infrastructure denies all of them. Without a single consistent view of the customer, an agent makes decisions on partial data. Without shared rules, two agents handling the same customer reach different conclusions. Without a defined write-back target, no system of record stays accurate, and governance becomes impossible - not harder, impossible.

According to McKinsey, most AI transformations stall at scale due to missing operating model foundations rather than model capability. Adding more capable models on top of disconnected systems doesn't resolve that - it deepens it.

Challenge 2: Legacy architectural mismatch goes deeper than technical debt

Most conversations about legacy systems focus on maintenance costs and integration timelines. That misses the real problem. As Valbona Dhjaku put it on the bankingReinvented podcast: "The real challenge in my expertise is much deeper. Most banks, as we know, in Albania, not only in Albania maybe, across markets, still rely on legacy monolithic core systems that were designed, built in a time where the current way of processing payments did not exist." That isn't a maintenance problem. It's a foundational architectural mismatch between what the core was built to do and what modern banking requires.

When you layer AI agents on top of a core that was never designed for real-time payment processing, you don't fix the mismatch - you accelerate it. Each agent operates without unified customer context. There's no single authority to write decisions back to. Policy enforcement depends on whichever system happens to be queried first. The result isn't automation - it's disconnected execution at higher speed.

The structural fix isn't replacing the core. It's building a control plane above it - one where every actor, human or AI, reads from and writes back to the same record, under the same rules. Without it, the architectural problem that existed before AI simply becomes harder to govern once agents are running inside it.
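To make the idea concrete, here is a minimal sketch of a control plane in which every actor reads from and writes back to the same record under the same rules. It is an illustration under stated assumptions, not a description of any product: the names ControlPlane, Actor, CustomerRecord, and the example rule are hypothetical, and a real deployment would sit in front of the core systems rather than hold data in memory.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# All names below are hypothetical and chosen for illustration only.


@dataclass(frozen=True)
class Actor:
    actor_id: str
    kind: str  # "customer", "employee", or "agent"


@dataclass
class CustomerRecord:
    customer_id: str
    fields: Dict[str, object] = field(default_factory=dict)
    version: int = 0  # incremented on every accepted write


# A rule receives the actor, the field name, and the new value,
# and returns True when the write is permitted.
Rule = Callable[[Actor, str, object], bool]


class ControlPlane:
    """One record per customer, one rule set for every actor."""

    def __init__(self, rules: List[Rule]):
        self._records: Dict[str, CustomerRecord] = {}
        self._rules = rules

    def read(self, actor: Actor, customer_id: str) -> CustomerRecord:
        # Every actor gets the same record; the actor argument is kept
        # so that read-side rules could be added later.
        return self._records.setdefault(customer_id, CustomerRecord(customer_id))

    def write(self, actor: Actor, customer_id: str, key: str, value: object) -> bool:
        # The same rules apply regardless of which actor is writing.
        if not all(rule(actor, key, value) for rule in self._rules):
            return False
        record = self.read(actor, customer_id)
        record.fields[key] = value
        record.version += 1
        return True


def only_employees_set_credit_limit(actor: Actor, key: str, value: object) -> bool:
    # Hypothetical policy: agents may not change credit limits.
    return key != "credit_limit" or actor.kind == "employee"


plane = ControlPlane([only_employees_set_credit_limit])
agent = Actor("agent-1", "agent")
banker = Actor("emp-7", "employee")

agent_ok = plane.write(agent, "c-100", "credit_limit", 5000)    # rejected by the rule
banker_ok = plane.write(banker, "c-100", "credit_limit", 5000)  # accepted
note_ok = plane.write(agent, "c-100", "preferred_channel", "mobile")
record = plane.read(agent, "c-100")
```

The design choice worth noticing is that the rule set lives in the control plane, not in the agent. Whichever actor attempts the write, the outcome depends on one policy and lands in one record.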

Challenge 3: The whitespace problem where 50% of frontline work hides

About half of all frontline banking work happens in the whitespace between systems. Handoffs, exceptions, manual coordination steps - none of these belong to any single platform. No CRM owns them. No core banking system tracks them. They exist in emails, spreadsheets, and verbal agreements between colleagues. This is where operational risk accumulates quietly, and where, across more than 20 years building with banks, we've seen the most consistent blind spots in AI deployment planning.

It's also exactly where AI opportunity looks most attractive. Automating exception handling and cross-team coordination sounds like an obvious win. But AI agents can only act on what they can see, verify, and write back to a shared record. In fragmented infrastructure, none of that is available. Each agent operates on a partial view of the customer, with no single source of truth and no system holding the authority to commit a decision across the full interaction chain.

The result isn't slow automation - it's ungoverned action at speed. An agent resolving a complex exception in the whitespace, without unified customer context or authorized decision authority, can't be audited, reversed, or governed. Compliance teams can't reconstruct what happened or why. That makes AI deployment in the whitespace the highest-stakes coordination failure in banking operations today. The problem isn't agent capability. It's the absence of a coordination layer the agent can trust.

Challenge 4: Ungoverned agent authority makes compliance structurally impossible

Jouk Pleiter puts the stakes directly: "If you don't solve the guard function, I don't see AI at scale in banks at all. I basically see the risk and compliance argument paralyzing innovation." That warning isn't about regulators being unreasonable. It's about what happens when AI agents operate without defined authority, policy scope, or a single source of truth to write decisions back to. Compliance becomes structurally impossible - not just harder to manage.

The problem is architectural. An AI agent needs unified customer context, a shared source of truth, and explicit decision authority to act in a governed way. Fragmented frontline infrastructure provides none of these reliably. When an agent pulls context from one system, checks policy in another, and writes outcomes to a third, there's no auditable chain of authority. Regulators can't examine what they can't trace. Compliance teams can't approve what they can't reconstruct.
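One way to picture an auditable chain of authority is a single append-only log in which the context read, the policy check, and the write-back are linked together. The sketch below uses hash chaining for that linkage; this is our illustrative choice of technique, not something the banks or vendors mentioned here are known to use, and AuditTrail and AuditEntry are hypothetical names.

```python
import hashlib
import json
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class AuditEntry:
    agent_id: str
    step: str        # e.g. "read_context", "check_policy", "write_outcome"
    detail: str
    prev_hash: str   # hash of the previous entry, linking the chain
    entry_hash: str  # hash of this entry's content plus prev_hash


def _digest(agent_id: str, step: str, detail: str, prev_hash: str) -> str:
    payload = json.dumps([agent_id, step, detail, prev_hash])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class AuditTrail:
    """Append-only log; each entry commits to the one before it."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self.entries: List[AuditEntry] = []

    def record(self, agent_id: str, step: str, detail: str) -> AuditEntry:
        prev = self.entries[-1].entry_hash if self.entries else self.GENESIS
        entry = AuditEntry(agent_id, step, detail, prev,
                           _digest(agent_id, step, detail, prev))
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        # Recompute every hash; any edited or reordered entry breaks the chain.
        prev = self.GENESIS
        for e in self.entries:
            expected = _digest(e.agent_id, e.step, e.detail, prev)
            if e.prev_hash != prev or e.entry_hash != expected:
                return False
            prev = e.entry_hash
        return True


trail = AuditTrail()
trail.record("agent-1", "read_context", "customer c-100, record version 2")
trail.record("agent-1", "check_policy", "fee_waiver allowed up to 50")
trail.record("agent-1", "write_outcome", "fee_waiver 25 applied")
intact = trail.verify()

# Simulate someone rewriting the policy step after the fact.
trail.entries[1] = replace(trail.entries[1], detail="fee_waiver allowed up to 500")
tampering_detected = not trail.verify()
```

When the three steps happen in three systems with three separate logs, there is nothing comparable to verify. A compliance team can only reconstruct the decision if all three steps were recorded in one place, in order.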

Banking OS, Backbase's name for the control plane described in this post, introduces a third actor class alongside customers and employees: AI agents. That requires banks to define what each agent is authorized to do, under what conditions, and within what limits. This governance layer isn't a process added on top of existing infrastructure - it's a property of the infrastructure itself. Without a control plane that coordinates authority across all three actors, every agent deployment is a compliance risk waiting to materialize - and that's exactly the argument that kills AI programs before they scale.

Challenge 5: Vendor fragmentation means no single agent ever has the full picture

Most banks source AI tools from multiple vendors. One vendor handles customer service automation, another runs credit decisioning, a third powers the advisor desktop. Each agent operates on its own data slice, follows its own rule set, and writes results back to a different system. No agent ever sees the complete customer picture, and that's not a procurement oversight - it's a structural coordination failure.

When agents work from partial data, they act on incomplete context. When they follow inconsistent rules, policy authority breaks down. When they write back to disconnected systems, the source of truth fractures further with every interaction. The outcome isn't automation at scale - it's the same disorder produced faster, with more decisions made on shakier ground than in any human process before it.

Governance becomes structurally impossible in this environment. Gartner identifies lack of AI governance infrastructure as a primary barrier to enterprise AI deployment at scale - something we've seen confirmed across bank after bank in our own work. Fragmented infrastructure can't provide the unified context and clear decision authority agents require. Without a coordinating control plane sitting above the disconnected parts, each new agent added to the stack compounds the problem rather than solving it.

Challenge 6: Pilot purgatory is an infrastructure failure, not an ambition gap

Banks aren't stuck in pilot purgatory because they lack AI talent. They're stuck because a pilot works inside a narrow, bounded environment that can simulate unified context. Scale that same agent across customers, employees, and product lines simultaneously, and the absence of a coordination layer becomes fatal. What looked like a working AI solution was a workaround operating in controlled conditions.

The structural problem is specific. Around 50% of frontline banking work lives in the whitespace between systems - handoffs, exceptions, and manual coordination that no single system owns. That's precisely where AI opportunities and operational risk are both highest. When a pilot doesn't touch that whitespace, it succeeds. When scaled AI inevitably does touch it, there's no authoritative source to read from or write back to, and the result isn't slow automation - it's accelerated disorder.

Fixing this means treating AI agents as a distinct actor class requiring explicit authorization. Every agent needs defined authority, clear limits, and a governance layer that answers compliance questions before they become incidents. That coordination infrastructure - sitting above the core, connecting customers, employees, and agents - is the missing ingredient. Without it, scaling AI doesn't compound value - it compounds the problems banks already can't control. Understanding what AI-native banking requires structurally is the starting point for getting this right.
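As a rough illustration of "defined authority and clear limits", the sketch below gives an agent an explicit scope and checks every proposed action against it before anything is executed. The names AgentScope, Decision, and authorize, the action names, and the amounts are all hypothetical; real scopes would be far richer and would come from the bank's own policy definitions.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AgentScope:
    agent_id: str
    # Hypothetical scope: each permitted action maps to the maximum
    # amount the agent may approve on its own.
    limits: Dict[str, float]


@dataclass(frozen=True)
class Decision:
    outcome: str  # "allow", "escalate", or "deny"
    reason: str


def authorize(scope: AgentScope, action: str, amount: float) -> Decision:
    """Answer the compliance question before the action runs."""
    if action not in scope.limits:
        return Decision("deny", f"{action} is outside the scope of {scope.agent_id}")
    limit = scope.limits[action]
    if amount > limit:
        # Within the agent's remit but above its limit: hand off to a human.
        return Decision("escalate", f"{amount} exceeds limit {limit} for {action}")
    return Decision("allow", f"{amount} is within limit {limit} for {action}")


service_agent = AgentScope("agent-1", {"fee_waiver": 50.0})

small_waiver = authorize(service_agent, "fee_waiver", 25.0)
large_waiver = authorize(service_agent, "fee_waiver", 80.0)
credit_change = authorize(service_agent, "credit_limit_change", 1000.0)
```

The three outcomes matter more than the numbers. An action is allowed, escalated to a person, or denied outright, and each result carries a reason that can be recorded at the moment of decision rather than reconstructed afterwards.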

The structural fix: a control plane that governs customers, employees, and AI agents as one system

The six challenges covered in this post share a common root: fragmented infrastructure. Banks deploying AI agents on disconnected systems don't get automation - they get chaos at higher speed. Each agent operates on partial data, follows inconsistent rules, and writes results back to different systems. No audit trail is coherent. No policy is consistently enforced. Governance becomes structurally impossible, not just difficult.

The architectural answer is a control plane that sits above the core and coordinates execution across customers, employees, and AI agents as a single system. Backbase calls this Banking OS. It introduces AI agents as a third class of actor alongside humans - and that distinction matters. Every agent must have explicitly authorized decision scope, a defined source of truth to read from and write back to, and unified customer context at the point of action. Without those properties built into the infrastructure, governance is an audit afterthought rather than an engineering property.

This is not a core replacement. Banks don't need to rip out existing systems to fix the coordination problem. They need a layer above those systems - one where every actor, human or AI, reads from and writes back to the same record, under the same rules. That layer is what makes compliance auditable, what makes AI decisions explainable, and what converts fragmented pilots into scaled operations that risk and compliance functions can approve. BCG research confirms that banks achieving AI scale build coordinating architecture first, before expanding agent scope.

Banks that treat AI implementation as a sequenced checklist - fix data, then hire talent, then add governance - will keep running into the same wall. Banks that deploy a coordination control plane first find that governance, auditability, and scaled automation arrive together rather than in conflict.

Frequently asked questions

What is the most common reason AI implementations fail in banks?

The most common reason is fragmented frontline infrastructure, not data quality or talent gaps. When AI agents lack unified customer context, consistent policy authority, and a shared source of truth, they cannot coordinate decisions across systems. Pilots succeed in narrow conditions but collapse when scaled across real customer operations.

Why can't banks just deploy AI agents on top of their existing core banking systems?

Many legacy cores were designed before today's real-time payment processing existed, so layering agents on top accelerates the architectural mismatch rather than fixing it. Each agent pulls from different data sources, applies inconsistent rules, and writes back to separate systems. The result is ungoverned execution at higher speed, not automation.

How does regulatory uncertainty specifically block AI from scaling inside a bank?

As Jouk Pleiter warns, the risk and compliance argument paralyzes innovation when the guard function isn't solved, that is, when agents lack auditable authority. Regulators cannot examine decisions they cannot trace. When an agent draws context from one system and writes outcomes to another, no complete chain of authority exists, making compliance structurally impossible rather than just difficult.

What is a Banking OS and how does it differ from replacing a legacy core?

A Banking OS is a control plane that sits above the existing core rather than replacing it. It coordinates execution across customers, employees, and AI agents using a consistent source of truth and defined authority boundaries. Banks keep existing infrastructure while gaining the coordination layer that makes governed AI deployment possible.

How do banks move AI projects out of pilot purgatory and into production at scale?

Banks escape pilot purgatory by building coordination infrastructure before expanding agent scope. A pilot succeeds because it operates in bounded conditions that simulate unified context. Scaling requires a control plane that gives every agent explicit authorization, defined limits, and a single source of truth to read from and write back to.
