A company moved a core ledger batch run to AWS.
It never finished on time again.
The compute was faster. The clock speed was higher. The infrastructure was modern. The batch window still broke.
The reason is physics.
A batch window is the time between the close of business and the start of the next business day during which the system can process everything that happened. Settlement. Reconciliation. Posting. Downstream file generation. Interest calculations. Regulatory reporting. All of it runs in sequence, each step dependent on the previous one completing correctly.
The window is not generous. In financial services, the batch window is typically four to six hours. Miss it and the next business day starts with yesterday’s data. In banking, that is not a technical inconvenience. It is a regulatory incident.
The mainframe was designed around this constraint. Not adapted to it after the fact. Designed for it from the beginning.
IBM’s System/360, introduced in 1964, was built with a channel architecture that separated I/O processing from CPU processing. The CPU did not wait for data. Data moved through dedicated channels at memory-bus speeds while the processor kept working.
Six decades of refinement later, the z17 – IBM’s current generation mainframe, now generally available – processes billions of financial transactions daily at memory-bus speeds, with AI inferencing built directly into the processor so it runs in line with the transaction rather than on a separate network hop.
This is not incremental improvement. It is sixty years of compounding the same fundamental insight: at scale, the speed of data access determines everything.
One disclosure: I am not an IBM employee. IBM does not pay me. I have been building mainframe software independently since 2004 and have no financial relationship with IBM. What I have is 35 years of watching the physics win.
The company that moved its core ledger to AWS did not make an engineering mistake. They made a physics mistake.
The re-platformed COBOL ran on a managed runtime. The data layer was RDS. The same workload that finished in four hours on a zSeries took six to eight hours on AWS – and that was on a good night.
The compute was faster by raw clock speed. The problem was not the compute.
A standard AWS same-region hop between EC2 and RDS runs one to four milliseconds. That sounds like nothing. But a sequential batch pays that hop once per dependent operation, and the hops add up: at one millisecond per round trip, ten million sequential operations are nearly three hours of pure network wait before any computation happens.
And that is on a good night.
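The arithmetic is worth doing explicitly. A minimal sketch – the record count and latencies below are illustrative assumptions, not measurements from the migration described above:

```python
# Back-of-envelope: what a per-operation round trip does to a batch window
# when operations are sequential and dependent, so latencies add rather
# than overlap. All numbers are illustrative assumptions.

def batch_hours(records: int, round_trip_ms: float) -> float:
    """Total wait, in hours, for `records` sequential round trips."""
    return records * round_trip_ms / 1000 / 3600

records = 10_000_000  # one night's postings (assumed)

# Memory-bus-style access vs. a same-region network hop.
for rtt_ms in (0.001, 1.0, 4.0):
    print(f"{rtt_ms:>6} ms per hop -> "
          f"{batch_hours(records, rtt_ms):6.2f} hours of waiting")
```

At one millisecond the waiting alone is roughly 2.8 hours; at four milliseconds it exceeds eleven – larger than the entire batch window before a single instruction of business logic runs.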
The first answer anyone offers is caching. Put the data in memory. Eliminate the network hop. Problem solved.
It is the wrong answer.
Settlement batch is not read-heavy OLTP. It is sequential, stateful, and write-committed at every step. You cannot cache a double-entry posting run because you do not know what the next record looks like until you have finished processing the current one.
Each step produces output that becomes the input to the next step. The computation is inherently sequential. There is no read pattern to cache because the read pattern depends on the write that just happened.
This is not a limitation of the implementation. It is the nature of the workload. Double-entry bookkeeping is sequential by definition. It has been sequential since Luca Pacioli described it in 1494.
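The dependency is easy to see in code. A toy posting run – the accounts, amounts, and signed-balance convention are invented simplifications, not a real ledger implementation:

```python
# Why caching cannot rescue a posting run: every step reads state that the
# previous step just wrote. Toy double-entry ledger; the signed-balance
# convention (debit negative, credit positive) is a simplification.

from decimal import Decimal

def post_run(transactions):
    balances = {}  # the state every step both reads and writes
    for debit_acct, credit_acct, amount in transactions:
        amt = Decimal(amount)
        # Each read depends on every posting before it -- there is no
        # stable read set to warm a cache with.
        balances[debit_acct] = balances.get(debit_acct, Decimal(0)) - amt
        balances[credit_acct] = balances.get(credit_acct, Decimal(0)) + amt
    return balances

txns = [("cash", "revenue", "100.00"),
        ("expenses", "cash", "40.00"),
        ("cash", "revenue", "25.00")]
print(post_run(txns))  # balances always sum to zero: the run is one chain
```

Reorder or parallelize the loop and the invariant still holds, but any step that reads a balance before an earlier posting has committed reads the wrong number – which is exactly why each record must wait for the round trip of the one before it.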
The mainframe does not cache this workload. It processes it at memory-bus speeds because the channel architecture moves data without waiting for the network. AWS does not have a channel architecture. It has a network. The batch window breaks.
The migration business case typically compares CPU cost per core, memory cost per gigabyte, storage cost per terabyte, and license cost per instance. These comparisons are legitimate. For many workloads, cloud wins on all four.
The comparisons that do not appear in most migration business cases: data access latency at scale, sequential write throughput under committed transaction semantics, and memory-bus speed versus network round-trip time.
These comparisons are harder to model. They require understanding what the workload actually does, not just what it costs.
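The shape of that harder model can still be sketched. A hypothetical window-fit check – every input below (record count, per-record compute, latencies, window length) is an assumption chosen to illustrate the structure, not a benchmark of any platform:

```python
# The comparison the business case usually omits: will the workload fit
# the window? Sequential batch means every record pays its compute time
# plus one data round trip, and the terms add. Inputs are assumptions.

def fits_window(records: int, round_trip_ms: float,
                compute_ms_per_record: float, window_hours: float):
    """Return (estimated hours, True if the run fits the window)."""
    total_hours = records * (round_trip_ms + compute_ms_per_record) / 3_600_000
    return total_hours, total_hours <= window_hours

records = 10_000_000
window = 6.0  # hours, the upper end of the typical window cited above

for label, rtt in (("memory-bus access", 0.001),
                   ("same-region network", 4.0)):
    hours, ok = fits_window(records, rtt, 0.05, window)
    print(f"{label:>20}: {hours:5.2f} h -> "
          f"{'fits' if ok else 'misses the window'}")
```

Cost per core never appears in this model, yet it is the model that decides whether the next business day starts with today's data or yesterday's.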
The Gartner analyst who observed VMware refugees finding mainframe cheaper than Broadcom alternatives was not making a sentimental argument. He was making a workload argument. For specific workloads, at specific scales, with specific access patterns, the physics of the mainframe is simply better. The cost follows the physics.
IBM did not achieve the highest IBM Z revenue in twenty years by selling nostalgia.
The z17, powered by the Telum II processor, delivered the launch IBM’s CFO described as a record. That was not a surprise to anyone who understood the workloads.
Organizations are not buying the z17 because they are afraid to change. They are buying it because they ran the numbers, modeled the workloads, and discovered that for settlement, reconciliation, and core banking batch processing, nothing else closes the window.
The channel architecture that IBM started building in 1964 is still the reason the batch window closes on time on modern systems. The z17 did not replace it. It refined it again – added AI inferencing at memory-bus speeds, added quantum-safe cryptography, added the Spyre accelerator for generative AI workloads.
The platform evolved. The physics did not change.
Not every workload belongs on mainframe. Not every workload belongs on AWS. The question is never “which platform is better” in the abstract. The question is which platform is better for this workload, at this scale, with these access patterns.
For customer-facing APIs, microservices, content delivery, and variable workloads – cloud is often the right answer.
For core ledger settlement, end-of-day reconciliation, and sequential, stateful batch processing at a scale of millions of transactions – the physics of the mainframe is difficult to match at competitive cost.
The companies that sleep peacefully are the ones who understood this distinction before they signed the migration contract, not after the batch window missed for the third consecutive night.
The company that moved its core ledger to AWS did not fail because they chose the wrong vendor. They failed because they modeled the cost and not the physics.
The batch window is not a legacy constraint you migrate away from. It is a physics contract between your workload and your platform.
Move the workload and you move the contract. The window does not follow you. The physics stays behind.
The z17 is generally available. The channel architecture is sixty years old. IBM Z revenue is the highest in twenty years. The physics was always going to win.
Also worth reading: People Can Sleep Peacefully With Mainframe · So You Decided Mainframe Is Cheaper Than Broadcom. Now What? · Why AI Pilots Succeed and Productions Fail