IBM, the Mainframe and Stream Processing
Mainframes Are Growing
Less than two weeks ago, IBM completed the acquisition of Confluent (including a few lay-offs) — the company that, a couple of years ago already, stated that its “final and ultimate vision is to replace the mainframe with new applications using modern and less costly technologies. Stand up to the dinosaur, [...]” (https://www.confluent.io/online-talks/offloading-and-replacement-with-apache-kafka/).
IBM, on the other hand, is the company accounting for more than 90% of installed mainframe systems. In 2024, mainframes were still used by 71% of Fortune 500 companies. They handle 90% of all credit card transactions, 4 out of the top 5 airlines run on them, and they carry 68% of the world’s production IT workloads (https://www.rocketsoftware.com/en-us/insights/mainframe-turns-60).
You might be, as I have been until today, fooled into thinking that the mainframe market is shrinking. In fact, surprise, it is growing! In Q4/2025, IBM’s mainframe business recorded its best fourth-quarter revenue in more than 20 years. Revenue soared 61% year over year, adjusted for currency, driving a 17% increase in the infrastructure segment (https://www.fool.com/investing/2026/01/31/this-outdated-ibm-technology-just-did-something-it/). For FY 2025, IBM made about $15B (of close to $70B overall) in its “infrastructure” segment, mainly driven by the mainframe business.
Now, what about streaming and stream processing? We know that Confluent — by far the biggest player in the space — made an impressive $1.17B in revenue last year. But even if we add the other, far smaller vendors, IBM still makes about 10x more money with mainframes than all vendors in the streaming space combined.
If you are a streaming engineer working in the financial sector or at an airline, these numbers are, when you think about it, not that surprising at all. In most of these industries, streaming is only built around those mainframe systems, often only for specific use cases such as fraud detection or lakehouse ingestion. Confluent’s “final and ultimate vision” to actually replace mainframes still seems lightyears away.
But why? The standard answers are cultural: Nobody wants to touch legacy systems, organisational inertia. All true. But there’s a technical reason that nobody talks about openly, and it’s even more fundamental than any of these.
Stream processing has never guaranteed that its outputs are correct.
Not “correct under failure conditions.” Not “correct eventually.” Correct at any given point in time.
“Consistency” vs. “Correctness”
When you talk about “consistency” in the database world, you mean, in broad strokes, “correctness”: At any point in time, the state of the database reflects a valid transition from one correct state to another. If you debit account A and credit account B in a single transaction, any observer who reads the database at any time sees a correct result.
When you talk about “consistency” in classical stream processing systems like Flink or Kafka Streams, you most often talk about “exactly-once semantics”: Each input event is processed exactly once, with no duplicates or losses. But that’s a delivery guarantee — it tells you nothing about whether the outputs at any given moment are correct!
In 2021 already, Jamie Brandon wrote a blog post called “Internal Consistency in Streaming Systems.” He showed, with a simple credits-and-debits example, that Flink and Kafka Streams produce loads of outputs that are actually incorrect (https://www.scattered-thoughts.net/writing/internal-consistency-in-streaming-systems/).
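Jamie’s setup can be reduced to a few lines of plain Python (a minimal sketch, not actual Flink or Kafka Streams code; the account names and amounts are made up): each transaction debits one account and credits another, so the sum of all balances must be zero at all times. A pipeline that processes the debit and credit halves as independent events emits intermediate states that never existed in any correct database:

```python
# Sketch of the credits-and-debits example: each transaction debits one
# account and credits another, so the sum of all balances must always be 0.
from collections import defaultdict

transactions = [
    ("alice", "bob", 100),   # alice pays bob 100
    ("bob", "carol", 40),    # bob pays carol 40
]

# Split each transaction into two independent events, the way a streaming
# pipeline with separate debit and credit branches would see them.
events = []
for src, dst, amount in transactions:
    events.append((src, -amount))  # debit
    events.append((dst, +amount))  # credit

balances = defaultdict(int)
observed_totals = []
for account, delta in events:
    balances[account] += delta
    # Every intermediate state is emitted downstream and thus observable.
    observed_totals.append(sum(balances.values()))

print(observed_totals)  # [-100, 0, -40, 0]
```

The totals -100 and -40 were emitted downstream, yet no correct database state ever had a nonzero total. That is exactly the internal inconsistency Jamie measured.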
When I read Jamie’s blog post in early 2023, my perspective on stream processing turned upside down. And I was hooked. In our O’Reilly book “Streaming Databases”, Hubert Dulay and I dedicated an entire chapter to Jamie’s discovery (https://www.oreilly.com/library/view/streaming-databases/9781098154820/).
But inside the streaming community, Jamie’s blog post from 2021 and our book from 2024 barely made a ripple.
A Fundamental Design Decision
A fundamental decision in the initial architectural design of a stream processing engine is whether to put:
1. correctness first, then performance and scalability, or
2. performance and scalability first, with correctness as in “it depends on your use case”.
The database world chose option 1, and then spent decades optimizing query execution while maintaining the invariant that outputs are always correct.
Stream processing made the opposite choice. And now we wonder why the mainframes are still humming along happily.
Inconsistency in Action
What does this look like in practice? Take joins. Every time a new event arrives on one side of a join, Flink computes a new intermediate result. That result is emitted downstream — to Kafka topics, to state stores, to sink connectors, to target systems. Then another event arrives, and another intermediate result is emitted downstream. And another. And another.
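Here is that emit-on-every-event behavior in miniature (a hypothetical sketch, not Flink’s actual join implementation; the key and values are invented): a streaming hash join that re-emits matching results whenever either side receives a row, even when the new rows are just successive updates to the same running aggregate:

```python
# Toy streaming hash join: each arriving row immediately joins against the
# other side's accumulated state and emits the results downstream.
left_state: dict = {}
right_state: dict = {}
emitted = []

def on_left(key, value):
    left_state.setdefault(key, []).append(value)
    for right_value in right_state.get(key, []):
        emitted.append((key, value, right_value))

def on_right(key, value):
    right_state.setdefault(key, []).append(value)
    for left_value in left_state.get(key, []):
        emitted.append((key, left_value, value))

on_left("order-1", "details")
# The right side is a changelog of a running aggregate: three updates for
# what is logically one final value. Each update re-joins and re-emits.
for running_sum in [10, 25, 40]:
    on_right("order-1", running_sum)

print(len(emitted))  # 3 intermediate results for one logical answer
```

Only the last of the three emitted rows reflects the final aggregate; the other two are intermediate results that every downstream system still has to store, transport, and process.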
Flink’s own documentation is honest about the nature and volume of these intermediate results, and it does offer a number of mitigation techniques, such as MiniBatch Aggregation and the MultiJoin operator. But alas, as the URL reveals, this is about fine-tuning, not correctness (https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/table/tuning/).
Now imagine a Flink stream processing topology that includes multiple joins. Each join operator is a strictly independent unit. When you chain four joins together, you are not just adding intermediate results — you are multiplying them! In a recent blog post, Zalando (a German online retailer) reported that all these intermediate results led to a state store size of about 240GB for each of their applications. Luckily, they had a superstar Flink engineering team to fine-tune it — not many companies in this world can afford that (https://engineering.zalando.com/posts/2026/03/why-we-ditched-flink-table-api-joins-cutting-state.html).
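The compounding can be put into one line of arithmetic. This is an illustrative back-of-the-envelope model, not a measurement: assume each join stage emits k intermediate results for every final result, and those intermediates become the inputs of the next stage:

```python
# Illustrative only: k (intermediate results per final result at a single
# join) is an assumed number. The point is that amplification compounds,
# because one join's intermediate outputs are the next join's inputs.
def amplification(k: int, n_joins: int) -> int:
    """Rough downstream message amplification for a chain of n joins."""
    return k ** n_joins

for n in range(1, 5):
    print(f"{n} chained join(s): ~{amplification(10, n)}x messages downstream")
```

With an assumed k of 10, four chained joins already sit at roughly four orders of magnitude of amplification, which is why adding one more join to a topology can hurt far more than intuition suggests.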
In any case, if you use a classical stream processing tool such as Flink, you are bound to create an incredible amount of useless intermediate results. And because all those intermediate results need to be processed, you need more Flink parallelism. More task managers. More memory. More disk. More network. More everything.
And, this one is for the CTOs and CFOs: how much money are we throwing out of the window? Not only Flink, but also all the downstream systems have to handle 10x or more the number of messages they would have to handle if the underlying architecture had been “correctness first”. We pay 10x or more all the way down into the target systems. For compute. For memory. For network. And for highly skilled Flink experts who can fine-tune everything to somehow still make it work.
Not to mention the thousands of use cases where stream processing would make sense in principle but not in practice, either due to missing talent or simply because it would be too expensive in the first place.
DataBase Stream Processing
In 2022, a paper came out of VMware Research called “DBSP (DataBase Stream Processing): Automatic Incremental View Maintenance for Rich Query Languages” (Mihai Budiu et al.). It described a formal framework for incremental computation over streams that maintains the database invariant: outputs are always correct. DBSP was inspired by Frank McSherry’s (Materialize) work on Differential Dataflow (published as early as 2013!) (https://www.cidrdb.org/cidr2013/Papers/CIDR13_Paper111.pdf).
For DBSP, like Differential Dataflow and the database world, correctness came first, then performance and scalability. It took the opposite route compared to classical stream processing engines.
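To make the contrast with the earlier credits-and-debits example concrete, here is a toy Z-set sketch in the spirit of DBSP (purely illustrative; this is not the Feldera or PyDBSP API): state is a map from rows to integer weights, each transaction arrives as one atomic delta, and the maintained view is therefore correct after every emitted output:

```python
# Toy Z-set model: a Z-set maps rows to integer weights, and an input
# transaction is itself a Z-set of weight changes (a delta). Applying a
# whole transaction as one atomic delta means every emitted output
# corresponds to a valid database state.
from collections import Counter

def apply_delta(zset: Counter, delta: Counter) -> Counter:
    """Add two Z-sets, dropping rows whose weight cancels to zero."""
    out = Counter(zset)
    for row, weight in delta.items():
        out[row] += weight
        if out[row] == 0:
            del out[row]
    return out

def total(zset: Counter) -> int:
    """The maintained view: sum of all account movements."""
    return sum(amount * weight for (_, amount), weight in zset.items())

# Each transaction debits one account and credits another, as one delta.
transactions = [
    Counter({("alice", -100): 1, ("bob", +100): 1}),
    Counter({("bob", -40): 1, ("carol", +40): 1}),
]

movements = Counter()
observed_totals = []
for tx in transactions:
    movements = apply_delta(movements, tx)
    observed_totals.append(total(movements))

print(observed_totals)  # [0, 0]: the invariant holds at every output
```

Unlike the event-at-a-time version, no observer can ever see a state where money has appeared or vanished, because outputs are only produced at transaction granularity.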
Feldera implemented this in Rust. It’s correct, it’s fast. For their customers, it seems to be a huge money saver (https://www.feldera.com/blog/how-feldera-customers-slash-cloud-spend).
But still, as exciting as it sounds, it has so far barely made a ripple, at least in streaming circles.
Make Stream Processing Great Again
I’d really like to find out whether I am just hallucinating. Why are 99% of stream processing engineers so convinced of Flink and Kafka Streams that they do not even consider looking at the likes of Materialize or Feldera? Why hasn’t Jamie’s blog post from 2021 caused a storm? Why hasn’t Hubert’s and my O’Reilly book, especially chapter 6?
Maybe Materialize and Feldera are too opaque. Maybe, if we had a transparent and easy-to-use DBSP-based stream processing engine that everyone could just use, it would not only be my perspective on stream processing that turns upside down.
So I decided to build a DBSP-based Kafka Streams for Python. It is still a work in (very slow) progress and is called Kafi Streams.
Kafi Streams is based on PyDBSP (https://github.com/brurucy/pydbsp), a minimal pure-Python implementation of DBSP by Bruno Rucy. On top of it, Kafi Streams adds a Kafka Streams-inspired syntax for building stream processing topologies, garbage collection, and checkpointing (either on Kafka or on disk/S3/Azure Blob Storage). I’ll be presenting the first alpha at Berlin Buzzwords this June (https://2026.berlinbuzzwords.de/session/kafi-streams-complex-stream-processing-made-simple/).
With Kafi Streams, I’d like to prove that correct stream processing is possible, for everyone, and to show how many benefits it could give us. Imagine what can happen if you combine a correct stream processing engine in your Python code with all the incredible Python libraries for data science and analytics, and do your computation in real time rather than in batch. You can see some of this magic already in Bruno’s blog post, where he shows how to combine stream processing (on your GPU!) with Pandas (https://www.feldera.com/blog/gpu-stream-dbsp).
And maybe — if my proof works and I can help make stream processing great again — Confluent’s final and ultimate vision of stream processing starting to replace the mainframes can really become a little more tangible, and you can start thinking about a conversation with your COBOL team about whether the mainframe really needs to keep running forever.

