Real-time data pipelines when every event is money

Key takeaways

Streaming is not faster batch. When each event is revenue, the correctness bar is higher than batch ever required, and that is where the cost lives.
Exactly-once is a property of the whole pipeline, not a checkbox in your broker. At-least-once delivery plus idempotent processing keyed on a stable event ID is what actually holds in production.
You must be able to prove the numbers. A streaming billing pipeline without a queryable audit trail and replay capability will fail its first revenue assurance audit.
Most reporting use cases do not need streaming. If nobody acts on the data within seconds, you are paying operational cost for latency that buys nothing.

Real-time data pipelines get sold as "faster batch", and for analytics dashboards that framing is mostly harmless. In billing, tolling, and payments it is dangerous. When every event is a unit of revenue, the correctness bar is not the same bar at lower latency; it is a higher bar, because a dropped or double-counted event is money lost or money you cannot defend to an auditor. This article covers what changes when you build real-time data pipelines on streams where every event is money, and which patterns earn their complexity instead of just adding to it.

Streaming is not faster batch

The first mistake is treating streaming data integration as a batch job with a shorter window. Batch gives you a property that is easy to undervalue: a bounded dataset with a clear start and end. You read yesterday's toll events, compute, reconcile against a known total, and rerun if something is wrong. The dataset does not move while you look at it.

Streaming removes that boundary. The data is unbounded and arrives continuously, out of order, with no "the day is done" marker. Every guarantee batch gave you for free now has to be engineered: when is a window closed, what counts as a duplicate, what happens to an event four hours late, how do you prove the total when there is no total. For a dashboard nobody bills against, fine. For a telecom billing platform or a toll operator, every one of those questions maps to revenue. That is the real cost of streaming, and the reason to be honest about when it earns it.

When streaming actually earns its complexity

Streaming earns its complexity when someone or something acts on the event within seconds and the action has consequences. Fraud scoring before a transaction settles. Real-time tolling where the vehicle is already past the gantry. Usage-based billing where the customer expects a live balance. Here latency is part of the product, and batch cannot deliver it.

It does not earn its complexity for end-of-month invoicing, quarterly regulatory reports, hourly BI dashboards, or reconciliation runs that finance reviews the next morning. If nobody acts within seconds, you pay the full operational cost of a streaming system, plus a 24/7 on-call rotation, to buy latency nobody uses. The honest default is batch unless a concrete actor needs the event in near real time. The strongest architectures are hybrid: a streaming path for time-critical decisions, and a batch path that owns the authoritative billing run and reconciliation.

Exactly-once is a property of the whole pipeline

The phrase "exactly-once" sells streaming platforms and confuses engineers. No single setting makes a distributed pipeline deliver every event exactly once end to end. What the infrastructure gives you is at-least-once delivery; what you build on top is idempotent processing. The combination behaves like exactly-once where it counts: in the numbers you bill.

Concretely, every event needs a stable, business-meaningful idempotency key assigned at the source: an OBU event ID, a transaction ID, a usage record ID. Not a key generated downstream, which changes on replay. Processing has to be idempotent against that key, so reprocessing produces the same result and never a second charge. The sink has to cooperate: upsert keyed on the idempotency key, or a deduplication table, not a blind append that double-counts on retry.
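A minimal sketch of a cooperating sink, assuming a SQLite table stands in for the billing store. The table name `charges`, its columns, and the helper names are hypothetical and chosen only for illustration. The primary key on the source-assigned event ID is what turns a retried write into a no-op instead of a second charge.

```python
import sqlite3

def open_ledger(path=":memory:"):
    """Create the (hypothetical) charges table keyed on the source event ID."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS charges ("
        " event_id TEXT PRIMARY KEY,"
        " account TEXT NOT NULL,"
        " amount_cents INTEGER NOT NULL)"
    )
    return conn

def record_charge(conn, event_id, account, amount_cents):
    """Write the charge once. Returns True if stored, False if it was a duplicate."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO charges (event_id, account, amount_cents)"
        " VALUES (?, ?, ?)",
        (event_id, account, amount_cents),
    )
    conn.commit()
    # rowcount is 0 when the primary key already existed and the row was ignored
    return cur.rowcount == 1

def billed_total(conn, account):
    """Sum of all stored charges for one account, in cents."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount_cents), 0) FROM charges WHERE account = ?",
        (account,),
    ).fetchone()
    return row[0]
```

Calling `record_charge` twice with the same event ID leaves the billed total unchanged, which is the behavior a retry or replay needs. A blind `INSERT` without the key constraint would add the amount a second time.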

Frameworks help. Kafka transactions, Flink checkpointing, and idempotent sinks in Spark Structured Streaming give stronger guarantees inside their boundaries. But the guarantee stops there. The moment your pipeline crosses into an external ERP, a payment processor, or a partner channel, exactly-once is back in your hands and idempotency keyed on something stable is the only thing that saves you. Treat it as a property you verify, with tests that kill and restart consumers mid-stream, not as a feature you switched on.
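A kill-and-restart test can be sketched without any broker. The example below is a simulation under stated assumptions: a Python list stands in for the retained log, an integer for the committed offset, and `Crash`, `consume`, and `total_after_crash_and_restart` are hypothetical names. It compares a blind append with an apply step keyed on the event ID when the consumer dies after processing but before committing.

```python
class Crash(Exception):
    """Simulated consumer death between processing and offset commit."""

def consume(log, apply, committed, crash_after=None):
    """Process log[committed:] and return the new committed offset.

    The offset is only committed after the whole batch succeeds, so a crash
    mid-batch means the same events are delivered again (at-least-once).
    """
    for n, (event_id, amount_cents) in enumerate(log[committed:], start=1):
        apply(event_id, amount_cents)
        if crash_after is not None and n == crash_after:
            raise Crash()
    return len(log)

def total_after_crash_and_restart(log, apply, read_total):
    """Run once with a crash after two events, restart, and return the total."""
    committed = 0
    try:
        committed = consume(log, apply, committed, crash_after=2)
    except Crash:
        pass  # nothing was committed, so the restart begins at offset 0 again
    consume(log, apply, committed)
    return read_total()

LOG = [("e1", 100), ("e2", 250), ("e3", 75)]

# Blind append: the two replayed events are counted twice.
appended = []
naive_total = total_after_crash_and_restart(
    LOG, lambda eid, amt: appended.append(amt), lambda: sum(appended)
)

# Apply keyed on the event ID: the replay leaves the ledger unchanged.
ledger = {}
safe_total = total_after_crash_and_restart(
    LOG, lambda eid, amt: ledger.setdefault(eid, amt), lambda: sum(ledger.values())
)
```

The true total of the log is 425 cents. The keyed ledger arrives at it after the restart, while the blind append overstates it by the two replayed events. A real test would kill an actual consumer process, but the assertion it makes is the same.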

Ordering, late events, and watermarks

Events do not arrive in the order they happened. A vehicle's toll events reach the platform out of sequence because of device buffering, a partner relay that batches, or a gantry that replays its backlog after an outage. Billing logic that assumes arrival order equals event order will misprice.

The discipline is to separate event time (when it happened, stamped at the source) from processing time (when you saw it), and drive all billing logic off event time. Stream processing engines implement this with watermarks: a moving assertion that you have probably seen all events up to a given event time, which lets you close a window. The watermark is a bet on how late events can be. Too tight and you close windows before stragglers arrive, dropping revenue; too loose and you hold state for hours, paying in memory and latency.
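To make the trade-off concrete, here is a toy event-time aggregator, assuming tumbling windows and a watermark that trails the highest event time seen by a fixed allowance. It is a sketch and not how any particular engine implements watermarks. The class name `TumblingWindowSum` and its parameters are hypothetical.

```python
from collections import defaultdict

class TumblingWindowSum:
    """Sums amounts per event-time window and closes windows via a watermark."""

    def __init__(self, window_s, max_lateness_s):
        self.window_s = window_s
        self.max_lateness_s = max_lateness_s  # the bet on how late events can be
        self.max_event_time = None
        self.open = defaultdict(int)          # window start -> running total
        self.late = []                        # events that missed their window

    def watermark(self):
        """Event time up to which we assume all events have been seen."""
        return self.max_event_time - self.max_lateness_s

    def add(self, event_time, amount):
        """Ingest one event; return the windows closed by it as (start, total)."""
        if self.max_event_time is None or event_time > self.max_event_time:
            self.max_event_time = event_time
        wm = self.watermark()
        start = event_time - event_time % self.window_s
        if start + self.window_s <= wm:
            # The window already closed: keep the event for a correction path.
            self.late.append((event_time, amount))
            return []
        self.open[start] += amount
        closed = [(s, t) for s, t in sorted(self.open.items())
                  if s + self.window_s <= wm]
        for s, _ in closed:
            del self.open[s]
        return closed
```

With `window_s=60` and `max_lateness_s=30`, an event stamped 50 that arrives after an event stamped 70 still lands in the first window, because the watermark is only at 40. Shrinking `max_lateness_s` closes windows sooner and pushes more events onto the late list; growing it keeps more windows open in `self.open` for longer.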

Late events are routine in tolling and telecom, so you need an explicit policy: a grace period for known-typical lateness, and a defined path for events arriving after the window closed. Do not drop them silently. Route them to a correction stream that adjusts the emitted result and writes an auditable delta, so the late event becomes a documented adjustment, not a number you cannot explain.
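One possible shape for that correction path, sketched under assumptions: emitted window totals live in a dict, already-applied event IDs in a set, and the `Correction` record and `correct_late_event` function are hypothetical names. The point is that the late event produces an explicit delta with before and after values, not a silent overwrite.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Correction:
    """Auditable adjustment to a window total that was already emitted."""
    window_start: int
    event_id: str
    delta_cents: int
    previous_total_cents: int
    corrected_total_cents: int

def correct_late_event(emitted_totals, applied_ids, event_id, event_time,
                       amount_cents, window_s=60):
    """Apply a late event to its closed window and return the delta record.

    Returns None when the event ID was already applied, so a replayed late
    event cannot adjust the total twice.
    """
    if event_id in applied_ids:
        return None
    start = event_time - event_time % window_s
    previous = emitted_totals.get(start, 0)
    corrected = previous + amount_cents
    emitted_totals[start] = corrected
    applied_ids.add(event_id)
    return Correction(start, event_id, amount_cents, previous, corrected)
```

Each returned `Correction` would be written to the correction stream and the audit store, so the difference between the first emitted total and the final one can be explained record by record.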

Reconciliation and the audit trail

This is what separates a billing pipeline from an analytics pipeline: you must prove the numbers. A European toll operator's revenue is audited, and the auditor does not accept "the stream processor computed it". They want the chain of evidence from raw event to invoiced amount, reconstructable for any record.

That requirement shapes the architecture. The raw event stream has to be retained, not consumed and discarded, because it is the source of truth you reconcile against. Every transformation between raw event and billed line has to be reproducible: reprocess a time range and get the same numbers, or a documented difference. Reconciliation becomes a continuous control, not a year-end scramble: events ingested versus billed, usage in versus invoiced, with any gap raised as an alert rather than found in an audit. A streaming billing pipeline without a queryable audit trail and replay will fail its first revenue assurance audit. Build the audit trail as a first-class output, not a logging afterthought you hope is complete.
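As an illustration of reconciliation as a continuous control, the sketch below compares the retained raw events with the billed lines for one time range. The input shapes (a dict of raw amounts by event ID, a list of billed `(event_id, amount_cents)` pairs) and the function name `reconcile` are assumptions made for the example.

```python
from collections import Counter

def reconcile(raw_events, billed_lines):
    """Compare raw events with billed lines and report every kind of gap.

    raw_events:   dict of event_id -> amount_cents from the retained stream
    billed_lines: list of (event_id, amount_cents) as they were billed
    """
    billed_counts = Counter(event_id for event_id, _ in billed_lines)
    billed_amounts = dict(billed_lines)
    raw_ids, billed_ids = set(raw_events), set(billed_counts)
    return {
        # ingested but never billed: revenue leakage
        "missing": sorted(raw_ids - billed_ids),
        # billed without a raw event to back it: indefensible in an audit
        "unknown": sorted(billed_ids - raw_ids),
        # billed more than once: double charge
        "duplicated": sorted(e for e, n in billed_counts.items() if n > 1),
        # billed at a different amount than the raw event carries
        "mismatched": sorted(
            e for e in raw_ids & billed_ids if raw_events[e] != billed_amounts[e]
        ),
        # net difference in cents, billed minus raw
        "delta_cents": sum(a for _, a in billed_lines) - sum(raw_events.values()),
    }
```

Run on a schedule per time range, any non-empty list or non-zero `delta_cents` is the alert. Note that the net delta alone is not enough: one missing event and one duplicate of similar value can cancel out, which is why the per-event lists matter.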

Backpressure, failure recovery, and observability

Streams have peaks: a toll network at rush hour, a telecom platform when meters roll at the top of the hour, a payment system on a sale day. When ingestion outruns processing, an unmanaged pipeline either drops events (revenue gone) or runs out of memory and falls over. Backpressure lets the slow part tell the fast part to wait, and a durable broker like Kafka absorbing the buffer makes that survivable: the events queue, processing catches up, nothing is lost. Recovery follows from the same retained stream: a dead consumer restarts from its committed offset and reprocesses, and because processing is idempotent, that is safe rather than double-billing. Durable retention plus idempotency turns a crash into a non-event.
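The backpressure mechanism itself can be shown with a bounded buffer. This is a single-process sketch using Python's standard `queue` and `threading` modules: the in-memory queue is only a stand-in for the buffer and, unlike a durable broker, would not survive a crash. The function name `run_pipeline` and its parameters are hypothetical.

```python
import queue
import threading
import time

_DONE = object()  # sentinel marking the end of the stream

def run_pipeline(events, buffer_size=4, work_s=0.001):
    """Feed events through a bounded buffer to a slower consumer.

    Returns the processed events and the deepest buffer level observed.
    """
    buf = queue.Queue(maxsize=buffer_size)

    def producer():
        for event in events:
            buf.put(event)  # blocks while the buffer is full: backpressure
        buf.put(_DONE)

    thread = threading.Thread(target=producer)
    thread.start()

    processed, max_depth = [], 0
    while True:
        max_depth = max(max_depth, buf.qsize())
        item = buf.get()
        if item is _DONE:
            break
        time.sleep(work_s)  # simulate processing slower than ingestion
        processed.append(item)
    thread.join()
    return processed, max_depth
```

The producer never gets more than `buffer_size` items ahead of the consumer, and every event is still processed in order. Replacing the blocking `put` with an unbounded list would model the out-of-memory failure, and replacing it with a drop on full would model the lost revenue.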

Observability is not optional here because the failure mode is silent. A pipeline can be "up", every service green, while consumer lag grows or a deduplication table quietly drops writes, and revenue leaks the whole time. The metrics that matter are revenue metrics in disguise: consumer lag per partition, end-to-end latency, events ingested versus processed versus billed, and reconciliation deltas. Alert on the gaps between those counts, because a gap is money you have not accounted for yet.
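A sketch of alerting on the gaps between those counts, assuming the counters and offsets have already been collected for one interval. The function names, alert labels, and the thresholds `max_gap` and `max_lag` are hypothetical values for illustration, not recommendations.

```python
def partition_lag(end_offsets, committed_offsets):
    """Consumer lag per partition: latest offset minus committed offset."""
    return {p: end_offsets[p] - committed_offsets.get(p, 0) for p in end_offsets}

def pipeline_alerts(ingested, processed, billed, lag_by_partition,
                    max_gap=0, max_lag=1000):
    """Return alerts as (kind, subject, value) tuples for every gap found."""
    alerts = []
    if ingested - processed > max_gap:
        # events accepted by the broker that the processor has not handled
        alerts.append(("unprocessed", "events", ingested - processed))
    if processed - billed > max_gap:
        # events handled by the processor that never reached a billed line
        alerts.append(("unbilled", "events", processed - billed))
    for partition, lag in sorted(lag_by_partition.items()):
        if lag > max_lag:
            alerts.append(("lag", partition, lag))
    return alerts
```

For example, a pipeline with every service green but 1,000 events processed and only 990 billed produces an `unbilled` alert of 10. No liveness check would show that.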

Where DNA Solutions fits

DNA Solutions builds real-time data pipelines for billing and tolling platforms where every event is revenue, on Kafka, Flink, and Spark Structured Streaming, with the idempotency, reconciliation, and replay patterns that survive a revenue assurance audit. We are equally willing to tell you when streaming is the wrong tool and batch would serve at a fraction of the cost. If you are designing or repairing a pipeline where dropped or double-counted events have a euro value, schedule a technical discussion to review your architecture.

Related services: Data & Analytics, Billing & Charging

Industry: Toll & Road Infrastructure, Telecom & Media
