Event-driven architecture in practice | DNA Solutions

Key takeaways

Event-driven architecture buys you decoupling, scale, and an auditable replay log, which is exactly what a billing or tolling platform needs to reconstruct any charge on demand.
Most message brokers give you at-least-once delivery, not exactly-once. On a money flow, that means every consumer must be idempotent or you will double-bill.
The transactional outbox pattern is the reliable way to publish an event and commit a database change atomically. Dual-writes without it lose money silently.
Eventual consistency is the wrong default for the moment a payment is authorized or a balance is debited. Some operations need a synchronous request and response, and that is not a design failure.

Event-driven architecture is sold as the default for modern systems: decouple your services, scale them independently, replay history when you need it. On a content platform, a lost event is an annoyance. On a tolling or billing platform, every event is money, and the failure modes an EDA introduces (duplicate delivery, reordering, eventual consistency colliding with a balance that must be exact) become finance, audit, and trust problems. This article is about where event-driven architecture genuinely earns its place on transaction-critical systems, and where a synchronous call is simply the right answer.

What event-driven architecture actually buys you

Strip away the marketing and event-driven architecture gives you three things that matter on a billing or tolling platform.

Decoupling. The service that records a toll passage does not need to know about the services that generate invoices, push data to partners, or file regulatory reports. It emits an event, and each downstream consumer evolves on its own schedule. On a platform with a dozen integration points, that is the difference between a release that touches one service and one that touches all of them.

Independent scale. Toll events do not arrive at a constant rate. Morning peak produces bursts a synchronous pipeline would have to absorb in real time. With a message broker in the middle, the ingestion service writes at line rate and the billing consumers drain the backlog at their own pace.

Replay and audit. This one is undersold. If your event log is the source of truth, you can reconstruct any invoice from the raw events that produced it. When an auditor asks why a vehicle was charged a given amount on a given day, you replay the events. When a pricing bug is found, you reprocess the affected window rather than patching rows in finance by hand. Where revenue assurance is externally audited, that matters.
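As a rough illustration of replay, the sketch below recomputes one vehicle's charge for one day directly from an in-memory event log. The event shape (`PassageRecorded`), the gantry names, and the `TARIFFS` table are hypothetical and stand in for whatever your platform actually records.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class PassageRecorded:
    """Hypothetical raw event: one vehicle passing one gantry."""
    event_id: str
    vehicle: str
    day: str      # ISO date, e.g. "2024-03-01"
    gantry: str


# Hypothetical tariff tables keyed by pricing version.
TARIFFS = {
    "v1": {"G1": Decimal("2.50"), "G2": Decimal("4.00")},
    "v2": {"G1": Decimal("2.50"), "G2": Decimal("3.60")},  # corrected G2 price
}

EVENT_LOG = [
    PassageRecorded("e1", "ABC123", "2024-03-01", "G1"),
    PassageRecorded("e2", "ABC123", "2024-03-01", "G2"),
    PassageRecorded("e3", "XYZ999", "2024-03-01", "G2"),
    PassageRecorded("e4", "ABC123", "2024-03-02", "G1"),
]


def replay_invoice(events, vehicle, day, pricing_version):
    """Recompute a day's charge for one vehicle purely from the event log."""
    tariff = TARIFFS[pricing_version]
    lines = [
        (e.event_id, tariff[e.gantry])
        for e in events
        if e.vehicle == vehicle and e.day == day
    ]
    total = sum((amount for _, amount in lines), Decimal("0"))
    return lines, total


# Answer the auditor with the pricing that was live, then reprocess with the fix.
_, charged = replay_invoice(EVENT_LOG, "ABC123", "2024-03-01", "v1")
_, corrected = replay_invoice(EVENT_LOG, "ABC123", "2024-03-01", "v2")
print(charged, corrected)
```

The same function serves both cases from the paragraph above: explaining a past charge and reprocessing a window after a pricing fix. It only works if the events are kept immutable and the pricing version is recorded alongside them.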

Those benefits are real. The rest of this article is the bill.

At-least-once is the default, and it will double-bill you

The broker you are using almost certainly does not give you end-to-end exactly-once delivery, no matter what the homepage says. Kafka, RabbitMQ, SQS, and the rest are at-least-once in practice. A consumer reads a message, processes it, and crashes before acknowledging it (committing the offset, in Kafka terms); on restart it reads the same message again. The exactly-once features that exist are narrow: they apply within a single broker transaction, and they evaporate the moment your processing reaches an external system (a payment gateway, an ERP, a partner API).

On a content feed, a message processed twice means a duplicate log line. On a billing system it means a customer charged twice, a partner invoiced twice, or a balance debited twice: a finance and trust incident.

The only durable answer is to stop chasing exactly-once delivery and make every consumer idempotent: processing the same event twice produces the same result as once. In practice each consumer carries a deduplication key (the event id, or a business key like passage-id plus pricing-version) and records which keys it has applied, inside the same transaction that applies the effect. A second delivery sees the key and acknowledges without re-applying. On a money flow this is a precondition for production, not a later optimization.
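A minimal sketch of such a consumer, using SQLite from the Python standard library as a stand-in for the consumer's database. The table names (`applied_events`, `balances`) and the event fields are assumptions made for illustration; the point is only that the deduplication record and the debit share one transaction.

```python
import sqlite3


def open_db():
    """In-memory database standing in for the consumer's own store."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE applied_events (dedup_key TEXT PRIMARY KEY);
        CREATE TABLE balances (account TEXT PRIMARY KEY, cents INTEGER NOT NULL);
    """)
    return db


def apply_charge(db, event):
    """Apply a charge event at most once. Returns True if it was applied now."""
    # Business key: the same passage priced under the same version is one charge.
    dedup_key = f"{event['passage_id']}:{event['pricing_version']}"
    try:
        with db:  # one transaction: both statements commit or roll back together
            db.execute(
                "INSERT INTO applied_events (dedup_key) VALUES (?)", (dedup_key,)
            )
            db.execute(
                "UPDATE balances SET cents = cents - ? WHERE account = ?",
                (event["cents"], event["account"]),
            )
        return True
    except sqlite3.IntegrityError:
        # Primary-key conflict: this event was already applied.
        # Acknowledge the redelivery without touching the balance.
        return False


db = open_db()
db.execute("INSERT INTO balances VALUES ('fleet-1', 10000)")
db.commit()

event = {"passage_id": "p-42", "pricing_version": "v1",
         "account": "fleet-1", "cents": 250}
apply_charge(db, event)  # first delivery debits the account
apply_charge(db, event)  # redelivery is a no-op
```

The primary key does the deduplication. If the process dies mid-transaction, neither the key nor the debit is stored, so the redelivered event is applied cleanly.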

The transactional outbox: publish and commit atomically

The second trap is the dual-write. A service updates its database (the toll passage is now billed) and then publishes an event (downstream, generate the invoice). Two systems, two writes, no shared transaction, and a window where the process can die between them. Commit the database and crash before publishing: the passage is billed but no invoice is generated. Publish first and crash before committing: an invoice goes out for a passage your database does not consider billed. Both are silent, and both surface weeks later as a reconciliation gap.

The transactional outbox pattern removes the window. Instead of publishing directly to the broker, the service writes the event into an outbox table in the same database transaction as the business change. The two writes commit or roll back together, so there is no partial state. A separate relay then reads the outbox and publishes to the broker, marking each row as sent.

The relay can crash before marking a row and republish on restart, which is fine because your consumers are already idempotent. The outbox guarantees the event is never lost; idempotency guarantees a re-sent event does no harm. You need both: an outbox without idempotent consumers simply trades lost events for duplicated ones.
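The outbox and relay described above can be sketched as follows, again with SQLite standing in for the service database. The schema, the `PassageBilled` event shape, and the `publish` callable (which represents the broker client) are illustrative assumptions rather than a prescribed design.

```python
import json
import sqlite3


def open_db():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE passages (
            id TEXT PRIMARY KEY,
            billed INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE outbox (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            payload TEXT NOT NULL,
            sent INTEGER NOT NULL DEFAULT 0
        );
    """)
    return db


def mark_billed(db, passage_id):
    """Business change and outbox row commit in the same transaction."""
    event = {
        "type": "PassageBilled",
        "event_id": f"billed:{passage_id}",  # doubles as the consumer's dedup key
        "passage_id": passage_id,
    }
    with db:
        db.execute("UPDATE passages SET billed = 1 WHERE id = ?", (passage_id,))
        db.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def relay_once(db, publish):
    """Publish unsent rows in insertion order, then mark each one as sent.

    A crash between publish() and the UPDATE leaves the row unsent, so the
    next run publishes it again: at-least-once, never lost.
    """
    rows = db.execute(
        "SELECT seq, payload FROM outbox WHERE sent = 0 ORDER BY seq"
    ).fetchall()
    for seq, payload in rows:
        publish(json.loads(payload))
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE seq = ?", (seq,))
    return len(rows)


db = open_db()
db.execute("INSERT INTO passages (id) VALUES ('p-42')")
db.commit()

mark_billed(db, "p-42")
published = []                      # stand-in for the broker
relay_once(db, published.append)
```

In production the relay is usually a polling worker or a change-data-capture process, but the contract is the same: the service itself never writes to the broker directly.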

Ordering and eventual consistency versus a balance that must be exact

Event-driven systems reorder. Partition a topic for throughput and, unless you control how events are keyed, two events for one customer can land on different partitions and be processed out of order: a credit lands before the debit it offsets, a correction before the charge it corrects.

For many domains, eventual consistency is acceptable: the system converges and a few seconds of skew harm nobody. For the moment money changes hands, it frequently is not. If a fleet account has a hard credit limit, "the balance will be correct eventually" is not good enough at the instant you authorize the next charge.

This is the honest core of the trade-off. Event-driven patterns buy throughput and decoupling with weaker ordering and consistency guarantees, while money flows often need strong guarantees at a specific point. You reconcile the two by being deliberate about where that point sits:

Use ordering keys so all events for one account land on the same partition and are processed in order, even if the topic as a whole is unordered.
Keep the authoritative balance in one place with proper transactional semantics. The event stream feeds that place and tells other services about it; it is not where the balance is computed by racing consumers.
Make the consistency boundary explicit, so everyone knows which reads may be stale (a dashboard) and which must be current (an authorization).
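To make the ordering-key idea concrete, here is a small sketch of key-based routing. Real brokers do this for you when you set a message key; the hash choice and the `route` helper below are assumptions for illustration, not any broker's actual partitioner.

```python
import hashlib


def partition_for(ordering_key: str, num_partitions: int) -> int:
    """Stable mapping from an ordering key to a partition index.

    Uses SHA-256 rather than Python's built-in hash(), which is salted per
    process for strings and would route the same account differently after
    a restart.
    """
    digest = hashlib.sha256(ordering_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def route(events, num_partitions):
    """Append each event to the partition chosen by its account id."""
    partitions = [[] for _ in range(num_partitions)]
    for event in events:
        index = partition_for(event["account"], num_partitions)
        partitions[index].append(event)
    return partitions


events = [
    {"account": "fleet-1", "seq": 1, "kind": "debit"},
    {"account": "fleet-2", "seq": 1, "kind": "debit"},
    {"account": "fleet-1", "seq": 2, "kind": "credit"},
    {"account": "fleet-2", "seq": 2, "kind": "correction"},
]
partitions = route(events, num_partitions=4)
```

Every event for `fleet-1` ends up in one partition, in the order it was produced, so a single consumer of that partition sees the debit before the credit. Changing `num_partitions` remaps keys, so resizing a keyed topic needs care.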

When the right answer is a synchronous call

The strongest signal of a mature event-driven design is knowing where not to use one.

Some operations need an answer now, and it must be authoritative. "Can this card be charged for this amount?" and "Is this account within its credit limit?" are requests and responses. Modeling them as fire-and-forget events and waiting for a result event rebuilds synchronous request and response out of asynchronous parts, with more moving pieces and worse latency. That is a distributed call wearing a costume, not decoupling.

A clean rule of thumb: events for facts that have already happened, synchronous calls for decisions that must happen now. A toll passage occurred: a fact, publish it and let the world react. A payment must be authorized before goods are released: a decision with a required answer, make the call and handle the timeout.
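One way the credit-limit question might look as a synchronous, authoritative call is sketched below. The `accounts` table and its columns are hypothetical, SQLite stands in for the system of record, and timeout handling around the call is left out to keep the example short.

```python
import sqlite3


def open_ledger(limit_cents):
    """In-memory stand-in for the one place that owns the balance."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE accounts ("
        " id TEXT PRIMARY KEY,"
        " owed_cents INTEGER NOT NULL,"
        " limit_cents INTEGER NOT NULL)"
    )
    db.execute("INSERT INTO accounts VALUES ('fleet-1', 0, ?)", (limit_cents,))
    db.commit()
    return db


def authorize(db, account, cents):
    """Decide now: reserve the amount only if it fits under the credit limit.

    The check and the update are a single statement, so the decision never
    rests on a balance read that has gone stale in between.
    """
    with db:
        cursor = db.execute(
            "UPDATE accounts SET owed_cents = owed_cents + ? "
            "WHERE id = ? AND owed_cents + ? <= limit_cents",
            (cents, account, cents),
        )
    return cursor.rowcount == 1  # one row updated means approved


ledger = open_ledger(limit_cents=1000)
authorize(ledger, "fleet-1", 600)  # approved: 600 of 1000 used
authorize(ledger, "fleet-1", 600)  # declined: would exceed the limit
```

Once the decision is made, the approval or decline is itself a fact, and publishing it as an event (through the outbox) is entirely reasonable.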

The same rule explains why you invest in observability before scaling an EDA. A synchronous failure is a stack trace on one request. An asynchronous failure is an invoice that never arrived, three services from the event that caused it. You need distributed tracing that follows a correlation id across every hop, and dead-letter handling that captures events a consumer cannot process so a human can inspect and replay them. A dropped event is lost revenue; a poison message that blocks a partition is a billing outage.

How DNA Solutions approaches this

We build and operate event-driven billing and tolling platforms where the event log is the audit trail and every consumer is idempotent by construction, because we have seen what at-least-once delivery does to a money flow that assumed otherwise. The work is not adopting event-driven architecture or rejecting it; it is drawing the line between the parts that benefit from decoupling and replay and the parts that need a synchronous, authoritative answer. If you are designing a transaction-critical platform, or untangling one where eventual consistency has started colliding with finance, schedule a technical discussion to review your event flows and consistency boundaries.

Related services: System Integration, Billing & Charging

Industry: Toll & Road Infrastructure, Telecom & Media
