
In the previous lesson we saw what an event is and who produces or consumes it. Now we go one level down: how does that message physically travel from one service to another reliably when the two do not know each other and may be starting up, going down, or overloaded at different times? The answer is asynchronous messaging through message brokers. A broker is an intermediate piece of infrastructure that receives messages from producers, stores them durably, and delivers them to consumers when the latter are ready.

Understanding asynchronous messaging matters because the reliability and consistency of an event-driven system depend heavily on it. In this lesson we will study the two basic primitives (queues and topics), delivery guarantees, the problem of duplicates and how to solve it with idempotency, and we will compare three widely used technologies: RabbitMQ, Apache Kafka, and Amazon SQS.

Why asynchronous messaging?
Queues (point-to-point) vs Topics (publish/subscribe)
Delivery guarantees: at-most-once, at-least-once, exactly-once
The problem of duplicates and idempotency
Comparison: RabbitMQ vs Kafka vs Amazon SQS
Practical example: producer and idempotent consumer
Common mistakes and tips
Exercises and solutions
Conclusion

Why asynchronous messaging?

In a synchronous call, if service B is down, A's call fails. With asynchronous messaging, A deposits the message in the broker and keeps working; when B recovers, it will consume it. This provides:

Temporal decoupling: producer and consumer do not need to be alive at the same time.
Peak buffering: if 10,000 messages arrive at once, the broker holds them and the consumer processes them at its own pace.
Resilience: a slow or down consumer does not bring down the producer.
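
The three properties above can be illustrated with a minimal in-memory sketch. It is not a real broker: InMemoryBroker, publish, poll, and pending are hypothetical names invented for this illustration, and nothing here is durable. The producer publishes a burst while no consumer is running, and the consumer drains the backlog later.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class DecouplingDemo {

    // Hypothetical in-memory stand-in for a broker: it only buffers messages.
    static class InMemoryBroker {
        private final Queue<String> buffer = new ArrayDeque<>();

        // The producer returns immediately; no consumer needs to be running.
        void publish(String message) {
            buffer.add(message);
        }

        // The consumer pulls at its own pace; returns null when nothing is pending.
        String poll() {
            return buffer.poll();
        }

        int pending() {
            return buffer.size();
        }
    }

    // Publishes a burst while the consumer is "down", then drains it afterwards.
    static List<String> produceThenConsume(InMemoryBroker broker, int burstSize) {
        for (int i = 1; i <= burstSize; i++) {
            broker.publish("order-" + i);   // the consumer is offline at this point
        }
        List<String> processed = new ArrayList<>();
        String message;
        while ((message = broker.poll()) != null) {
            processed.add(message);         // the consumer catches up later
        }
        return processed;
    }

    public static void main(String[] args) {
        InMemoryBroker broker = new InMemoryBroker();
        List<String> processed = produceThenConsume(broker, 10_000);
        System.out.println("processed=" + processed.size() + ", pending=" + broker.pending());
    }
}
```

The producer never waits for the consumer, which is the temporal decoupling described above; a real broker adds durable storage and network delivery on top of this idea.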

Queues (point-to-point) vs Topics (publish/subscribe)

There are two message distribution models.

Queue (point-to-point)

A message in a queue is processed by a single consumer. If there are several consumers connected to the same queue, the broker distributes (balances) the messages among them, but each message goes to only one. It is used to distribute work (work queue).

Topic (publish/subscribe)

A message published to a topic is delivered to all subscribers. It is used to broadcast events to multiple interested parties.

flowchart TB
    subgraph Queue["QUEUE (point-to-point)"]
        P1[Producer] --> Q[(Queue)]
        Q --> CA[Consumer A]
        Q --> CB[Consumer B]
        Q -. each message to ONLY one .-> CA
    end
    subgraph Topic["TOPIC (pub/sub)"]
        P2[Producer] --> T{{Topic}}
        T --> SA[Subscriber A]
        T --> SB[Subscriber B]
        T -. each message to ALL .-> SA
    end

Aspect	Queue (point-to-point)	Topic (pub/sub)
Receivers per message	One	All subscribers
Goal	Distribute the workload	Broadcast events
Typical pattern	Commands / tasks	Domain events
Example	Email-sending queue	"OrderCreated" to inventory, billing, shipping
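
The difference between the two models can be sketched in a few lines of plain Java. This is a simplified illustration, not how any particular broker is implemented: deliverAsQueue and deliverAsTopic are hypothetical helpers, and round-robin is only one possible balancing strategy.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DistributionDemo {

    // Queue semantics: each message goes to exactly one consumer (round-robin here).
    static Map<String, List<String>> deliverAsQueue(List<String> messages, List<String> consumers) {
        Map<String, List<String>> inbox = emptyInboxes(consumers);
        for (int i = 0; i < messages.size(); i++) {
            String target = consumers.get(i % consumers.size());
            inbox.get(target).add(messages.get(i));
        }
        return inbox;
    }

    // Topic semantics: every subscriber receives its own copy of each message.
    static Map<String, List<String>> deliverAsTopic(List<String> messages, List<String> subscribers) {
        Map<String, List<String>> inbox = emptyInboxes(subscribers);
        for (String message : messages) {
            for (String subscriber : subscribers) {
                inbox.get(subscriber).add(message);
            }
        }
        return inbox;
    }

    private static Map<String, List<String>> emptyInboxes(List<String> names) {
        Map<String, List<String>> inbox = new LinkedHashMap<>();
        for (String name : names) {
            inbox.put(name, new ArrayList<>());
        }
        return inbox;
    }

    public static void main(String[] args) {
        List<String> messages = List.of("m1", "m2", "m3", "m4");
        List<String> receivers = List.of("A", "B");
        System.out.println("queue: " + deliverAsQueue(messages, receivers));
        System.out.println("topic: " + deliverAsTopic(messages, receivers));
    }
}
```

With four messages and two receivers, the queue splits the work (two messages each), while the topic gives both receivers all four.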

Delivery guarantees: at-most-once, at-least-once, exactly-once

When a message travels over the network, acknowledgments (acks) can be lost, processes can crash, etc. That is why different levels of guarantee exist:

Guarantee	Meaning	Risk	Cost
At-most-once	Delivered 0 or 1 times	A message may be lost	Minimal (no retries)
At-least-once	Delivered 1 or more times	May be duplicated	Medium (requires retries and acks)
Exactly-once	Delivered exactly 1 time	None (ideal)	High (complex, not always achievable end to end)

Key points:

At-most-once: the consumer acknowledges before processing. If it fails, the message is lost. Only valid if losing some data is acceptable (e.g., non-critical metrics).
At-least-once: the consumer acknowledges after processing successfully. If it fails just before the ack, the message is redelivered → there may be duplicates. It is the most common and recommended level.
Exactly-once: semantically perfect, but costly. In distributed systems it is very hard to guarantee end to end; the effect is usually achieved by combining at-least-once with idempotency in the consumer (covered in the next section). Kafka offers "exactly-once" within its own boundaries (transactions), but as soon as the effect leaves Kafka (writing to another DB, calling an API), it again depends on idempotency.

Real-world golden rule: design for at-least-once and make your consumers idempotent. In practice, that gives you the effect of exactly-once.
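
To make the timing of the acknowledgment concrete, here is a small simulation under simplifying assumptions: a single consumer, an in-memory queue, and one simulated crash on the first attempt at a chosen message. The names AckTiming and run are hypothetical and exist only for this sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class DeliveryGuaranteeDemo {

    enum AckTiming { BEFORE_PROCESSING, AFTER_PROCESSING }

    // Delivers every message, simulating one consumer crash on the first
    // attempt at crashOn. Returns the side effects that were actually applied.
    static List<String> run(List<String> messages, AckTiming timing, String crashOn) {
        Deque<String> queue = new ArrayDeque<>(messages);
        List<String> effects = new ArrayList<>();
        boolean crashed = false;

        while (!queue.isEmpty()) {
            String message = queue.peek();               // delivered, not yet acknowledged
            boolean crashNow = !crashed && message.equals(crashOn);

            if (timing == AckTiming.BEFORE_PROCESSING) {
                queue.poll();                            // ack first: the broker forgets it
                if (crashNow) { crashed = true; continue; }  // crash before the work: lost
                effects.add(message);
            } else {
                effects.add(message);                    // do the work first
                if (crashNow) { crashed = true; continue; }  // crash before the ack: redelivered
                queue.poll();                            // ack last
            }
        }
        return effects;
    }

    public static void main(String[] args) {
        List<String> messages = List.of("m1", "m2", "m3");
        System.out.println("ack before: " + run(messages, AckTiming.BEFORE_PROCESSING, "m2"));
        System.out.println("ack after:  " + run(messages, AckTiming.AFTER_PROCESSING, "m2"));
    }
}
```

Acknowledging first loses m2 (at-most-once); acknowledging last applies m2 twice (at-least-once), which is exactly the duplicate that an idempotent consumer has to absorb.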

The problem of duplicates and idempotency

An operation is idempotent if executing it several times produces the same result as executing it once. If our consumers are idempotent, at-least-once duplicates stop being a problem.

Common techniques to achieve idempotency:

Idempotency key / message ID: record the IDs already processed and discard repeats.
Naturally idempotent operations: UPDATE balance SET value = 100 (absolute assignment) is idempotent; UPDATE balance SET value = value + 100 (increment) is not.
UPSERT with a unique key: insert or ignore if it already exists.

-- Table that records which messages we have already processed.
CREATE TABLE processed_messages (
    message_id   VARCHAR(64) PRIMARY KEY,  -- idempotency key
    processed_at TIMESTAMP NOT NULL
);

-- Upon receiving a message, we try to insert its ID.
-- If it already exists (duplicate primary key), we know it is a duplicate.
INSERT INTO processed_messages (message_id, processed_at)
VALUES ('msg-abc-123', CURRENT_TIMESTAMP)
ON CONFLICT (message_id) DO NOTHING;  -- PostgreSQL: ignore if it already exists

Explanation:

message_id is the primary key: the database guarantees there are no two identical ones.
ON CONFLICT ... DO NOTHING makes the second attempt to insert the same ID produce neither an error nor an effect. This way we detect the duplicate without complex additional logic.
If the INSERT affected 0 rows, it was a duplicate and we can skip processing.
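
The same check can be sketched in application code. The example below is an in-memory analogue of the processed_messages table, assuming a hypothetical handleDeposit operation that applies a non-idempotent increment; a HashSet is not durable, so a real consumer would rely on the database insert shown above.

```java
import java.util.HashSet;
import java.util.Set;

class IdempotentHandlerDemo {

    // In-memory stand-in for the processed_messages table (hypothetical, not durable).
    private final Set<String> processedIds = new HashSet<>();
    private long balance = 0;

    // Applies a non-idempotent increment at most once per message ID.
    // Returns true if the message was applied, false if it was a duplicate.
    synchronized boolean handleDeposit(String messageId, long amount) {
        // Set.add returns false when the ID is already present, much like the
        // INSERT ... ON CONFLICT DO NOTHING that affects 0 rows.
        if (!processedIds.add(messageId)) {
            return false;
        }
        balance += amount;
        return true;
    }

    synchronized long balance() {
        return balance;
    }

    public static void main(String[] args) {
        IdempotentHandlerDemo handler = new IdempotentHandlerDemo();
        boolean first = handler.handleDeposit("msg-abc-123", 100);
        boolean second = handler.handleDeposit("msg-abc-123", 100); // redelivery
        System.out.println("first=" + first + ", second=" + second
                + ", balance=" + handler.balance());
    }
}
```

Recording the ID and applying the effect happen inside one synchronized method here; with a database, the equivalent is to do both in the same transaction so that neither can happen without the other.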

Comparison: RabbitMQ vs Kafka vs Amazon SQS

Characteristic	RabbitMQ	Apache Kafka	Amazon SQS
Main model	Queue broker (AMQP)	Distributed event log	Managed queue (cloud)
Paradigm	Queues + exchanges (pub/sub)	Partitioned topics + offsets	Queues (Standard and FIFO)
Message retention	Until consumed and acknowledged (classic queues)	Configurable (days/weeks), replayable	Up to 14 days
Replay	Not in classic queues (RabbitMQ streams allow it)	Yes (re-read from an offset)	No
Ordering	Per queue	Per partition	FIFO only in FIFO queues
Throughput	High	Very high (millions/s)	High (auto-scaling)
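Typical guarantee	At-least-once	At-least-once / exactly-once*	At-least-once (Standard) / exactly-once processing (FIFO, within the 5-minute deduplication interval)
Management	Self-managed / cloud	Self-managed / managed	Fully managed (AWS)
Ideal case	Complex routing, RPC, tasks	Streaming, event sourcing, big data	Simple decoupling in AWS without operating infra

* Only within Kafka's own boundaries (transactions), as explained in the delivery guarantees section.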

Practical summary:

RabbitMQ: excellent when you need flexible routing (exchanges with rules) and traditional work-queue patterns.
Kafka: the choice for high volume, retention, and replay of events; the foundation of event sourcing and streaming (lesson 05-05).
SQS: the simplest option if you are already on AWS and just want to decouple without managing servers.

Practical example: producer and idempotent consumer

Let's look at a Kafka consumer in Spring that applies at-least-once + idempotency.

import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.support.Acknowledgment;
import org.springframework.stereotype.Component;

@Component
public class PaymentConsumer {

    private final IdempotencyRepository idempotency;
    private final AccountingService accounting;

    public PaymentConsumer(IdempotencyRepository idempotency,
                           AccountingService accounting) {
        this.idempotency = idempotency;
        this.accounting = accounting;
    }

    @KafkaListener(topics = "payments.confirmed", groupId = "accounting")
    public void consume(PaymentConfirmedEvent event, Acknowledgment ack) {
        // 1. Have we already processed this message? -> idempotency
        // NOTE: registerIfNew and postEntry should commit in the same database
        // transaction; otherwise a crash between them would record the ID
        // without posting the entry, and the redelivery would be skipped.
        if (!idempotency.registerIfNew(event.paymentId())) {
            ack.acknowledge(); // duplicate: we acknowledge and exit
            return;
        }
        // 2. Actual business logic
        accounting.postEntry(event);
        // 3. We acknowledge ONLY after processing successfully (at-least-once)
        ack.acknowledge();
    }
}

Detailed explanation:

@KafkaListener subscribes the method to the topic payments.confirmed. The groupId "accounting" identifies this consumer group; Kafka distributes the partitions among the group members.
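registerIfNew(...) attempts to insert the ID (as in the SQL above). It returns false if it already existed → it is a duplicate, we acknowledge and exit without reprocessing. For this check to be safe, the insert and the accounting entry should commit in the same database transaction; otherwise a crash between the two would leave the ID recorded without the entry posted, and the redelivered message would be discarded as a duplicate.
The acknowledgment (ack.acknowledge()) is done after postEntry. If the process dies before the ack, the offset is not committed, so the consumer reads the message again after a restart or rebalance (at-least-once), but idempotency prevents the double entry.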

# Spring Kafka configuration for manual acknowledgment (key to at-least-once)
spring:
  kafka:
    consumer:
      group-id: accounting
      enable-auto-commit: false      # do NOT commit offsets automatically
      auto-offset-reset: earliest    # read from the beginning if there is no offset
    listener:
      ack-mode: manual               # we acknowledge ourselves with ack.acknowledge()

enable-auto-commit: false is essential: with auto-commit enabled, the Kafka client commits offsets periodically in the background, so an offset could be committed before we finished processing its message, and we would lose that message upon a failure.
ack-mode: manual delegates to our code the exact moment of acknowledgment.
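
The interaction between committing offsets late and checking for duplicates can be simulated without Kafka. The sketch below is a simplified model, not the Kafka client API: a list stands in for one partition, committedOffset for the group's committed position, and runOnce, crashBeforeCommitAt, and the other names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class OffsetCommitDemo {

    private final List<String> log;                          // stand-in for one partition
    private int committedOffset = 0;                         // where a restarted consumer resumes
    private final Set<String> seen = new HashSet<>();        // idempotency store
    final List<String> entries = new ArrayList<>();          // business effect (posted entries)
    int redeliveries = 0;

    OffsetCommitDemo(List<String> log) {
        this.log = log;
    }

    // Processes from the committed offset. If crashBeforeCommitAt equals the
    // current offset, the run stops after the business logic and before the
    // commit. Returns true if the run reached the end of the log.
    boolean runOnce(int crashBeforeCommitAt) {
        for (int offset = committedOffset; offset < log.size(); offset++) {
            String paymentId = log.get(offset);
            if (seen.add(paymentId)) {
                entries.add(paymentId);                      // posted once per payment ID
            } else {
                redeliveries++;                              // duplicate detected and skipped
            }
            if (offset == crashBeforeCommitAt) {
                return false;                                // simulated crash: no commit
            }
            committedOffset = offset + 1;                    // the "ack": commit after processing
        }
        return true;
    }

    public static void main(String[] args) {
        OffsetCommitDemo demo = new OffsetCommitDemo(List.of("pay-1", "pay-2", "pay-3"));
        demo.runOnce(1);     // crashes after processing pay-2, before committing it
        demo.runOnce(-1);    // restart: pay-2 is read again, then pay-3
        System.out.println("entries=" + demo.entries + ", redeliveries=" + demo.redeliveries);
    }
}
```

After the restart, pay-2 is delivered a second time because its offset was never committed, yet only one entry per payment is posted.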

Common mistakes and tips

Acknowledging before processing. It turns your at-least-once into an accidental at-most-once and you lose messages upon failures. Always acknowledge at the end.
Believing that "exactly-once" eliminates the need for idempotency. As soon as the effect crosses the broker's boundary (another DB, an external API), you need idempotency all the same.
Not planning a dead-letter queue (DLQ). Messages that fail repeatedly must be moved to a dead-letter queue after a bounded number of retries so they do not block the main queue in an infinite retry loop.
Using a topic when you wanted a queue (or vice versa). Broadcasting a command to "everyone" can execute the same action N times.
Tip: always define a unique message_id field in your events from day one; adding it later is painful.
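
The dead-letter idea from the list above can be sketched as a bounded retry loop. This is an illustration under assumptions, not a broker feature: drain, maxAttempts, and the in-memory lists are hypothetical, and messages are assumed to carry unique IDs so that attempts can be counted per message.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class DeadLetterDemo {

    final List<String> processed = new ArrayList<>();
    final List<String> deadLetters = new ArrayList<>();

    // Drains the queue. The handler returns true on success; a message that
    // keeps failing is retried up to maxAttempts times and then parked in
    // the dead-letter list instead of being retried forever.
    void drain(List<String> messages, Predicate<String> handler, int maxAttempts) {
        Deque<String> queue = new ArrayDeque<>(messages);
        Map<String, Integer> attempts = new HashMap<>();
        while (!queue.isEmpty()) {
            String message = queue.poll();
            int attempt = attempts.merge(message, 1, Integer::sum);
            if (handler.test(message)) {
                processed.add(message);
            } else if (attempt >= maxAttempts) {
                deadLetters.add(message);      // give up: keep it for inspection
            } else {
                queue.addLast(message);        // retry later, behind the other messages
            }
        }
    }

    public static void main(String[] args) {
        DeadLetterDemo demo = new DeadLetterDemo();
        demo.drain(List.of("ok-1", "poison", "ok-2"), m -> !m.equals("poison"), 3);
        System.out.println("processed=" + demo.processed + ", deadLetters=" + demo.deadLetters);
    }
}
```

The failing message is attempted three times and then set aside, so the healthy messages behind it are still processed.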

Exercises

A team processes payments with a consumer that acknowledges the message right upon receiving it and then contacts the banking gateway. If the process dies between the ack and the banking call, what real guarantee do they have and what problem occurs? How would you fix it?
Indicate for each case whether you would use a queue or a topic: (a) send the welcome email exactly once; (b) notify inventory, billing, and CRM of a new order; (c) distribute 1,000 PDF-generation tasks among 5 workers.
Write an idempotent SQL statement to "mark an order as paid" that can be executed several times without side effects.

Solutions

They have at-most-once: if it dies after the ack, the message is not redelivered and the payment never reaches the gateway (it is lost). Fix: acknowledge after the banking call (at-least-once) and make the operation idempotent using the payment's idempotency key so as not to charge twice upon a retry.
(a) Queue (a single receiver, a single time). (b) Topic (three subscribers receive the event). (c) Queue (work distribution among workers).
For example:

-- Absolute assignment of the state: idempotent.
UPDATE orders
SET status = 'PAID', paid_at = COALESCE(paid_at, CURRENT_TIMESTAMP)
WHERE order_id = :orderId AND status <> 'PAID';

Running it again changes nothing because the condition status <> 'PAID' is no longer met and paid_at is preserved with COALESCE.

Conclusion

Asynchronous messaging is the circulatory system of event-driven architectures. We distinguished queues (one receiver, load distribution) from topics (all subscribers, broadcast). We understood the three delivery guarantees and why at-least-once + idempotency is the pragmatic combination the industry uses. Finally, we compared RabbitMQ, Kafka, and SQS to know when to choose each one.

In the next lesson, "Event Patterns: Event Sourcing and CQRS", we will see how, instead of storing only the current state, we can store the complete sequence of events as the source of truth, and how to separate the read and write models to scale.

Asynchronous Messaging: Queues and Brokers
