Streams

A FlowMQ Stream is a persistent, immutable, append-only log of messages. It is designed for high-throughput event streaming and provides a durable, replayable history of data. Unlike a queue, where a message is typically consumed once and then deleted, messages in a stream are kept for a configured period, allowing multiple independent consumers to read and re-read the data from any point in time.

This model is ideal for use cases like event sourcing, real-time data pipelines, audit logging, and broadcasting messages to multiple services.

The Log-Based Model

At its core, a stream is simply a log file to which messages are only ever appended. This simple but powerful design has several key properties (a minimal sketch follows the list below):

  • Append-Only: New messages are always written to the end of the log.
  • Immutable: Once written, a message at a specific position can never be altered or deleted individually.
  • Offsets: Each message in the stream is assigned a unique, sequential number called an offset. Offsets start at 0 and increment for each new message. They serve as the "address" or cursor for a message within the stream.
  • Retention: Messages are not removed when they are read. Instead, they are retained based on a policy (e.g., for 7 days, or until the stream reaches 10 GB in size). This allows for historical replay of messages.
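
To make these properties concrete, here is a purely illustrative Python sketch of an append-only log. It models offsets and replayability in memory; the `AppendOnlyLog` class and its methods are invented for this example and are not FlowMQ's actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class AppendOnlyLog:
    """Toy model of a stream: an append-only list indexed by offset."""
    messages: list = field(default_factory=list)

    def append(self, payload: bytes) -> int:
        """Append a message and return its offset (its position in the log)."""
        self.messages.append(payload)
        return len(self.messages) - 1  # offsets start at 0 and only grow

    def read(self, offset: int) -> bytes:
        """Reading never removes a message, so any offset can be re-read."""
        return self.messages[offset]

log = AppendOnlyLog()
first = log.append(b"order-created")   # offset 0
log.append(b"order-shipped")           # offset 1
assert log.read(first) == b"order-created"  # historical replay at any time
```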

Producing to a Stream

In FlowMQ, you do not publish messages directly to a stream. Instead, a producer sends messages to a Topic, and FlowMQ routes copies of these messages to the appropriate streams.

This routing is configured by creating a binding between a stream and a topic filter. A stream can be bound to one or more topic filters, and any message published to a topic that matches the filter is automatically and durably appended to the stream's log.
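
From a producer's perspective, this might look like the sketch below. FlowMQ's client API is not documented here, so the `flowmq` module, the `Client` class, and the `create_stream`, `bind`, and `publish` methods are all hypothetical names used only to illustrate the flow; the retention values mirror the examples above.

```python
# Illustrative only: `flowmq`, `Client`, and every method below are
# hypothetical names sketching the flow, not a documented client API.
from flowmq import Client

client = Client("flowmq://localhost:5672")

# Create a stream with a retention policy: keep messages for 7 days
# or until the log reaches 10 GB, whichever limit is hit first.
client.create_stream("orders-audit", retention_days=7, max_bytes=10 * 2**30)

# Bind the stream to a topic filter. Any message published to a
# matching topic is durably appended to the stream's log.
client.bind(stream="orders-audit", topic_filter="orders.*")

# The producer publishes to a topic; it never references the stream.
client.publish(topic="orders.created", payload=b'{"order_id": 42}')
```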

This design provides immense flexibility:

  • Decoupling: Producers only need to know about topics, not which streams are consuming the data.
  • Flexibility: You can add or change stream consumers for different purposes (e.g., add a new real-time analytics stream) without ever modifying your original publisher application.

Consuming from a Stream

The power of the stream model becomes apparent in how data is consumed.

  • Consumer Position (Offset Committing): Each consumer (or consumer group) is responsible for tracking the offset of the last message it has successfully processed. This is known as committing the offset. By storing its offset, a consumer can disconnect and later resume reading exactly where it left off, as the sketch after this list illustrates.
  • Independent Consumption: Multiple consumer groups can read from the same stream at different speeds and for different purposes. For example, an "Analytics" group may be fully caught up with the head of the stream while an "Audit" group is still working through historical data.
  • Start Position: When a new consumer connects for the first time, it can choose where to start reading from:
    • Earliest: Start from the very beginning of the stream (offset 0).
    • Latest: Start from the next message that will be produced, ignoring historical messages.
    • Specific offset: Resume from a specific, previously committed offset.
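
The consumer-side flow might look like the sketch below. As with the producer example, the `flowmq` client and its `subscribe` and `commit` methods are hypothetical names, not a documented API; only the concepts (consumer groups, start positions, offset commits) come from the description above.

```python
# Illustrative only: the `flowmq` client, `subscribe`, and `commit`
# are hypothetical names used to sketch the consumption flow.
from flowmq import Client

def process(payload: bytes) -> None:
    print("processing", payload)  # placeholder for real handling

client = Client("flowmq://localhost:5672")

# A new consumer group picks a start position: "earliest" replays the
# full history from offset 0, "latest" skips straight to new messages,
# or it can resume from a specific, previously committed offset.
consumer = client.subscribe(
    stream="orders-audit",
    group="audit",
    start="earliest",
)

for message in consumer:
    process(message.payload)
    # Commit the offset so the group resumes here after a restart.
    consumer.commit(message.offset)
```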

When to Use a Stream vs. a Queue

Choose a Stream when:

  • You need to broadcast the same message to multiple independent services.
  • You need the ability to replay historical data.
  • You are implementing event sourcing and need a durable log of all state changes.
  • You are building real-time data pipelines to feed analytics systems, data lakes, or search indexes.

Choose a Queue when you are distributing tasks among a pool of identical workers and want each task processed only once.