Buffer is a library with two main components: multiple producers and a single consumer. Producers and the consumer communicate exclusively through a queue manifest in object storage — there is no direct network path between them and no stateful broker to operate.
Producers
Producers accept arbitrary byte entries from callers, buffer them in memory, and periodically flush them as binary data batches to object storage. Each producer exposes a minimal API: the WriteHandle returned by produce() contains a DurabilityWatcher.
Callers can await_durable() to block until the entries have been
persisted to object storage and enqueued in the manifest. This avoids data
loss during failover, but it does not provide read-your-own-writes
consistency — durability refers to the data batch, not to the database
write.
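The produce-then-await-durable flow might look like the following sketch. This is an in-memory stand-in, not the library itself: the Producer, flush(), and mark_durable() internals here are illustrative, and only the produce()/await_durable()/WriteHandle/DurabilityWatcher names come from the text above.

```python
import threading

class DurabilityWatcher:
    """Signals once the batch holding our entries is persisted and enqueued."""
    def __init__(self):
        self._durable = threading.Event()

    def mark_durable(self):
        self._durable.set()

    def await_durable(self, timeout=None):
        # Blocks until the data batch reaches object storage and the manifest.
        return self._durable.wait(timeout)

class WriteHandle:
    def __init__(self, watcher):
        self._watcher = watcher

    def await_durable(self, timeout=None):
        return self._watcher.await_durable(timeout)

class Producer:
    def __init__(self):
        self._buffer = []    # in-memory entries awaiting the next flush
        self._watchers = []

    def produce(self, entry: bytes) -> WriteHandle:
        watcher = DurabilityWatcher()
        self._buffer.append(entry)
        self._watchers.append(watcher)
        return WriteHandle(watcher)

    def flush(self, storage: list):
        # Stand-in for writing one binary data batch to object storage
        # and enqueuing it in the manifest.
        storage.append(b"".join(self._buffer))
        for w in self._watchers:
            w.mark_durable()
        self._buffer.clear()
        self._watchers.clear()
```

Note that await_durable() only confirms the data batch, mirroring the caveat above: it says nothing about when the consumer applies the entries to the database.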
Consumer
The consumer is the read-side counterpart to the producers. It iterates over data batches in object storage and delivers them to the database in the order recorded in the queue. next_batch() reads the next location in the queue manifest,
fetches the data batch from object storage, and returns the data,
metadata, and a unique sequence number to the caller. Sequence numbers
are assigned by the queue producer at ingestion time in increasing order
without gaps, and they are used both for acknowledgement and for tracking
progress. Acknowledgements must be in order: acknowledging a sequence
number that is not immediately after the last one is rejected, ensuring
that no batch is silently skipped.
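The in-order acknowledgement rule can be sketched like this. The Consumer class and its internals are illustrative (the queue is modeled as an in-memory list of pre-assigned sequence numbers); only next_batch() and ack() are names from the text above.

```python
class Consumer:
    def __init__(self, queue):
        self._queue = queue    # [(seq, data), ...] in manifest order
        self._pos = 0
        self._last_acked = 0   # sequence numbers start at 1, no gaps

    def next_batch(self):
        # Returns the data and its unique, gap-free sequence number.
        seq, data = self._queue[self._pos]
        self._pos += 1
        return seq, data

    def ack(self, seq):
        # Acknowledgements must be in order: anything that is not
        # immediately after the last acknowledged number is rejected,
        # so no batch can be silently skipped.
        if seq != self._last_acked + 1:
            raise ValueError(
                f"out-of-order ack {seq}, expected {self._last_acked + 1}")
        self._last_acked = seq
```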
While multiple producers can write to a queue manifest, only one consumer
can read from it. This matches OpenData’s single-writer paradigm and
prevents zombie consumers — surviving after an infrastructure failure —
from acknowledging batches and making them invisible to the active
consumer. Single-consumer exclusivity is enforced by an epoch stored in
the queue manifest: when a consumer starts, it reads the epoch,
increments it, and writes it back. Reads with a stale epoch are rejected,
so any zombie consumer is fenced as soon as a new one takes over.
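The epoch fencing described above can be sketched as follows. This is a simplification: the manifest here is a plain in-memory object, whereas a real queue manifest in object storage would need a conditional (compare-and-swap) write to increment the epoch safely. All class names below are illustrative.

```python
class QueueManifest:
    def __init__(self):
        self.epoch = 0

class FencedError(Exception):
    """Raised when a consumer with a stale epoch tries to read."""

class Consumer:
    def __init__(self, manifest):
        # Taking over: read the epoch, increment it, write it back.
        manifest.epoch += 1
        self._manifest = manifest
        self._epoch = manifest.epoch

    def read(self):
        # Reads with a stale epoch are rejected, fencing any zombie
        # consumer as soon as a newer one has taken over.
        if self._epoch != self._manifest.epoch:
            raise FencedError("stale epoch; a newer consumer took over")
        return "batch"
```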
The consumer is also responsible for cleaning up processed data batches.
At startup, and again after every 100 acknowledged batches, all entries for
acknowledged batches are dequeued from the manifest. A dedicated garbage
collector task then deletes the corresponding data batches from object
storage.
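The two-phase cleanup (dequeue from the manifest, then delete from object storage) might be sketched like this. The Cleaner class, its dict-based manifest and store, and the on_ack()/gc() split are all assumptions for illustration; only the 100-batch interval comes from the text above.

```python
GC_INTERVAL = 100  # acknowledged batches between cleanup passes

class Cleaner:
    def __init__(self, manifest_entries, object_store):
        self._entries = manifest_entries  # seq -> object-store key
        self._store = object_store        # key -> batch bytes
        self._acked = 0
        self._pending_delete = []

    def on_ack(self, seq):
        self._acked += 1
        if self._acked % GC_INTERVAL == 0:
            self._dequeue_acked(seq)

    def _dequeue_acked(self, up_to_seq):
        # Phase 1: dequeue manifest entries for acknowledged batches.
        for s in [s for s in self._entries if s <= up_to_seq]:
            self._pending_delete.append(self._entries.pop(s))

    def gc(self):
        # Phase 2: a dedicated garbage-collector task deletes the
        # corresponding data batches from object storage.
        for key in self._pending_delete:
            self._store.pop(key, None)
        self._pending_delete.clear()
```

Splitting dequeue from deletion keeps the acknowledgement path fast: trimming the manifest is cheap, while object deletions can lag behind without blocking the consumer.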
The consumer API supports exactly-once delivery. To achieve it, the
writer atomically persists the batch data and the sequence number from
next_batch() to the database. If the atomic write succeeds but a failure
occurs before ack() returns, a new writer can construct a new consumer
with that last persisted sequence number so it resumes right after it. The
new consumer also bumps the epoch, fencing the old one and preventing any
duplicate writes.
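The atomic persist-and-resume step might look like the following sketch, using SQLite as a stand-in database. The table names, apply_batch(), and resume_point() are illustrative assumptions, not part of the library; the key point is that batch rows and the sequence number commit in one transaction, so the resume point always matches the applied data.

```python
import sqlite3

def apply_batch(db, seq, rows):
    # Persist the batch data and its sequence number atomically:
    # either both commit or neither does.
    with db:  # sqlite3 connection context manager = one transaction
        db.executemany("INSERT INTO events(value) VALUES (?)", rows)
        db.execute("UPDATE progress SET last_seq = ? WHERE id = 0", (seq,))

def resume_point(db):
    # A new writer reads the last persisted sequence number and
    # constructs a consumer that resumes right after it.
    return db.execute(
        "SELECT last_seq FROM progress WHERE id = 0").fetchone()[0]
```

If the process dies after this commit but before ack() returns, the replacement writer reads resume_point(), starts a consumer just past it, and its bumped epoch fences the old consumer, so the batch is neither lost nor applied twice.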