Reading Data

Reads from OpenData databases rely heavily on data caches to avoid unnecessary GET requests to object storage. The read path combines domain-specific logic with shared storage access primitives. Each database implements its own indexes and data layouts for optimal read performance, but all of them use SlateDB queries to access data through multiple levels of caches.
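To make the caching idea concrete, here is a minimal sketch of a read-through lookup across cache levels. It is an illustration only, not the SlateDB API: the class name `TieredReader`, the two tiers (memory and disk), and the `get_calls` counter are all assumptions, since the text says only that reads go through multiple levels of caches.

```python
from typing import Callable, Dict, Optional


class TieredReader:
    """Read-through lookup: memory cache, then disk cache, then object storage.

    All names here are hypothetical; this is not the SlateDB API.
    """

    def __init__(self, fetch_from_object_storage: Callable[[str], Optional[bytes]]) -> None:
        self.memory_cache: Dict[str, bytes] = {}
        self.disk_cache: Dict[str, bytes] = {}  # a dict stands in for a local disk cache
        self.fetch_from_object_storage = fetch_from_object_storage
        self.get_calls = 0  # counts simulated GET requests to object storage

    def read(self, key: str) -> Optional[bytes]:
        # Level 1: in-memory cache, no I/O at all.
        if key in self.memory_cache:
            return self.memory_cache[key]
        # Level 2: disk cache; promote the value to memory on a hit.
        if key in self.disk_cache:
            value = self.disk_cache[key]
            self.memory_cache[key] = value
            return value
        # Miss on every level: pay for one GET, then populate both caches.
        self.get_calls += 1
        value = self.fetch_from_object_storage(key)
        if value is not None:
            self.disk_cache[key] = value
            self.memory_cache[key] = value
        return value


# Usage: a plain dict simulates the object store.
object_store = {"block-1": b"payload"}
reader = TieredReader(object_store.get)
reader.read("block-1")  # misses both caches, issues one simulated GET
reader.read("block-1")  # served from the memory cache, no GET
```

In this sketch, only the first read of a key reaches object storage; later reads are served from a cache level, which is the effect the caches are meant to have.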

Read Freshness

Read freshness can be measured as the number of object storage round trips between a write being accepted and that write becoming visible to a query:

Write Path	Read Target	S3 Round Trips	Description
Direct write to writer	Writer	0	Data is written to the in-memory Delta and is immediately readable on the writer with no object storage interaction.
Zonal ingest	Writer	1	Data is written to object storage by a stateless ingestor and must be picked up by the writer before it is readable.
Any	Read replica	+1	Read replicas discover new data by watching object storage for manifest updates, adding one additional round trip on top of whatever the write path requires.

Each round trip typically adds around 100 ms of latency plus the configurable polling interval. In practice, the polling interval dominates the freshness delay.
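The round-trip counts and the per-trip cost can be combined into a rough freshness estimate. The sketch below is a back-of-the-envelope model, not part of any OpenData API: the function names and the path and target labels are hypothetical, and it assumes every round trip pays the full polling interval, which puts the estimate toward the upper end.

```python
ROUND_TRIP_LATENCY_S = 0.1  # "around 100 ms" per round trip, taken from the text

# Round trips needed before data is readable on the writer, per write path.
# The string labels are hypothetical names for the write paths described above.
WRITER_ROUND_TRIPS = {"direct": 0, "zonal_ingest": 1}


def round_trips(write_path: str, read_target: str) -> int:
    """Object storage round trips between write acceptance and query visibility."""
    trips = WRITER_ROUND_TRIPS[write_path]
    if read_target == "read_replica":
        trips += 1  # replicas discover new data by watching for manifest updates
    return trips


def estimated_freshness_delay_s(
    write_path: str, read_target: str, polling_interval_s: float
) -> float:
    """Rough delay estimate: each round trip costs latency plus one full poll."""
    trips = round_trips(write_path, read_target)
    return trips * (ROUND_TRIP_LATENCY_S + polling_interval_s)


# Usage: zonal ingest read from a replica, with an assumed 1 second polling interval.
delay = estimated_freshness_delay_s("zonal_ingest", "read_replica", 1.0)
```

With any polling interval much larger than 100 ms, the polling term accounts for most of the estimate, which matches the observation that polling dominates the freshness delay.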

Concepts

Timeseries

Log

Vector

Key-Value
