Graze Turbostream
Overview & Motivation
Turbostream is a real-time, hydrated repeater service built on top of Bluesky’s Jetstream. It bridges the gap between raw event feeds (which contain only URI and DID references) and enriched, context-aware records you can consume directly from a WebSocket. By resolving those bare references (user profiles, mentions, parent/root posts, and quoted records), Turbostream delivers the context you need to understand each event without additional API calls.
Key motivations:
- Reduce boilerplate: Clients don’t need to manage their own caching or multiple API calls.
- Low latency: Hydration happens in bulk (up to 100 records at once) and streams immediately.
- Context-rich feeds: Every record includes full profile info, reply chains, and embedded content.
WebSocket Usage
Turbostream exposes a single WebSocket endpoint. You can consume it with tools like websocat:
```shell
websocat "wss://api.graze.social/app/api/v1/turbostream/turbostream"
```
Or from Node/Python clients:
```javascript
const ws = new WebSocket("wss://.../turbostream/turbostream");
ws.onmessage = ({ data }) => console.log(JSON.parse(data));
```
Each message is a JSON array of enriched records:
```json
[
  {
    "at_uri": "at://did:plc:.../app.bsky.feed.post/3lng7e4mr5c2z",
    "did": "did:plc:...",
    "time_us": 1745342977546824,
    "message": { /* raw jetstream record */ },
    "hydrated_metadata": { /* see below */ }
  },
  ...
]
```
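Because each frame carries an array rather than a single record, a consumer will usually parse the frame once and then loop over its records. The following is a minimal sketch, assuming only the top-level fields shown above and that `time_us` is a Unix timestamp in microseconds (as in Jetstream). `parseBatch` and `handleRecord` are hypothetical names for illustration, not part of Turbostream.

```javascript
// Parse one WebSocket frame into an array of enriched records.
// Assumes each frame is a JSON array; anything else is treated as an
// empty batch so a malformed frame does not crash the consumer.
function parseBatch(data) {
  let parsed;
  try {
    parsed = JSON.parse(data);
  } catch {
    return [];
  }
  return Array.isArray(parsed) ? parsed : [];
}

// Hypothetical per-record handler: formats the top-level fields.
function handleRecord(record) {
  const { at_uri, did, time_us } = record;
  // time_us is in microseconds; Date expects milliseconds.
  const when = new Date(time_us / 1000).toISOString();
  return `${when} ${did} ${at_uri}`;
}

// A hard-coded frame standing in for the `data` passed to ws.onmessage.
const sampleFrame = JSON.stringify([
  {
    at_uri: "at://did:plc:example/app.bsky.feed.post/abc",
    did: "did:plc:example",
    time_us: 1745342977546824,
    message: {},
    hydrated_metadata: {},
  },
]);

for (const record of parseBatch(sampleFrame)) {
  console.log(handleRecord(record));
}
```

In a live client, the loop body would sit inside `ws.onmessage`, with `data` in place of `sampleFrame`.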
Hydration Process & Pipeline
- Batch collection (up to 100 records): extract all unique DIDs and URIs (including mention facets, reply parent/root URIs, and quote embed URIs).
- Cache check: read-lock to filter out already-cached entities.
- Bulk fetch missing profiles (`get_user_data_for_dids`) and posts (`hydrate_records_for_uris`) via Bluesky API clients, in parallel.
- Cache write: writer-lock to update both user and post caches with fresh data.
- Enrichment: assemble each record’s `hydrated_metadata` by merging raw event data with cached profiles and posts.
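The first two steps can be sketched as follows. This is an illustration rather than Turbostream's actual code: it assumes Jetstream's commit event shape (`event.commit.record`) and the `app.bsky.feed.post` record layout, and the names `collectReferences` and `missingFrom` are hypothetical. Locking is omitted because JavaScript runs this on a single thread.

```javascript
// Collect the unique DIDs and AT-URIs that a batch of events refers to.
// Assumption: events follow Jetstream's commit shape (event.commit.record).
function collectReferences(events) {
  const dids = new Set();
  const uris = new Set();
  for (const event of events) {
    if (event.did) dids.add(event.did); // the posting account
    const record = event.commit?.record;
    if (!record) continue;
    // Mention facets carry the mentioned account's DID.
    for (const facet of record.facets ?? []) {
      for (const feature of facet.features ?? []) {
        if (feature.$type === "app.bsky.richtext.facet#mention" && feature.did) {
          dids.add(feature.did);
        }
      }
    }
    // Reply parent and root URIs.
    if (record.reply?.parent?.uri) uris.add(record.reply.parent.uri);
    if (record.reply?.root?.uri) uris.add(record.reply.root.uri);
    // Quote embeds reference the quoted record by URI.
    if (record.embed?.$type === "app.bsky.embed.record" && record.embed.record?.uri) {
      uris.add(record.embed.record.uri);
    }
  }
  return { dids, uris };
}

// Cache check: keep only the keys the cache does not hold yet.
// `cache` can be anything with a has(key) method, such as a Map.
function missingFrom(cache, keys) {
  return [...keys].filter((key) => !cache.has(key));
}

// Usage with one made-up reply that mentions another account.
const sampleEvents = [
  {
    did: "did:plc:author",
    commit: {
      record: {
        facets: [
          { features: [{ $type: "app.bsky.richtext.facet#mention", did: "did:plc:friend" }] },
        ],
        reply: {
          parent: { uri: "at://did:plc:friend/app.bsky.feed.post/parent" },
          root: { uri: "at://did:plc:friend/app.bsky.feed.post/root" },
        },
      },
    },
  },
];
const userCache = new Map([["did:plc:author", { handle: "author.example" }]]);
const refs = collectReferences(sampleEvents);
console.log(missingFrom(userCache, refs.dids)); // only did:plc:friend needs fetching
```

Deduplicating with sets before the cache check keeps the bulk fetch small: an account mentioned many times in one batch is looked up at most once.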
All caches use an LRU strategy, ensuring memory is bounded and recently accessed items stay hot.
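To illustrate what LRU eviction means in practice, here is a minimal sketch built on `Map`, which iterates keys in insertion order. `LruCache` is a hypothetical class for illustration; Turbostream's own cache implementation and capacity are not described here.

```javascript
// Minimal LRU cache: the Map's first key is always the least recently used.
class LruCache {
  constructor(capacity) {
    if (!Number.isInteger(capacity) || capacity < 1) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
    this.entries = new Map();
  }

  has(key) {
    return this.entries.has(key);
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert so the key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    // Delete first so an existing key moves to the "most recent" end.
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // Evict the least recently used entry.
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
  }
}

const profiles = new LruCache(2);
profiles.set("did:plc:a", { handle: "a.example" });
profiles.set("did:plc:b", { handle: "b.example" });
profiles.get("did:plc:a"); // touch a, so b is now the oldest entry
profiles.set("did:plc:c", { handle: "c.example" });
console.log(profiles.has("did:plc:b")); // false: b was evicted
console.log(profiles.has("did:plc:a")); // true: a was recently read
```

The fixed capacity is what bounds memory, and moving an entry to the end on every read is what keeps frequently seen profiles and posts from being evicted.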
What We Hydrate
- `user`: full profile details for the posting DID (handle, avatar, bio, follower counts, etc.).
- `mentions`: map of mentioned DIDs to their profile objects, parsed from richtext facets.
- `parent_post`: the immediate parent in a reply thread, if present (postView structure).
- `reply_post`: the root of the thread, if different from the parent.
- `quote_post`: any embedded record that was quoted via `app.bsky.embed.record` embeds.
Each of these fields is null if not applicable or unavailable.
Example Enriched Record
```json
{
  "at_uri": "at://did:plc:qipvtyo27owt4isjah3j3dw2/app.bsky.feed.post/3lng7e4mr5c2z",
  "did": "did:plc:qipvtyo27owt4isjah3j3dw2",
  "time_us": 1745342977546824,
  "message": { /* raw jetstream payload */ },
  "hydrated_metadata": {
    "user": { /* profileViewDetailed */ },
    "mentions": {},
    "parent_post": { /* postViewBasic */ },
    "reply_post": { /* postView */ },
    "quote_post": null
  }
}
```
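Since any hydrated field may be null, client code is safest when it treats all of them as optional. A minimal sketch, assuming the record layout above; `describeRecord` is a hypothetical helper, and `handle` is the only profile field it reads.

```javascript
// Summarise one enriched record without assuming any hydrated field exists.
function describeRecord(record) {
  const meta = record.hydrated_metadata ?? {};
  // Fall back to the raw DID when the profile could not be hydrated.
  const author = meta.user?.handle ?? record.did;
  const mentionCount = Object.keys(meta.mentions ?? {}).length;
  return {
    author,
    isReply: meta.parent_post != null, // `!= null` also catches undefined
    isQuote: meta.quote_post != null,
    mentionCount,
  };
}

// Usage with a made-up record whose profile failed to hydrate.
const sampleRecord = {
  did: "did:plc:example",
  hydrated_metadata: {
    user: null,
    mentions: {},
    parent_post: { uri: "at://did:plc:other/app.bsky.feed.post/parent" },
    reply_post: null,
    quote_post: null,
  },
};
console.log(describeRecord(sampleRecord));
// { author: 'did:plc:example', isReply: true, isQuote: false, mentionCount: 0 }
```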
Client Integration Tips
- Backpressure: Monitor Redis Stream lag or WebSocket send buffer.
- Reconnect logic: Implement exponential backoff on disconnects.
- Cache your own layer: If you re-hydrate further fields, maintain a small in-memory map.
- Throttling: Adjust batch sizes or cache TTLs based on traffic patterns.
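The reconnect tip can be sketched as below. This is one possible approach under stated assumptions, not a prescribed client: the base delay, cap, and jitter strategy are illustrative choices, `backoffDelay` and `connectWithBackoff` are hypothetical names, and the code assumes a global `WebSocket` (available in browsers and recent Node releases).

```javascript
// Delay before reconnect attempt `attempt` (0-based): doubles each time,
// capped at maxMs. The defaults are illustrative, not recommended values.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Connect, and reconnect with exponential backoff plus jitter on close.
// `onRecords` receives each parsed batch of enriched records.
function connectWithBackoff(url, onRecords, attempt = 0) {
  const ws = new WebSocket(url);
  ws.onopen = () => {
    attempt = 0; // reset once a connection succeeds
  };
  ws.onmessage = ({ data }) => onRecords(JSON.parse(data));
  ws.onerror = () => ws.close(); // make sure an error leads to onclose
  ws.onclose = () => {
    // Full jitter: wait a random time up to the computed delay, so many
    // clients do not all reconnect at the same instant.
    const wait = Math.random() * backoffDelay(attempt);
    setTimeout(() => connectWithBackoff(url, onRecords, attempt + 1), wait);
  };
  return ws;
}

console.log(backoffDelay(0));  // 1000
console.log(backoffDelay(3));  // 8000
console.log(backoffDelay(10)); // 30000 (capped)
```

A client would start it with `connectWithBackoff(url, handler)`, passing the endpoint from the WebSocket Usage section as `url`.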
With Turbostream, consuming a fully enriched Bluesky Jetstream feed is straightforward: connect and work with the context, not the plumbing.
Updated on: 30/10/2025