Graze Turbostream

Overview & Motivation


Turbostream is a real-time, hydrated repeater service built on top of Bluesky’s Jetstream. It bridges the gap between raw event feeds (which contain only URI and DID references) and enriched, context-aware records you can consume directly from a WebSocket. By resolving those references (user profiles, mentions, parent/root posts, and quoted records), Turbostream delivers the context needed to understand each event without additional API calls.


Key motivations:


  • Reduce boilerplate: Clients don’t need to manage their own caching or multiple API calls.
  • Low latency: Records are hydrated in batches of up to 100 and streamed as soon as each batch is ready.
  • Context-rich feeds: Every record includes full profile info, reply chains, and embedded content.


WebSocket Usage


Turbostream exposes a single WebSocket endpoint. You can consume it with tools like websocat:


```shell
websocat "wss://api.graze.social/app/api/v1/turbostream/turbostream"
```



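Or from a Node or Python client. In JavaScript:


```javascript
// Open the stream and log each batch of enriched records as it arrives.
const ws = new WebSocket("wss://.../turbostream/turbostream");

ws.onmessage = ({ data }) => console.log(JSON.parse(data));
```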


Each message is a JSON array of enriched records:


```json
[
  {
    "at_uri": "at://did:plc:.../app.bsky.feed.post/3lng7e4mr5c2z",
    "did": "did:plc:...",
    "time_us": 1745342977546824,
    "message": { /* raw jetstream record */ },
    "hydrated_metadata": { /* see below */ }
  },
  ...
]
```
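As a minimal sketch of handling one frame, assuming every frame is a JSON array shaped like the example above: `parseBatch` and `recordTime` are hypothetical helper names, not part of Turbostream, and the sample frame is hand-written rather than real data.

```javascript
// Hypothetical helper: parse one WebSocket frame into an array of records.
// Assumes each frame is a JSON array; anything else (invalid JSON, a bare
// object) is treated as an empty batch instead of throwing.
function parseBatch(data) {
  let parsed;
  try {
    parsed = JSON.parse(data);
  } catch {
    return [];
  }
  return Array.isArray(parsed) ? parsed : [];
}

// Convert the microsecond timestamp (time_us) into a JavaScript Date,
// which has millisecond resolution.
function recordTime(record) {
  return new Date(Math.floor(record.time_us / 1000));
}

// Hand-written sample frame for illustration only.
const frame = JSON.stringify([
  {
    at_uri: "at://did:example:alice/app.bsky.feed.post/abc",
    did: "did:example:alice",
    time_us: 1745342977546824,
    message: {},
    hydrated_metadata: {},
  },
]);

for (const record of parseBatch(frame)) {
  console.log(record.did, recordTime(record).toISOString());
}
```

In a live client, the same function would be called from `ws.onmessage` with the frame's `data`.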


Hydration Process & Pipeline


  1. Batch collection (up to 100 records): extract all unique DIDs and URIs (including mention facets, reply parent/root URIs, and quote embed URIs).
  2. Cache check: read-lock to filter out already-cached entities.
  3. Bulk fetch missing profiles (get_user_data_for_dids) and posts (hydrate_records_for_uris) via Bluesky API clients, in parallel.
  4. Cache write: writer-lock to update both user and post caches with fresh data.
  5. Enrichment: assemble each record’s hydrated_metadata by merging raw event data with cached profiles and posts.
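
Steps 1 and 2 above can be sketched roughly as follows. This is an illustration, not the service's own code: `collectReferences` and `missingFrom` are hypothetical names, and the sketch assumes that `message` follows the Jetstream commit shape, with the post record at `message.commit.record`.

```javascript
// Step 1 (sketch): gather the unique DIDs and AT-URIs a batch refers to.
// Assumption: event.message.commit.record is an app.bsky.feed.post record.
function collectReferences(batch) {
  const dids = new Set();
  const uris = new Set();
  for (const event of batch) {
    dids.add(event.did); // the posting account
    const post = event.message?.commit?.record ?? {};
    // Mentions live in richtext facets.
    for (const facet of post.facets ?? []) {
      for (const feature of facet.features ?? []) {
        if (feature.$type === "app.bsky.richtext.facet#mention") dids.add(feature.did);
      }
    }
    // Reply parent and root URIs.
    if (post.reply?.parent?.uri) uris.add(post.reply.parent.uri);
    if (post.reply?.root?.uri) uris.add(post.reply.root.uri);
    // Quote embeds.
    if (post.embed?.$type === "app.bsky.embed.record" && post.embed.record?.uri) {
      uris.add(post.embed.record.uri);
    }
  }
  return { dids, uris };
}

// Step 2 (sketch): keep only the keys the cache does not already hold.
// `cache` is anything with a has(key) method, such as a Map.
function missingFrom(cache, keys) {
  return [...keys].filter((key) => !cache.has(key));
}

// Hand-written sample batch for illustration only.
const sample = [{
  did: "did:example:alice",
  message: { commit: { record: {
    facets: [{ features: [{ $type: "app.bsky.richtext.facet#mention", did: "did:example:bob" }] }],
    reply: {
      parent: { uri: "at://did:example:bob/app.bsky.feed.post/p1" },
      root: { uri: "at://did:example:carol/app.bsky.feed.post/r1" },
    },
  } } },
}];
const refs = collectReferences(sample);
const userCache = new Map([["did:example:bob", { handle: "bob.example" }]]);
console.log(missingFrom(userCache, refs.dids)); // DIDs that still need fetching
```

Using sets for collection means a DID or URI that appears many times in one batch is fetched at most once.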


All caches use an LRU strategy, ensuring memory is bounded and recently accessed items stay hot.
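
To illustrate the LRU idea, here is a minimal sketch built on a JavaScript `Map`, which iterates in insertion order. `LruCache` is a hypothetical class for illustration; how Turbostream implements its caches is not specified here.

```javascript
// Minimal LRU cache sketch: the Map's first key is always the least
// recently used entry, because every read or write re-inserts the key.
class LruCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.entries = new Map();
  }

  has(key) {
    return this.entries.has(key);
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    this.entries.delete(key);
    this.entries.set(key, value); // move to the most-recent position
    return value;
  }

  set(key, value) {
    this.entries.delete(key); // ensure re-insertion at the end
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest); // evict the least recently used entry
    }
  }
}

// Usage: with capacity 2, touching "a" keeps it alive when "c" arrives.
const profiles = new LruCache(2);
profiles.set("a", 1);
profiles.set("b", 2);
profiles.get("a");
profiles.set("c", 3);
console.log(profiles.has("a"), profiles.has("b"), profiles.has("c"));
```

The same structure also works on the client side if you keep a small in-memory cache of your own.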


What We Hydrate


  • user: full profile details for the posting DID (handle, avatar, bio, follower counts, etc.).
  • mentions: map of mentioned DIDs to their profile objects, parsed from richtext facets.
  • parent_post: the immediate parent in a reply thread, if present (postView structure).
  • reply_post: the root of the thread, if different from the parent.
  • quote_post: any embedded record that was quoted via app.bsky.embed.record embeds.


Each of these fields is null if not applicable or unavailable.
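
Because any of these fields can be null, client code should read them defensively. The sketch below shows one way to do that; `summarize` is a hypothetical helper, and the only profile property it relies on is `handle`, which the list above names.

```javascript
// Build a small summary from one enriched record without assuming that
// any hydrated field is present.
function summarize(record) {
  const meta = record.hydrated_metadata ?? {};
  return {
    // Fall back to the DID when the profile could not be hydrated.
    author: meta.user?.handle ?? record.did,
    mentionedDids: Object.keys(meta.mentions ?? {}),
    // These flags mean "hydrated data is available", not "is a reply/quote".
    hasParent: meta.parent_post != null,
    hasRoot: meta.reply_post != null,
    hasQuote: meta.quote_post != null,
  };
}

// Hand-written record for illustration only.
console.log(summarize({
  did: "did:example:alice",
  hydrated_metadata: { user: null, mentions: {}, parent_post: null, reply_post: null, quote_post: null },
}));
```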


Example Enriched Record



```json
{
  "at_uri": "at://did:plc:qipvtyo27owt4isjah3j3dw2/app.bsky.feed.post/3lng7e4mr5c2z",
  "did": "did:plc:qipvtyo27owt4isjah3j3dw2",
  "time_us": 1745342977546824,
  "message": { /* raw jetstream payload */ },
  "hydrated_metadata": {
    "user": { /* profileViewDetailed */ },
    "mentions": {},
    "parent_post": { /* postViewBasic */ },
    "reply_post": { /* postView */ },
    "quote_post": null
  }
}
```



Client Integration Tips


  • Backpressure: Monitor Redis Stream lag or WebSocket send buffer.
  • Reconnect logic: Implement exponential backoff on disconnects.
  • Add your own cache layer: If you hydrate additional fields yourself, keep them in a small in-memory map.
  • Throttling: Adjust batch sizes or cache TTLs based on traffic patterns.
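
The reconnect tip can be sketched as follows. This is one possible approach, not a prescribed client: `backoffDelay` and `connect` are hypothetical names, the 1-second base and 30-second cap are arbitrary choices, and a global `WebSocket` is assumed (browsers and recent Node versions provide one).

```javascript
// Delay before reconnect attempt number `attempt` (0-based):
// doubles each time, capped at maxMs.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Open the stream and reconnect with exponential backoff on close.
// `onRecords` receives each parsed batch of enriched records.
function connect(url, onRecords, attempt = 0) {
  const ws = new WebSocket(url);
  ws.onopen = () => {
    attempt = 0; // a successful connection resets the backoff
  };
  ws.onmessage = ({ data }) => onRecords(JSON.parse(data));
  ws.onerror = () => ws.close(); // funnel errors into the close handler
  ws.onclose = () => {
    setTimeout(() => connect(url, onRecords, attempt + 1), backoffDelay(attempt));
  };
  return ws;
}

// Delays for the first few attempts, in milliseconds.
console.log([0, 1, 2, 3].map((n) => backoffDelay(n)));
```

Adding a small random jitter to each delay is a common refinement when many clients may reconnect at the same moment.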


With Turbostream, you can consume a fully enriched Bluesky Jetstream feed by connecting to a single WebSocket, and focus on the context rather than the plumbing.

Updated on: 26/09/2025
