Browser Bankruptcy 2023: Consensus, Durable Execution, Streaming, HTAP, TSDBs, and more...
Everything in my tabs as I close out the year. Catch you in '24!
I started this blog on October 31, 2023. It’ll be 2 months old as we start the new year. It’s been fun to write and I plan to keep it going in the new year. If you haven’t yet read my most popular post, it’s a good place to start:
Durable Execution: Justifying the Bubble
There’s been a surge in durable execution frameworks over the past 6 to 12 months. Temporal has been the go-to for a while but many new projects and companies are emerging. Let’s look at why, and what needs to change.
Rather than a recap or predictions post, I thought it’d be fun to share what I’ve got in my browser tabs—stuff I haven’t been able to get to yet. I hope you find a link or two that pique your interest.
Consensus
Viewstamped Replication sucked me back into consensus protocols this year.
- Nezha: Deployable and High-Performance Consensus Using Synchronized Clocks 
- Building a Large-scale Distributed Storage System Based on Raft 
Workflows, FaaS, durable execution
Durable execution blew up this year.
- Singularity: Planet-Scale, Preemptive and Elastic Scheduling of AI Workloads 
- Lifting the veil on Meta’s microservice architecture: Analyses of topology and request workflows 
Streaming
The trend toward S3 persistence for streaming (with WarpStream [$]) captured my interest.
- Clonos: Consistent Causal Recovery for Highly-Available Streaming Dataflows 
- Streaming from Apache Iceberg - Building Low-Latency and Cost-Effective Data Pipelines 
- DBSP: Automatic Incremental View Maintenance for Rich Query Languages 
HTAP/multi-model databases
As I dug more into S3 persistence, I found plenty of other exemplary systems (e.g. Neon, Turbopuffer, Quickwit). I’ve been thinking lately about what it means to have all our data directly on S3. Does it make hybrid transaction/analytical processing (HTAP) and multi-model databases easier to build or more likely to be successful?
- Running OLAP and OLTP Workloads on the Same Cluster with Workload Prioritization 
- Introducing Compute-Compute Separation for Real-Time Analytics 
- The Beauty of HTAP: Defining a Modern Data Architecture with TiDB 
Embedded databases
Litestream, LiteFS, libSQL, Turso, and SKDB have me pulling at the embedded (and edge) DB thread.
- Building data-centric apps with a reactive relational database 
- Embedded databases (1): The harmony of DuckDB, KùzuDB and LanceDB 
- TreeLine: An Update-In-Place Key-Value Store for Modern Storage 
Time-series databases
Some Prometheus (and FrostDB) spelunking led me to InfluxDB’s new(ish) IOx storage engine, which uses DataFusion and Parquet.
PostgreSQL
PostgreSQL and its extensions continue to be a pragmatic solution to, well, everything.
- Introducing pgroll: zero-downtime, reversible, schema migrations for Postgres 
- The Great Re-shard: adding Postgres capacity (again) with zero downtime 
Analytics
Analytics Twitter continues to be very—obnoxiously—loud.
- Why we develop on data locally and how to finally stop (Part 1) 
- Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics 
- Don’t Hold My Data Hostage – A Case For Client Protocol Redesign 
You can support me by purchasing The Missing README: A Guide for the New Software Engineer for yourself or gifting it to new software engineers that you know.
I occasionally invest in infrastructure startups. Companies that I’ve invested in are marked with a [$] in this newsletter. See my LinkedIn profile for a complete list.



