SQLite on Bluesky, Litestream, Nile Launch, Buffer Pool Refresher, and more...
Bluesky's PDS is moving to SQLite, I rabbit hole on mmap and buffer pools, Nile is released, and I highlight Tigris.
SQLite for Bluesky’s PDS Server
Bryan Newbold posted an interesting update on Bluesky’s personal data server (PDS). A PDS is the thing that hosts your posts and some other stuff (more details here). They’re moving from a monolithic PostgreSQL instance to a distributed implementation. Jake Gold (also at Bluesky) posted some more notes:
See Jake’s posts for the logic behind this change. SQLite is showing up more and more in distributed systems. I think this trend will continue. Bluesky is yet more evidence.
Poking at Litestream
The Litestream comment in Jake’s thread, above, caught my eye. I’m on the lookout for more examples of Litestream in the wild. Fly.io is clearly a big one. Ben Johnson, the Litestream author, works there. Their post, Introducing LiteFS, got me thinking.
LiteFS works by interposing a very thin virtual filesystem between your app and your on-disk database file. It’s not a file system like ext4, but rather a pass-through. Think of it as a file system proxy. What that proxy does is track SQLite databases to spot transactions and then LiteFS copies out those transactions to be shipped to replicas.
I’m hearing more and more good stuff about Fly.io.
Ben Johnson’s Why I Built Litestream is also a great read. It made me nostalgic for the good ol’ days. A quote I feel compelled to pull from the post:
This works particularly well for SaaS applications where each customer is isolated from one another.
Timely considering Nile’s announcement.
Nile Launch
Nile [$] launched a new serverless Postgres-compatible DB built for SaaS.
I’m really excited about this, and I’m not alone. Most people who’ve worked on SaaS immediately get why this database is a big deal. As a concrete example, some of my friends are currently using Rails’s apartment library to bolt multi-tenancy on top of their SaaS application. With Nile, all of this moves to the database layer. Pretty slick.
I’ve known Sriram (Ram) and Gwen for a while. Sriram sat a few aisles over from me at LinkedIn when he started Ambry (which really should get more attention). I first met Gwen when she was at Cloudera.
Sriram pitched Nile to me a few years ago over coffee. At the time, Chrix Finne and I were thinking about metered billing for SaaS payments. Ram had a very similar outlook on the space. Since then, they’ve iterated their way to a serverless SaaS database.
Buffer Pool Refresher
Simon Eskildsen (of Napkin Math and Turbopuffer fame) shared a link to a really interesting paper and presentation on mmap in databases.
The paper and video list four problems:
Transactional safety
I/O stalls
Error handling
Performance issues
A comment on buffer pools in the video sent me down a rabbit hole. If you’re like me, and want to refresh your memory on buffer pools (see what I did there?), start here or with this video:
I also found MySQL’s InnoDB buffer pool algorithm page, which describes an actual implementation. It uses a variation of LRU eviction: rather than putting newly read pages at the head of the list, it inserts them at a midpoint (the head of an “old” sublist), so a large one-off scan is less likely to push frequently used pages out of the pool.
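To make the midpoint idea concrete, here is a toy sketch in Python. It is not InnoDB's code: the class name MidpointLRU, the access method, and the two-OrderedDict layout are my own for illustration. The 3/8 "old" fraction mirrors InnoDB's documented default, and the sketch ignores InnoDB's time window that delays promotion out of the old sublist.

```python
from collections import OrderedDict


class MidpointLRU:
    """Toy buffer pool with midpoint insertion (illustrative, not InnoDB's code)."""

    def __init__(self, capacity, old_fraction=3 / 8):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # The "old" sublist gets roughly old_fraction of the pool; the rest is "young".
        self.young_capacity = capacity - max(1, int(capacity * old_fraction))
        self.young = OrderedDict()  # least recently used first
        self.old = OrderedDict()    # least recently inserted first

    def access(self, page_id):
        """Touch a page. Returns True on a hit and False on a miss."""
        if page_id in self.young:
            self.young.move_to_end(page_id)
            return True
        if page_id in self.old:
            # A hit in the old sublist promotes the page to the young sublist.
            del self.old[page_id]
            self.young[page_id] = True
            self._rebalance()
            return True
        # Miss: evict from the tail of the old sublist if the pool is full.
        # The old sublist is never empty when the pool is full, because the
        # young sublist is capped below the total capacity.
        if len(self.young) + len(self.old) >= self.capacity:
            self.old.popitem(last=False)
        # Insert the new page at the midpoint, i.e. the head of the old sublist.
        self.old[page_id] = True
        return False

    def _rebalance(self):
        # Demote the least recently used young pages to the head of the old sublist.
        while len(self.young) > self.young_capacity:
            page_id, _ = self.young.popitem(last=False)
            self.old[page_id] = True


pool = MidpointLRU(capacity=8)
for page in [1, 2, 3, 4, 5] * 2:  # touch the hot pages twice so they get promoted
    pool.access(page)
for page in range(100, 120):      # one-off scan of 20 cold pages
    pool.access(page)
hot_hits = [pool.access(page) for page in [1, 2, 3, 4, 5]]
print(hot_hits)
```

In this run the scanned pages only cycle through the old sublist, so the five hot pages are all still hits afterwards. With a plain LRU of the same size, the 20-page scan would have evicted every one of them.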
Project Highlight: Tigris
I’m talking with Ovais Tariq (of Tigris) about serverless infrastructure in a few weeks. Ovais is the CEO of Tigris Data, which builds an open source serverless NoSQL DB.
Tigris is based on Uber’s Docstore. 1
Docstore is a general-purpose multi-model database that provides a strict serializability consistency model on a partition level and can scale horizontally to serve high volume workloads. Features such as Transaction, Materialized View, Associations, and Change Data Capture combined with modeling flexibility and rich query support, significantly improve developer productivity, and reduce the time to market for new applications at Uber.
There’s a lot going on with Tigris, but here are some highlights:
Built on FoundationDB (I plan to do a paper highlight on FoundationDB soon.)
MongoDB-compatible protocol (and gRPC/HTTP)
It’s open source! (Apache 2)
Sadly, it seems Tigris is missing change data capture (CDC)—something Docstore had. I’d be happy to be corrected if I’m wrong on this.
awesome-infra Repo
A friend asked me this week for a list of recent infrastructure projects I’m tracking. I decided to post the list to Twitter, too.
The tweet was a hit so I created awesome-infra. I’m going to track cool software infrastructure projects there. I could use your help curating the list and getting the categories right. You’re welcome to submit your own projects and startups as long as they satisfy the CONTRIBUTING.md and PR template.
I occasionally invest in infrastructure startups. Companies that I’ve invested in are marked with a [$] in this newsletter. See my LinkedIn profile for a complete list.
1. If you really want to pull at this thread, the Docstore post links to previous posts on Uber’s Schemaless database, which preceded Docstore. They also tried Cassandra.