I’m keynoting at Prefect Summit 2024 [$]. My talk, (tentatively) titled 5 infrastructure trends in 20 minutes, is a whirlwind tour of the things I’ve been writing about on Materialized View. Register here to check it out! The conference is virtual, totally free, and includes talks from Prefect, Block, and Cox Automotive.
Deterministic simulation testing (DST) is a hot topic. With DST, developers constrain their software so that it can be executed deterministically (the same inputs always produce the same outputs). Next, developers run many iterations of their tests with different inputs, usually randomized, to find unexpected behavior. When a failure occurs, developers can re-run the tests with the failing input to reproduce and debug the failure. Resonate’s blog has a good description:
Deterministic simulation testing repeatedly executes an application in a simulated environment under changing initial conditions, monitoring that the correctness constraints are maintained across executions.
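To make the loop concrete, here is a minimal Python sketch. It is illustrative only and not taken from Resonate or any of the projects mentioned in this post: the toy "system" (a running total with no deduplication), the 2% redelivery fault, and the names `simulate` and `find_failing_seed` are all hypothetical. The point is that a single seed drives every random choice, so a failing run can be replayed exactly.

```python
import random
from typing import Optional


def simulate(seed: int, steps: int = 50) -> bool:
    """Run one simulated execution; return True if the invariant held."""
    rng = random.Random(seed)  # the only source of randomness in the run
    applied_total = 0          # state of the toy system under test
    expected_total = 0         # what a correct system would hold

    for _ in range(steps):
        amount = rng.randint(1, 10)
        expected_total += amount
        applied_total += amount
        # Simulated fault: the "network" occasionally redelivers a message.
        # The toy system has no deduplication, so it applies the message twice.
        if rng.random() < 0.02:
            applied_total += amount

    # Correctness constraint checked at the end of every execution.
    return applied_total == expected_total


def find_failing_seed(max_seeds: int = 1000) -> Optional[int]:
    """Try many seeds (initial conditions); return the first that breaks the invariant."""
    for seed in range(max_seeds):
        if not simulate(seed):
            return seed
    return None


if __name__ == "__main__":
    seed = find_failing_seed()
    print("failing seed:", seed)
    if seed is not None:
        # Replaying the same seed reproduces the same failure, run after run.
        print("replay still fails:", not simulate(seed))
```

Once a failing seed is known, it can be checked into a regression test: `simulate(seed)` will keep failing until the bug (here, the missing deduplication) is fixed.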
Traditionally, developers have had to write their software deterministically to do DST. This is a tall order; random numbers, clocks, network latency, hardware faults, and process scheduling must all be managed by the developers. I wrote at length about this in a post last November:
Still, companies such as Dropbox, Resonate, TigerBeetle [$], Polar Signals, FoundationDB, and others have deemed the benefit of DST to outweigh the burden of implementing it.
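Part of that burden is routing every clock read and random draw through something the test controls. A rough Python sketch of the idea follows; `SimClock`, `retry_with_backoff`, and `run` are hypothetical names made up for illustration, not code from any of the companies above. The retry logic never touches the wall clock or a global random generator, so an entire run is determined by its seed and finishes instantly.

```python
import random
from typing import Callable, Optional


class SimClock:
    """Virtual clock: time moves only when the simulation advances it."""

    def __init__(self) -> None:
        self.now = 0.0

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds  # advance virtual time; no real waiting


def retry_with_backoff(
    op: Callable[[], bool],
    clock: SimClock,
    rng: random.Random,
    attempts: int = 5,
    base: float = 0.1,
) -> Optional[float]:
    """Call op() until it succeeds, sleeping with jittered exponential backoff.

    Returns the virtual time at which op() succeeded, or None if every
    attempt failed.
    """
    for attempt in range(attempts):
        if op():
            return clock.time()
        clock.sleep(base * (2 ** attempt) * (1 + rng.random()))
    return None


def run(seed: int) -> Optional[float]:
    """One simulated execution; the clock and all randomness derive from the seed."""
    rng = random.Random(seed)
    clock = SimClock()

    def flaky() -> bool:
        # Simulated unreliable dependency: succeeds roughly 30% of the time.
        return rng.random() < 0.3

    return retry_with_backoff(flaky, clock, rng)


if __name__ == "__main__":
    # The same seed yields the same virtual timeline on every run.
    print(run(7) == run(7))
```

In production the same function would be handed a real clock and a real random source. Doing this for every clock, socket, and scheduler in a codebase is what makes hand-written DST so costly.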
Then Antithesis came out of stealth and completely upended the space. Antithesis built a hypervisor that gives them complete control of a container’s execution. In effect, the hypervisor makes an entire container deterministic; the code inside doesn’t have to be modified since all system calls are managed deterministically. Antithesis’s design is groundbreaking. Developers can now take any containerized code and run it deterministically on Antithesis. DST for the masses.
That’s the promise, at least. In practice, Antithesis is still getting off the ground. There are a surprising number of companies using Antithesis, but I’ve heard more than one complaint about its cost and integration complexity. These two issues are likely related. Integration is hard, so Antithesis has to hand-hold its customers. Hand-holding is expensive, so the bill goes up. I’m told quotes can reach ~$200,000. There’s no doubt in my mind that their product is worth well north of $200,000 for a large company selling infrastructure to hundreds of customers. For early stage startups, it’s a tough sell.
I fully expect Antithesis’s price to come down as integration is simplified. And they’ve already announced an open source giveaway program to alleviate some of the problem.
In the meantime, developers are searching for alternatives. Utkarsh Srivastava is hacking on a deterministic hypervisor for QEMU; Polar Signals modified Go’s runtime to remove non-determinism; and Meta has a (maintenance-mode) project called “Hermit”, which forces deterministic execution in a container. Building a deterministic hypervisor is just too juicy a problem. There will be open source implementations (heck, Hermit already exists!).
Unrelated to any of this, there was some drama last week around WarpStream’s [$] Benthos fork following Redpanda’s purchase, “Redpanda Connect” rebranding, and some license changes. Jay Kreps, CEO and co-founder of Confluent, posted a detailed thread that included this tweet:
Jay’s post got me thinking about Antithesis. I think they should open source their hypervisor.
Yes, it’s a novel innovation, but it seems inevitable that there will be open source implementations. Most of Antithesis’s value isn’t in their hypervisor anyway. The tools to mock other systems (such as AWS), the “Software Explorer” that tests different code paths, and the log aggregation and debugging features are the most valuable parts of Antithesis. And open sourcing their hypervisor could accelerate its development, thereby reducing the integration complexity. This is something Hermit struggled with:
There is a long tail of unsupported system calls that may cause your program to fail while running under Hermit.
Shortly after my post, Antithesis confirmed what I’d been hoping:
This is exactly right! I’m not surprised that they haven’t yet open sourced the hypervisor; there’s still alpha to squeeze from it. Running an open source project is no small task either. But I’m very excited to hear it’s on their minds, and they’re planning to open up their hypervisor.