Hardware Clocks on AWS, SIMD Explored, Distributed Transactions Without XA, and more...

AWS Time Sync gets microsecond accuracy and hardware clocks; I discover AWS Nitro; a brief overview of SIMD; and the Epoxy paper explores distributed transactions without XA.

Nov 20, 2023

AWS Time Sync in Microseconds

Amazon Web Services added microsecond-level accuracy to Time Sync Service last week.

Today, we announced that we improved the Amazon Time Sync Service to microsecond-level clock accuracy on supported Amazon EC2 instances.

Time Sync now runs with double-digit microsecond accuracy. Amazon, somewhat cryptically, explains why you should care:

These clocks can be used to more easily order application events, measure 1-way network latency, increase distributed application transaction speed, and incorporate in-region and cross-region scalability features while also simultaneously simplifying technical designs.
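To make the "1-way network latency" point concrete, here is a minimal sketch in Python. It assumes each host stamps messages with its local clock and knows a worst-case clock error; the function and parameter names are hypothetical, not part of any AWS API.

```python
def one_way_latency_bounds(send_ts_us, recv_ts_us, sender_err_us, receiver_err_us):
    """Bound the true one-way latency given two clocks with known error.

    Timestamps and errors are in microseconds. The measured difference can be
    off by at most the sum of the two clock errors, in either direction.
    """
    measured = recv_ts_us - send_ts_us
    err = sender_err_us + receiver_err_us
    # Latency cannot be negative, so clamp the lower bound at zero.
    return max(0.0, measured - err), measured + err


# Clocks good to 20 microseconds each: the estimate is tight.
print(one_way_latency_bounds(1_000_000, 1_000_250, 20, 20))

# Clocks good to 1 millisecond each: the same measurement is nearly useless.
print(one_way_latency_bounds(1_000_000, 1_000_250, 1_000, 1_000))
```

With millisecond-level error, a 250 microsecond hop disappears into the noise, which is why the tighter bound matters for this kind of measurement.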

Reasoning about time globally is really helpful for distributed systems. I first came across the idea of a global GPS (and atomic) clock in Spanner: Google’s Globally-Distributed Database and its concept of TrueTime.

TrueTime is a highly available, distributed clock that is provided to applications on all Google servers. TrueTime enables applications to generate monotonically increasing timestamps: an application can compute a timestamp T that is guaranteed to be greater than any timestamp T' if T' finished being generated before T started being generated.
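The Spanner paper describes TrueTime as returning an interval rather than a single instant, and waiting out the uncertainty before a timestamp is treated as final. Below is a rough sketch of that idea in Python. The class names and the fixed epsilon are assumptions for illustration; this is not Google's implementation or an AWS API.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    earliest: float
    latest: float


class UncertainClock:
    """Hypothetical clock that reports time as [t - epsilon, t + epsilon]."""

    def __init__(self, epsilon_s, source=time.time):
        self.epsilon_s = epsilon_s  # assumed worst-case clock error, in seconds
        self.source = source

    def now(self):
        t = self.source()
        return Interval(t - self.epsilon_s, t + self.epsilon_s)

    def after(self, ts):
        """True once ts has definitely passed, even allowing for clock error."""
        return self.now().earliest > ts

    def commit_timestamp(self):
        """Pick a timestamp, then wait until it is definitely in the past."""
        ts = self.now().latest
        while not self.after(ts):
            time.sleep(self.epsilon_s / 4)
        return ts
```

The wait lasts roughly twice epsilon, so shrinking the clock error from milliseconds to microseconds directly shortens it.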

You automatically get improved accuracy if you’re using NTP with Time Sync. AWS’s public NTP server (time.aws.com) also received the update. (Aside: Here’s a list of NTP servers from Google, Facebook, Cloudflare, Apple, and more.)

You’ll need to use a precision time protocol (PTP) client to get microsecond granularity. For those unfamiliar, Meta introduces PTP in How Precision Time Protocol is being deployed at Meta. AWS’s post demonstrates instance configuration with its PTP Hardware Clock (PHC).
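For a feel of what a PTP hardware clock looks like from user space, here is a small Python sketch that reads one directly on Linux. It assumes the device is exposed as /dev/ptp0 (check your instance), and it relies on the Linux convention for turning an open file descriptor into a dynamic clock ID. In practice you would let a time daemon discipline the system clock rather than read the device yourself.

```python
import os
import time

CLOCKFD = 3  # constant used by Linux for file-descriptor-based (dynamic) clocks


def fd_to_clockid(fd):
    """Mirror the Linux FD_TO_CLOCKID macro: ((~fd) << 3) | CLOCKFD."""
    return (~fd << 3) | CLOCKFD


def read_phc_ns(device="/dev/ptp0"):
    """Read the hardware clock in nanoseconds. The device path is an assumption."""
    fd = os.open(device, os.O_RDONLY)
    try:
        return time.clock_gettime_ns(fd_to_clockid(fd))
    finally:
        os.close(fd)
```

On a machine without a PHC device, read_phc_ns() raises FileNotFoundError from os.open.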

Time Sync on Nitro

Time Sync’s improved precision is made possible by AWS Nitro.

The new PHC device is part of the AWS Nitro System, so it is directly accessible on supported bare metal and virtualized Amazon EC2 instances without using any customer resources.

Nitro gives AWS building blocks for assembling different hardware profiles easily. One such building block is a hardware clock card. Nitro has been around for years, but I had no idea (I’m team GCP).

My reaction to Nitro was, “So, like, ISA, VLB, and PCI slots?” Conceptually, yes. Concretely, no. There’s a heavy focus on security and isolation.

SIMD Explainer

I spent time this week learning about Single instruction, multiple data (SIMD). Rust’s Beginner’s Guide to SIMD is a good introduction.

SIMD stands for Single Instruction, Multiple Data. In other words, SIMD is when the CPU performs a single action on more than one logical piece of data at the same time. Instead of adding two registers that each contain one f32 value and getting an f32 as the result, you might add two registers that each contain f32x4 (128 bits of data) and then you get an f32x4 as the output.

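Here’s what SIMD looks like in practice (n.b. this is adapted from Brandeis’s CS146a example from 2015; std::simd is still nightly-only):

// simd.rs
// Requires a nightly toolchain: std::simd sits behind the portable_simd feature.
#![feature(portable_simd)]

use std::simd::f32x4;

fn main() {
    // create simd vectors
    let x = f32x4::from_array([1.0, 2.0, 3.0, 4.0]);
    let y = f32x4::from_array([4.0, 3.0, 2.0, 1.0]);

    // lane-wise simd product: one multiply across all four lanes
    let z = x * y;

    // copy the lanes back into a plain array and destructure it using `let`
    let [a, b, c, d] = z.to_array();

    println!("{:?}", (a, b, c, d)); // prints (4.0, 6.0, 6.0, 4.0)
}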

Why is SIMD a big deal? It speeds up vector and matrix operations, which are useful (something I wish I knew when taking Linear Algebra in college), particularly for vector search and running LLMs on CPUs. Elastic talks about its usage in Accelerating vector search with SIMD instructions:

At the heart of Lucene's vector search implementation lie three low-level primitives used when finding the similarity between two vectors: dot product, square, and cosine distance.
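To show the shape of a SIMD dot product, here is a scalar model in Python. It is not real SIMD (plain Python runs each lane one at a time), and the four-lane width and function name are assumptions for illustration. What it mirrors is the structure: per-lane accumulators, one multiply-add per chunk, a final horizontal sum, and a scalar tail for leftovers.

```python
LANES = 4  # assumed vector width, e.g. four f32 values in a 128-bit register


def dot_lanes(a, b):
    """Dot product laid out like a 4-lane SIMD loop (scalar simulation)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")

    acc = [0.0] * LANES                  # one running sum per lane
    full = len(a) - len(a) % LANES       # largest multiple of LANES that fits

    for i in range(0, full, LANES):
        # On real hardware this whole line is one multiply-add instruction.
        acc = [acc[k] + a[i + k] * b[i + k] for k in range(LANES)]

    total = sum(acc)                     # horizontal reduction across lanes
    for i in range(full, len(a)):        # scalar tail for the leftover elements
        total += a[i] * b[i]
    return total


print(dot_lanes([1, 2, 3, 4, 5], [5, 4, 3, 2, 1]))  # 35.0
```

The speedup on real hardware comes from the main loop doing a quarter as many instructions (or fewer with wider registers) for the same work.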

Video games and DSP have benefited from SIMD for years.

…nearly every modern video game console since 1998 has incorporated a SIMD processor somewhere in its architecture

There are all kinds of other applications, too. Common array operations like sorting, filtering, and searching are all optimizable with SIMD.

Paper Highlight: Epoxy

I had the chance to talk to Peter Kraft and Qian Li earlier this year. At the time, we spoke about Apiary, a transactional function-as-a-service (FaaS) platform. Since then, they’ve published a VLDB ’23 paper, Epoxy: ACID Transactions Across Diverse Data Stores.

Epoxy is a protocol for providing transactions across heterogeneous data stores. The protocol solves the same problem as X/Open XA with two improvements:

Epoxy does not require the data stores themselves to implement any protocol.
Unlike X/Open XA, Epoxy provides transactional isolation, not just atomicity.

Epoxy requires a “shim” in front of each data store. The shims manage transactional isolation and work with a transaction coordinator to ensure atomic commits.
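Here is a toy Python sketch of the shim-plus-coordinator idea, heavily simplified. It assumes each shim tags writes with a transaction ID and hides them until the coordinator records that transaction as committed. The class and method names are hypothetical, and the sketch leaves out conflict detection, aborts, and cleanup, so treat it as an illustration of the general shape rather than Epoxy's actual protocol.

```python
class Coordinator:
    """Toy stand-in for the primary database that decides what has committed."""

    def __init__(self):
        self.next_txn_id = 1
        self.committed = set()

    def begin(self):
        txn_id = self.next_txn_id
        self.next_txn_id += 1
        # Snapshot: the set of transactions that had committed at begin time.
        return txn_id, frozenset(self.committed)

    def commit(self, txn_id):
        # A single step that makes this transaction's writes visible everywhere.
        self.committed.add(txn_id)


class Shim:
    """Toy shim in front of one key-value store; keeps tagged versions."""

    def __init__(self):
        self.versions = {}  # key -> list of (txn_id, value), oldest first

    def write(self, txn_id, key, value):
        self.versions.setdefault(key, []).append((txn_id, value))

    def read(self, txn_id, snapshot, key):
        # Newest version written by this transaction or by one in its snapshot.
        for writer, value in reversed(self.versions.get(key, [])):
            if writer == txn_id or writer in snapshot:
                return value
        return None
```

Because visibility in every shim hinges on one commit record in the coordinator, writes to several stores become visible together or not at all.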

Murat Demirbas wrote a good blog post on the transaction protocol.

Epoxy isn’t without tradeoffs, though. All systems interacting with an Epoxy’d table must do so through the Epoxy shim. At least one system—the coordinator—must provide transactions; this is usually an RDBMS like PostgreSQL or MySQL. Transactions also require some client code:

# Pseudocode: epoxy, pg, and mongo are handles set up elsewhere;
# hotelId and res inside the SQL strings stand in for bound parameters.
def reserve(hotelId, customerData):
  ctxt = epoxy.beginTransaction()

  # Check room availability in Postgres.
  res = pg.query("SELECT avail FROM Hotels WHERE hotel = hotelId")

  if res == 0:
    epoxy.commitTransaction(ctxt)
    return False  # No room available.

  # Update availability in Postgres.
  pg.update("UPDATE Hotels SET avail = res - 1 WHERE hotel = hotelId")

  # Make a reservation in MongoDB.
  epoxy.update(
    context = ctxt,
    secondary = mongo,
    key = hotelId,
    record = customerData)

  epoxy.commitTransaction(ctxt)
  return True

Client code like this is pretty standard, but it’s still something to be aware of.

Peter and Qian have founded a startup, DBOS. They are working on Operon, “A Typescript framework built on databases.” Operon’s API looks interesting; it’s got a lot of decorators.

More Awesome Infrastructure

Keep up with new projects as they’re added to the awesome-infra GitHub repo. New projects and startups welcome! See CONTRIBUTING.md and the PR template.

AutoMQ - Cloud native implementations of Kafka and RocketMQ.
Kafka - An open-source distributed event streaming platform.

I occasionally invest in infrastructure startups. Companies that I’ve invested in are marked with a [$] in this newsletter. See my LinkedIn profile for a complete list.