
Why we built mini-Elixir

Anthony Accomazzo
@accomazzo
13 min read

Often, platforms let users supply code to customize behavior. But sandboxing user code for safe execution is a complex challenge.

At Sequin, we needed a solution for our new transforms feature, which lets users write functions that transform and route messages streaming from their database. The feature required a sandboxing solution that balanced three critical requirements:

  • Performance: With potential loads of tens of thousands of transactions per second, execution speed directly impacts our infrastructure costs and scalability. At 10ms per execution, processing 50k transactions/second would require 500 cores dedicated just to transformations!
  • Security: The entry-level tier of our cloud solution is multi-tenant. So we need strong protection against malicious code that might attempt to gain privileged access or consume unbounded resources in that environment.
  • User experience: We wanted users to create and edit functions directly in our web console without dealing with build pipelines or complex deployment processes.

Beyond these technical requirements, we face practical constraints as a startup: this is a new feature, so we just need a good-enough starting point.

This post explores our journey through various sandboxing approaches, from cloud functions to custom interpreters. I'll share our decision-making process as we searched for the optimal balance of performance, security, and usability.

Sandboxing approaches overview

Before diving into the specific solutions we explored, it helps to understand the general categories of approaches available for sandboxing user code:

Categories of solutions

  1. VM-based sandboxes: These solutions run code in isolated virtual environments, including cloud functions (AWS Lambda), containers (Docker), and lightweight VMs (Firecracker, WASM).
  2. Embedded languages: Languages specifically designed to be embedded within other applications, with built-in safety mechanisms (Lua, Starlark).
  3. Custom interpreters: Creating restricted subsets of existing languages by limiting available functions, operators, and constructs. (Elixir has some prior art for this.)

Evaluation criteria

We evaluated each solution against five key criteria:

  1. Performance: How quickly could we execute transformations?
  2. Runtime complexity: How much operational overhead would the solution introduce, both in our cloud and when developers run Sequin on their own?
  3. Implementation complexity: How difficult would it be for our small team to build and maintain?
  4. Safety complexity: How difficult would it be for us to make the solution safe?
  5. User experience: How easy would it be for our users to write and debug their code?

Summary of findings

Solution category  | Performance   | Implementation complexity | Runtime complexity | Safety complexity | UX
VM-based           | Good to great | High                      | Medium to high     | Low               | Good to great
Embedded languages | Great         | Medium                    | Low to medium      | Medium            | Mixed
Custom interpreter | Excellent     | Medium                    | Low                | High              | Great

Each approach offered distinct advantages and challenges, which we'll explore in detail throughout this post. Our evaluation process considered not just the technical merits of each solution, but also how well it aligned with our team's expertise and resources.

VM-based solutions

Virtual machine-based solutions are perhaps the most widely-used approach to sandboxing. A VM offers robust isolation by design.

While this isolation provides strong security guarantees, it comes with trade-offs in complexity and performance.

Cloud functions

Cloud functions were the first thing we considered. We could use a service like AWS Lambda to host user code. This has a lot of benefits:

  • Support for a broad range of languages
  • Very safe and secure
  • No noisy neighbor issues

But it’s complex:

  • We now have to manage function lifecycle in another service (e.g. AWS).
  • To support running Sequin in multiple clouds, we'd have to add cloud-specific adapters (e.g. GCP Cloud Run functions).
  • And what about when running Sequin locally or in CI? We'd presumably need a non-cloud solution as well.

For performance, we have to make a network hop for every execution. A warm Lambda in the same region can take 1-10ms to execute.

A 10ms execution would normally be unacceptable. But cloud functions are unique in that we can run them in hyper-parallel (executions are not tying up one of our CPU cores). Because we're using Elixir, the I/O to the cloud function rounds to ~free.
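To sketch why that is (illustrative only: invoke_transform_lambda/1 below is a hypothetical stand-in for a real Lambda invocation over HTTP or an AWS SDK, not a Sequin function), a batch of messages can be fanned out concurrently, so the per-call latency largely overlaps rather than adds up:

    defmodule LambdaFanout do
      # Hypothetical stand-in for an actual Lambda invocation (e.g. via ExAws or plain HTTP).
      defp invoke_transform_lambda(message) do
        Process.sleep(5)                        # simulate ~5ms of warm-Lambda latency
        Map.put(message, "transformed", true)
      end

      # Fan a batch of messages out to Lambda concurrently; BEAM processes are cheap,
      # so the network waits overlap instead of tying up CPU cores.
      def transform_batch(messages) do
        messages
        |> Task.async_stream(&invoke_transform_lambda/1, max_concurrency: 500, timeout: 5_000)
        |> Enum.map(fn {:ok, transformed} -> transformed end)
      end
    end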

Docker/Firecracker

We could run user code inside a VM or container. Lambda itself uses Firecracker under the hood; Docker also provides a strong level of isolation.

Friend of Sequin, @abc3erl, spiked a great solution here. He demonstrated that an Elixir GenServer can boot a Docker container on startup and open a TCP connection to it. Our application could then interface with the GenServer, which would forward requests and responses over the port.
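The shape of that approach looks roughly like the sketch below (module name, container image, and wire protocol are all illustrative, not his actual code; it assumes Jason for JSON and a runner inside the container that speaks length-prefixed JSON over TCP):

    defmodule DockerTransformRunner do
      use GenServer

      def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

      def transform(pid, payload), do: GenServer.call(pid, {:transform, payload})

      @impl true
      def init(opts) do
        port = Keyword.fetch!(opts, :port)

        # Boot the runner container (image name is hypothetical). In practice you'd
        # wait/retry until the container is accepting connections.
        {_output, 0} =
          System.cmd("docker", [
            "run", "-d", "--rm", "--memory", "128m",
            "-p", "#{port}:#{port}", "transform-runner:latest"
          ])

        {:ok, socket} = :gen_tcp.connect(~c"localhost", port, [:binary, packet: 4, active: false])
        {:ok, %{socket: socket}}
      end

      @impl true
      def handle_call({:transform, payload}, _from, %{socket: socket} = state) do
        :ok = :gen_tcp.send(socket, Jason.encode!(payload))
        {:ok, reply} = :gen_tcp.recv(socket, 0, 5_000)
        {:reply, Jason.decode!(reply), state}
      end
    end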

The bench shows a solid 100-150μs runtime (un-optimized). That's roughly 6,600 invocations per second per core, so just 8 cores to handle 50k database TPS.

We like this solution a lot. Running user code in an isolated VM gives us great guarantees.

But it comes at a cost: we have to manage the lifecycle of all these VMs. Every user would get their own VM with their code/runtime on it. On a multi-tenant instance, we'd need to be precise about lifecycle, otherwise we'd risk e.g. eating all the machine's memory.

And it might make Sequin harder to run in other environments. Would users run Sequin with a Docker sidecar for transforms in CI?

There are good solutions to these problems, but they feel like they'll take some time to get right.

WebAssembly

WebAssembly (WASM) is a portable bytecode format for executable programs. It is designed for running sandboxed code with safety and security guarantees.

WASM is similar to the VM solution. We can completely control the execution context of a WASM program: memory utilization, network/filesystem access, as well as the total amount of compute resources it can consume.

We'd use the great Wasmex library for Elixir to run WASM programs. We could boot a pool of WASM instances for each pipeline. With a warm pool, we'd round-robin transform requests across it.
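The calling convention looks roughly like this (a sketch assuming a transform.wasm module that exports a simple numeric transform function; real transforms would shuttle JSON through the module's linear memory instead):

    # Boot one WASM instance and call an exported function on it.
    {:ok, instance} = Wasmex.start_link(%{bytes: File.read!("transform.wasm")})
    {:ok, [result]} = Wasmex.call_function(instance, "transform", [42])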

Managing a pool of WASM programs felt a bit easier than a stable of Docker containers, as their lifecycle is managed by our BEAM VM. We also figured memory consumption would be lower than a Docker-based solution.

WASM programs execute relatively fast. We were clocking 1-3ms to execute transform functions in JavaScript compiled to WASM. We assumed Go and Rust programs would be faster.

However, this solution is complex. The build pipelines for each language are different. We'd either support one language (e.g. JavaScript) or need to offer tooling for our users to create and upload WASM bundles in their preferred language.

While we're familiar with Docker, running WASM programs requires its own set of knowledge. One example is how you pass data into and out of WASM programs. With Docker, we'd stream data back and forth over a TCP connection, something we have a lot of experience with. With a WASM program, you're passing memory pointers back and forth: your external program writes e.g. parameters into the WASM program's memory space, then calls the program with an integer pointer to that memory. Your WASM program similarly writes its output to memory and returns a pointer to it.

Given we have to manage the lifecycle of these WASM programs, the implementation complexity feels similar to Docker.

WASM + interpreter

WASM comes with a couple of complexities. One is the build step, where we'd have to compile user code to a WASM binary. The other is the runtime step, where we now have to manage all these WASM programs in our system (one for each pipeline).

A way around that complexity is to create a single WASM program. Instead of hard-coding a user transformation, that program would accept the user's code and parameters and eval them.

This is a little less efficient, as the program you're eval'ing is not pre-compiled. But most of the overhead of running transforms in this way is likely from the serialization/deserialization of data to/from the WASM sandbox anyway.

I quite like this approach! You get the safety and security of the WASM sandbox, while simplifying implementation. I like the idea that we just need to ship a single binary with our application that can run all our transforms.

You just need to pick a language that an interpreter compiled to WASM can efficiently evaluate on the fly.

We experimented with using this approach to eval Starlark, using a Go interpreter. We implemented a small program that exported a function that took a Starlark program (string) and named parameters (JSON). Performance was good, on the order of 300-500us.

Deno

Deno is an alternative server-side runtime for JavaScript by the creator of Node.js. It was built from the ground up with sandboxing: you can run Deno instances that don't have access to the filesystem or network.

Deno the company also offers a hosted solution for running user code, comparable to AWS Lambda.

If you can pick any language to let users write code on your service, you're going to pick JavaScript.

We didn't spend a ton of time with this solution, but briefly explored it. We'd run a cluster of Deno VMs that could execute JavaScript functions.

Our performance testing showed execution times in the 200-500μs range for simple transforms.

However, running and managing these Deno instances introduces similar complexity to the Docker approach - we'd need to handle instance lifecycle, resource allocation, and keep warm pools of VMs for performance.

We could use Deno's hosted solution for our own cloud. But that means a fork in how Sequin runs locally vs in our cloud. And at that point, why not just use cloud functions and support more languages?

Embedded language solutions

After evaluating VM-based approaches, we realized that while they offered excellent security, the implementation and operational complexity might be challenging for our small team.

This led us to explore a different category of solutions: embedded languages specifically designed for safe execution within host applications. These languages promised to reduce complexity while maintaining reasonable security and performance characteristics.

Starlark

Starlark is a dialect of Python. It was built by Google for the Bazel build system.

Starlark is an embedded language that deliberately lacks functionality like file or network access. While it has all the standard library functionality you need to manipulate strings, maps, and lists, it lacks things like recursion. This clips Starlark's power, making it hard to write hostile code that consumes unbounded resources.

Meta maintains a Rust crate that evaluates Starlark code.

Elixir has a great library for interfacing with Rust code called Rustler. With a little effort, some vibes, and a lot of LLM support, we were able to complete a proof-of-concept for evaluating Starlark code from Elixir.

We could send a string of Starlark code and JSON-encoded parameters to the Rust program. The Rust program would eval the Starlark code, binding our parameters, and return a JSON-encoded response.

The Rust crate even makes it easy to define functions in Rust that are made available to the Starlark eval context. So, for example, we could write datetime helper functions that our users could use in their Starlark code. 1
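On the Elixir side, the proof of concept looks roughly like the sketch below (module name, crate name, and the :sequin otp_app are illustrative, not our actual code). Rustler compiles and loads the Rust crate, and the stub body is replaced by the NIF at load time:

    defmodule StarlarkEval do
      use Rustler, otp_app: :sequin, crate: "starlark_eval"

      # Takes Starlark source and JSON-encoded parameters; returns a JSON-encoded result.
      def eval(_starlark_code, _params_json), do: :erlang.nif_error(:nif_not_loaded)
    end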

This solution is very promising: Python is popular. Starlark provides a lot of safety guarantees. And it's a much lighter-weight solution than running a VM.

We were seeing about 500us execution time for a pretty simple Starlark transformer. We assume a lot of that is serializing JSON to go back and forth.

The downsides were that our proof-of-concept was not ready for production. And getting it production-ready likely means pulling in a Rust resource, as we were (read: Claude was) doing a lot of crazy stuff with memory.

Lua

Lua is another language designed to be embedded. Unlike Starlark, it's not safely sandboxed by default; you need to remove the toys. 2

But many people have sandboxed Lua, so there's prior art for this.

Erlang has a library, Luerl, that implements Lua natively (written by Robert Virding, one of the co-inventors of Erlang, no less!). There is an Elixir library with bindings for it.

What's great about this approach is that it's very fast, the fastest so far. That's because Luerl implements Lua natively in Erlang, so we're running inside the host BEAM VM. No serialization, no data over the wire.

We were seeing performance in the 10s of microseconds.
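Calling into Luerl from Elixir is a one-liner (a minimal illustration; the exact return shape of :luerl.do/2 varies between Luerl versions, so we simply inspect it here):

    # Evaluate a Lua chunk directly on the BEAM, no external process involved.
    state = :luerl.init()
    :luerl.do("return 1 + 2", state) |> IO.inspect()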

However, Luerl needs some work on its errors and stacktraces. They can be inscrutable. We saw some of these in our testing.

Because few programmers are familiar with Lua, we know we need a good "playground" experience to develop transforms. Cryptic errors would make that difficult.

In addition, while the benefit is that this solution runs in our host VM, that also comes with a drawback: we'd need to build in some of our own sandboxing. For example, we'd need to account for users abusing memory usage.

Custom interpreter solutions

Our exploration of embedded languages showed promising results, particularly for performance.

However, we began to wonder if we could push the performance envelope even further while simplifying our implementation. This led us to explore our final approach: creating a custom restricted interpreter based on our host language.

Restricted AST interpreter ("Mini-Elixir")

How might you create a safe sandbox for user code written in your host language?

  1. The user writes code in the host language, in our case Elixir.
  2. We parse their code (string) and turn it into an abstract syntax tree (AST). In Elixir, this means we now have their code represented as a bunch of nested tuples (just data).
  3. We traverse these tuples and only allow calls to functions and operators that exist in a whitelist we've created. If the code calls anything outside that whitelist, we treat it as a syntax error and refuse to compile or run the program. (A minimal sketch of this idea appears below.)

By controlling which functions and operators a user can and can't use, you can create your own sandbox language. In our so-called "Mini-Elixir", users would have access to a tight subset of Elixir for their transforms. Naturally, they wouldn't have access to interfaces for the network or filesystem.

To write transforms, one doesn't need much. Just basic manipulation of strings, maps, and lists.
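Here's a minimal sketch of the idea (illustrative, not Sequin's actual implementation; the whitelisted modules, operators, and forms are arbitrary examples). We parse with Code.string_to_quoted/1, walk the AST with Macro.prewalk/2, and collect anything that isn't on the whitelist:

    defmodule MiniElixirSketch do
      # Structural AST forms Elixir uses to build expressions.
      @allowed_forms [:__block__, :__aliases__, :fn, :->, :when, :{}, :%{}]

      # Operators and local calls the user may use.
      @allowed_locals [:+, :-, :*, :/, :==, :!=, :and, :or, :not, :=]

      # Modules the user may call into.
      @allowed_modules [Map, String, Enum, Integer]

      def validate(source) when is_binary(source) do
        with {:ok, ast} <- Code.string_to_quoted(source) do
          {_ast, violations} = Macro.prewalk(ast, [], &check_node/2)

          case violations do
            [] -> {:ok, ast}
            _ -> {:error, {:forbidden, Enum.uniq(violations)}}
          end
        end
      end

      # Remote call on an Elixir module, e.g. Map.get(record, "id").
      defp check_node({{:., _, [{:__aliases__, _, parts}, _fun]}, _, _} = node, acc) do
        mod = Module.concat(parts)
        if mod in @allowed_modules, do: {node, acc}, else: {node, [mod | acc]}
      end

      # Remote call on an Erlang module, e.g. :os.cmd/1 -- never allowed here.
      defp check_node({{:., _, [mod, fun]}, _, _} = node, acc) when is_atom(mod) do
        {node, [{mod, fun} | acc]}
      end

      # The dot node itself and whitelisted structural forms pass through;
      # prewalk still visits their children.
      defp check_node({:., _, _} = node, acc), do: {node, acc}
      defp check_node({form, _, _} = node, acc) when form in @allowed_forms, do: {node, acc}

      # Variables look like {name, meta, context_atom} and are always allowed.
      defp check_node({name, _, ctx} = node, acc) when is_atom(name) and is_atom(ctx), do: {node, acc}

      # Local calls and operators must be on the whitelist.
      defp check_node({name, _, args} = node, acc) when is_atom(name) and is_list(args) do
        if name in @allowed_locals, do: {node, acc}, else: {node, [name | acc]}
      end

      # Literals, lists, two-element tuples, etc.
      defp check_node(node, acc), do: {node, acc}
    end

    # Usage (illustrative):
    {:ok, ast} = MiniElixirSketch.validate(~s|Map.get(record, "amount") * 2|)
    {result, _binding} = Code.eval_quoted(ast, record: %{"amount" => 21})
    # result => 42

    MiniElixirSketch.validate(~s|File.read!("/etc/passwd")|)
    # => {:error, {:forbidden, [File]}}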

This approach is relatively simple to build and very simple to run. And because it's in our host language, it's fast. We don't need to serialize/deserialize to/from another runtime. We can run user-generated transform functions in microseconds.

A single core could handle 100k+ transformations per second, making this by far the most efficient solution.

The risk with this approach is that every operator you allow opens up a new attack surface.

For example, you might think it's innocuous to allow the << and >> (bitstring) operators in Elixir. "I'll let users construct binaries – what's the worst that can happen?"

Well, you can use <</>> to construct very large binaries with very few keystrokes. This is a bitstring that's an extraordinary 12.5 exabytes:

<<1::99999999999999999999>>

The same goes for the range operator (..): ranges let a user get into cardinalities far beyond the parameters you've bound for them.

Similarly, we can't let users write arbitrarily large programs or create giant list literals. Every operator, even [ and ], has to come with its own defenses.
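As an illustration of what those defenses can look like (the limits below are made-up numbers, not Sequin's), you can cap the number of AST nodes and reject oversized bitstring size specifiers before ever evaluating the code:

    defmodule OperatorGuards do
      @max_ast_nodes 2_000            # hypothetical cap on program size
      @max_bitstring_bits 1_000_000   # hypothetical cap on any <<_::size>> specifier

      def check(ast) do
        {_ast, {node_count, oversized?}} =
          Macro.prewalk(ast, {0, false}, fn node, {count, oversized?} ->
            too_big? =
              match?({:"::", _, [_, size]} when is_integer(size) and size > @max_bitstring_bits, node)

            {node, {count + 1, oversized? or too_big?}}
          end)

        cond do
          node_count > @max_ast_nodes -> {:error, :program_too_large}
          oversized? -> {:error, :bitstring_too_large}
          true -> :ok
        end
      end
    end

    # OperatorGuards.check(Code.string_to_quoted!("<<1::99999999999999999999>>"))
    # => {:error, :bitstring_too_large}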

Solution comparison

After exploring various approaches, we compiled a comprehensive comparison to help us make our final decision:

Solution                       | Execution time | Implementation complexity | Runtime complexity | User experience | Security
Cloud Functions                | 1-10ms         | High                      | Low                | Great           | Excellent
Docker/Firecracker             | 100-150μs      | High                      | High               | Good            | Excellent
WASM                           | 1-3ms          | High                      | High               | Complex         | Very good
WASM + Interpreter             | ~1ms           | Medium                    | Low                | Good            | Very good
Deno                           | 200-500μs      | Medium-high               | Medium             | Excellent       | Very good
Starlark                       | ~500μs         | Medium                    | Low                | Good            | Very good
Lua                            | 10-100μs       | Low                       | Low                | Not great       | Moderate*
Restricted AST ("Mini-Elixir") | <10μs          | Medium                    | None               | Very good       | Good**

* Requires additional sandboxing measures
** Subject to careful operator selection and limits

The fastest solutions generally required more custom work to ensure proper security boundaries, while the most secure solutions (like cloud functions) came with performance penalties.

Our decision

After careful consideration, we realized multi-tenant code execution is the exception for us, not the rule.

Sequin can run in your local development environment, or in CI, or in a single-tenant deployment. In those contexts, the security requirements and trade-offs are very different from a multi-tenant deployment.

So, we decided we could implement a blended approach:

As a first step, we’ve implemented our own restricted AST interpreter—the "Mini-Elixir" approach. This works great locally or in single-tenancy deployments.

For our multi-tenant cloud, we’ll add an additional layer of security around “Mini-Elixir.” We will either:

  • Use Lambda: the latency of Lambda won't impact performance, as it won't be the limiting factor of a multi-tenant deployment anyway. (Multi-tenant deployments have other throughput limitations.)
  • Use Docker: we are most familiar with Docker vs other VM solutions. We could provision restricted Docker VMs for running user code in our cloud.

Importantly, users of Sequin wouldn't need to worry about Lambdas or Docker when running Sequin locally or self-hosting it. We like that this option gives us a simple runtime when Sequin is running in other contexts where we don't need that extra security.

Why start with Mini-Elixir?

  1. Unmatched performance: At <10μs per execution, this approach offers performance that's orders of magnitude better than alternatives.
  2. Operational simplicity: Unlike VM-based solutions, there's no need to manage container lifecycles, warm pools, or cross-process communication protocols.
  3. Reduced security requirements in single-tenant deployments: In a single-tenant deployment, we can limit access at the machine level.

What makes Mini-Elixir difficult in multi-tenant deployments?

To address security concerns in a multi-tenant deployment, we have two primary safeguards:

  1. Strict operator whitelist: We can carefully select which operators and functions to expose, excluding anything that could lead to unbounded resource consumption or worse.
  2. Process isolation: By evaluating user code in spawned processes with strict heap size limits and timeouts, we can contain potential resource abuse (a sketch follows below).
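A minimal sketch of that second safeguard (the heap cap and timeout are made-up numbers, and this is illustrative rather than Sequin's actual code): evaluate the validated AST in a short-lived, monitored process with a max_heap_size flag and a hard deadline.

    defmodule IsolatedEval do
      @max_heap_words 5_000_000   # roughly 40 MB on a 64-bit BEAM (hypothetical cap)
      @timeout_ms 50              # hypothetical per-execution budget

      def run(ast, record) do
        parent = self()

        {pid, ref} =
          spawn_monitor(fn ->
            # If the heap grows past the cap, the runtime kills this process.
            Process.flag(:max_heap_size, %{size: @max_heap_words, kill: true, error_logger: false})
            {result, _binding} = Code.eval_quoted(ast, record: record)
            send(parent, {:result, self(), result})
          end)

        receive do
          {:result, ^pid, result} ->
            Process.demonitor(ref, [:flush])
            {:ok, result}

          {:DOWN, ^ref, :process, ^pid, reason} ->
            {:error, {:killed, reason}}
        after
          @timeout_ms ->
            Process.exit(pid, :kill)
            Process.demonitor(ref, [:flush])
            {:error, :timeout}
        end
      end
    end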

Even so, we're aware of attack vectors for denial of service. For example, large binaries remain a theoretical attack vector: they're stored outside the process heap in Elixir, so per-process heap limits don't catch them. Someone could upload a function that creates a very large string and takes down one of our boxes.

These denial-of-service attacks are bad. But as an early cloud offering, we already face many such vectors, both known and unknown.

However, the primary motivation for using a VM to run code in our multi-tenant offering is to add an additional layer of security. For our single-tenant offering, that additional layer is the machine itself, running on an isolated instance in our cloud. That layer doesn't exist in the multi-tenant model. While we feel good about restricting malicious code's access to the filesystem and network, we don't like that there's only a single layer of protection.

Mini-Elixir in action

Here is what writing a transform in our playground looks like. Elixir code is sent to the backend via a websocket, where it is validated, compiled, and executed in real-time:

(Video: a short demo of writing and running a transform in the Sequin console.)

Conclusion

We think we're taking the pragmatic approach for rolling out our transforms. We're excited to offer a solution that's this performant.

And we know this is likely just the first solution. Rolling out our first version of transforms will allow us to collect feedback and iterate.

Custom code is one of our biggest new features since launching Sequin. With embedded code, users now have:

  • Complete transformation control: Write custom functions that reshape data exactly as needed for downstream systems
  • Custom routing: Apply custom logic to determine where different messages should flow based on their content (e.g. which Kafka topic to write to)
  • Advanced filtering: Precisely define which messages matter for specific use cases, reducing noise and processing load (e.g. compare NEW and OLD values in a change)

All this comes with performance that scales to hundreds of thousands of operations per second on modest hardware—something that would be prohibitively expensive with many other approaches.

Try Sequin in a quickstart to see what it's all about.


  1. Fun fact: Starlark does not have any datetime functionality, as datetime is non-deterministic and determinism is a language goal.
  2. The Redis CVE from 2022 was a result of Redis inadvertently leaving Lua's package accessible – this let attackers use package.loadlib to load arbitrary libraries and execute system commands.