wasmCloud and .NET Orleans: Kindred Spirits

My list of projects and technologies to get into the lab for learning and experimentation is never-ending. However, this holiday gave me just enough time to take another look at Microsoft Orleans, a technological kindred spirit with wasmCloud.

If you take a look at the Orleans documentation, you'll see that the first sentence is: "Orleans is a cross-platform framework for building robust, scalable distributed applications." At the highest level, this matches up directly with the mission statement of wasmCloud. While both wasmCloud and Orleans are trying to solve the same problem and give developers a better way to build distributed applications, they differ in some fundamental ways in their implementation.

The unit of deployment within Orleans is called a grain¹, and the host runtimes responsible for scheduling grains are called silos. In its default mode, a grain acts like a distributed object instance. A grain is identified by the combination of its type and its unique key and, most importantly, a grain can only ever exist in one silo within a cluster. The Orleans runtime is responsible for deciding where to place grains within silos, and a grain can be located within a given silo using an O(1) algorithm through the use of a distributed hash map.
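To make the "type plus key, one silo, O(1) lookup" idea concrete, here is a minimal Rust sketch. It is not how Orleans implements its directory: `GrainId`, `GrainDirectory`, and the round-robin placement are hypothetical names and simplifications, and an ordinary in-process `HashMap` stands in for the distributed hash map.

```rust
use std::collections::HashMap;

/// A grain's identity: its type name plus its unique key.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct GrainId {
    pub grain_type: String,
    pub key: String,
}

impl GrainId {
    pub fn new(grain_type: &str, key: &str) -> Self {
        Self { grain_type: grain_type.to_string(), key: key.to_string() }
    }
}

/// Hypothetical directory: maps each active grain to the one silo hosting it.
pub struct GrainDirectory {
    silos: Vec<String>,
    locations: HashMap<GrainId, String>,
}

impl GrainDirectory {
    pub fn new(silos: Vec<String>) -> Self {
        Self { silos, locations: HashMap::new() }
    }

    /// Returns the silo hosting the grain, placing the grain on first use.
    /// The lookup is an average O(1) hash-map access. Placement is naive
    /// round-robin (an assumption made to keep the sketch short).
    pub fn locate_or_place(&mut self, id: &GrainId) -> Option<String> {
        if let Some(silo) = self.locations.get(id) {
            return Some(silo.clone());
        }
        if self.silos.is_empty() {
            return None; // no silo available to host the grain
        }
        let silo = self.silos[self.locations.len() % self.silos.len()].clone();
        self.locations.insert(id.clone(), silo.clone());
        Some(silo)
    }
}
```

Asking for `GrainId::new("User", "bob")` twice returns the same silo both times, which is the single-activation guarantee in miniature.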

This differs from wasmCloud, where a WebAssembly component is a stateless thing that can (and should) exist in multiple host runtimes across a cluster (which wasmCloud calls a lattice). Both wasmCloud and Orleans are inspired by the actor model.

Let's say you have a grain called User and some client somewhere has requested (and therefore activated) the User grain with the key of bob. Somewhere within an Orleans cluster, a .NET object instance is actually sitting in memory (along with clever proxies). If I want to get the first name of the bob User, I can request the FirstName property of the object after getting a grain reference. This way of programming stateful objects closely matches existing .NET developers' mental models of how the world works, and so transitioning from monolithic development to distributed via Orleans is relatively painless.

The scale tradeoff here is that in Orleans, multiple requests for the same key of an object queue up behind a single-threaded execution model. In stateless wasmCloud, it doesn't matter which thing you're acting on, because the mental model is more like serverless: you activate compute on demand, pass in parameters, get results, and can then dispose of your compute. In Orleans, concurrent requests for the bob user's compute are processed one at a time, while in wasmCloud the number of concurrent requests supported is exactly the number of instances of the User component running in the lattice, regardless of whether a unique key is at play. Neither of these approaches is objectively better or worse than the other: it boils down to stateless versus stateful.
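The difference can be sketched in a few lines of Rust. This is an illustration of the two execution models only, not of either runtime: `UserMsg`, `activate_user`, and `add_points_stateless` are made-up names, and a thread with a channel stands in for a grain activation and its request queue.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// Messages a hypothetical `User` grain understands.
pub enum UserMsg {
    AddPoints(u32),
    GetPoints(Sender<u32>),
}

/// Stateful style: one activation per key, draining its mailbox one message
/// at a time. Concurrent callers queue up behind each other.
pub fn activate_user() -> (Sender<UserMsg>, JoinHandle<()>) {
    let (tx, rx) = channel::<UserMsg>();
    let handle = thread::spawn(move || {
        let mut points = 0u32; // state lives in memory, owned by one thread
        for msg in rx {
            match msg {
                UserMsg::AddPoints(n) => points += n,
                UserMsg::GetPoints(reply) => {
                    let _ = reply.send(points);
                }
            }
        }
    });
    (tx, handle)
}

/// Stateless style: nothing is remembered between calls, so any number of
/// identical instances can serve requests in parallel.
pub fn add_points_stateless(current: u32, delta: u32) -> u32 {
    current + delta
}
```

In the stateful version the caller never sees a lock, because the mailbox serializes access for it. In the stateless version the current value has to come from somewhere else, which is the state-management question discussed later.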

The next difference between the two involves the capability model, reducing coupling, and eliminating boilerplate. Where wasmCloud uses high-level contracts satisfied at runtime by capability providers, Orleans has a concept called grain services, which serve a similar purpose. A grain service is a .NET class that implements a specific interface. If you have an existing investment in .NET code that you want to expose to grains, this works well. wasmCloud lets you expose functionality to WebAssembly components, but because of the sandboxing and portability we already get with WebAssembly, the language a capability provider is written in doesn't matter.

For many people coming from an OOP background, stateful programming feels more natural while a lot of folks with a more functional programming background (or just a lot of distributed systems exposure) tend to gravitate toward the stateless model. wasmCloud gives components state by allowing them to access things like key-value stores or relational databases via the capability provider model. This distills down to the Orleans developer being able to "just store state as member variables" and not worry about how it works versus wasmCloud developers being more deliberate in their choices for state management.

While exploring Orleans, I found the Adventure Game sample which was a ton of fun and brought back a lot of fond memories from my days as a MUD developer. The adventure game sample illustrates very clearly what it looks like to build stateful components for a distributed application framework.

Take a look at some of this code from the sample's Room grain:

Task IRoomGrain.Enter(PlayerInfo player)
{
    _players.RemoveAll(x => x.Key == player.Key);
    _players.Add(player);
    return Task.CompletedTask;
}

The _players variable is initialized as follows:

private readonly List<PlayerInfo> _players = new();

Nowhere in this code is any reference to the notion that this is a distributed instance that is scheduled by a runtime and started on a cluster node behind a networking proxy to allow for targeted invocations.

The distributed computing dream

This is how distributed computing should work. Developers should focus on their business logic and leave the distribution, clustering, resilience, scale, and everything else to a platform that specializes in that sort of thing.

That said, I can actually think of different use cases where people might want to use different models. Sometimes deploying a horde of stateless units of compute makes sense, but other times maybe it's just more appropriate (like with the "Adventure" game) to treat instances of "things" as distributed, globally identified objects.

This got me wondering about whether wasmCloud could support both models 🤔. What would it look like if we could have regular, highly resilient, scalable, distributed and stateless components, and allow for host-pegged stateful instances to be available for calling?

In this Rust code, you can assume that there has been some plumbing code auto-generated at compile time and a function like get_component locates a stateful WebAssembly component by type and key:

// Alternatively: let the_arena = get_component::<KevinsArena>("arena12")?;
let the_arena: KevinsArena = get_component("arena12")?;
let player: Player = get_component("bob")?;
// `move` is a reserved keyword in Rust, so the method needs another name.
player.move_to(&the_arena)?;

In the preceding code, the arena12 instance of the KevinsArena type might be sitting on host_1 within a cluster/lattice, while the bob player is sitting on host_2. Both were activated on demand by the client code above, and both could be deactivated automatically by scheduling rules. This is conceptually the opposite of the lambda/serverless/stateless paradigm.
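As a rough sketch of what "activated on demand, deactivated by scheduling rules" could mean, the following Rust code keeps a registry of in-memory instances on a single host. Everything here is hypothetical: `Registry`, `Arena`, the tick-based idle rule, and this local `get_component` are illustrations, not wasmCloud APIs, and a real version would also have to locate instances across hosts.

```rust
use std::collections::HashMap;

/// Hypothetical stateful component: an arena that remembers who is inside.
#[derive(Debug, Default)]
pub struct Arena {
    pub occupants: Vec<String>,
}

/// Hypothetical host-side registry of activated instances, keyed by
/// instance key. Each entry stores the instance and its last-used tick.
#[derive(Default)]
pub struct Registry {
    active: HashMap<String, (Arena, u64)>,
}

impl Registry {
    /// Returns the instance for `key`, activating it if it is not in memory.
    pub fn get_component(&mut self, key: &str, now: u64) -> &mut Arena {
        let entry = self
            .active
            .entry(key.to_string())
            .or_insert_with(|| (Arena::default(), now));
        entry.1 = now; // record the access for the idle rule
        &mut entry.0
    }

    /// Deactivates instances idle for more than `max_idle` ticks and
    /// returns how many were dropped.
    pub fn deactivate_idle(&mut self, now: u64, max_idle: u64) -> usize {
        let before = self.active.len();
        self.active
            .retain(|_, (_, last_used)| now.saturating_sub(*last_used) <= max_idle);
        before - self.active.len()
    }

    /// Number of instances currently held in memory.
    pub fn active_count(&self) -> usize {
        self.active.len()
    }
}
```

Note that in this sketch a deactivated instance loses its in-memory state: asking for the same key again yields a fresh `Arena`. Persisting and restoring state across activations is a separate concern that the sketch leaves out.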

When we hide the details of how something works in the interest of making things easy, we can sometimes make it so easy that we don't see the consequences of our actions and decisions. For example, seemingly innocent calls can generate massive amounts of network traffic depending on node/silo/host allocation.

An innocent call to kevin.attack(bob)?; could unleash a barrage of remote calls. The code needs to notify every other player in a room that kevin has attacked bob, bob needs to be notified of the impending attack, and suitable code needs to react. The weapons and armor in the combatants' inventory are activated and invoked. As the two exchange blows and cast spells, more and more calls and casts spew from the firehose. If the room, its occupants, and the inventory of each occupant are all spread wide across a cluster, this will produce a massive amount of high-latency traffic. Orleans can optimize the location of object instances (for example, it can prefer to activate locally if the instance isn't running elsewhere in the cluster yet), but you can imagine how trying to optimize this object/grain allocation can be a nightmarish problem to solve, especially if the optimization rules have multiple conflicting priorities.
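A back-of-the-envelope way to see the cost is to count how many notifications cross a host boundary for a given placement. The helper below is a hypothetical illustration (the `remote_calls` function and the entity and host names in the usage note are invented), and it counts one call per notified entity, which understates what a real combat round would generate.

```rust
use std::collections::HashMap;

/// Counts the notifications that must cross a host boundary when `attacker`
/// acts and every entity in `audience` has to be told about it.
/// `placement` maps an entity name to the host it is currently running on.
/// Entities without a known placement are skipped.
pub fn remote_calls(
    placement: &HashMap<String, String>,
    attacker: &str,
    audience: &[&str],
) -> usize {
    let Some(origin) = placement.get(attacker) else {
        return 0; // attacker is not active anywhere, so nothing is sent
    };
    audience
        .iter()
        .filter(|&&entity| placement.get(entity).map_or(false, |host| host != origin))
        .count()
}
```

With kevin and alice on host_1, bob on host_2, and a sword on host_3, notifying bob, alice, and the sword costs two remote calls. With all four on the same host it costs none, which is why placement matters so much in the stateful model.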

Today's wasmCloud stateless style looks radically different because there should be no discernible difference between any two instances of the same component. Everything is stateless. A wasmCloud approach to the above problem might look like this:

let arena_sender = ArenaSender::new();
arena_sender.handle_move(Move{
    entity: "bob",
    entity_type: EntityType::Player,
    target_arena: "arena12",
    ...
})?;

The component that handles the arena messages would then have to communicate with a stateful store facilitated by a capability provider in order to query and make persistent changes.
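Here is a minimal sketch of what such a handler might look like, assuming a simple key-value contract. The `KeyValue` trait, the `MemoryStore` stand-in, the trimmed-down `Move` struct, and the `location:` key format are all assumptions made for illustration; they are not wasmCloud interfaces.

```rust
use std::collections::HashMap;

/// Hypothetical key-value contract. In wasmCloud, a capability provider
/// would satisfy a contract like this at runtime.
pub trait KeyValue {
    fn get(&self, key: &str) -> Option<String>;
    fn set(&mut self, key: &str, value: String);
}

/// In-memory stand-in for a real store, for illustration only.
#[derive(Default)]
pub struct MemoryStore(HashMap<String, String>);

impl KeyValue for MemoryStore {
    fn get(&self, key: &str) -> Option<String> {
        self.0.get(key).cloned()
    }
    fn set(&mut self, key: &str, value: String) {
        self.0.insert(key.to_string(), value);
    }
}

/// Simplified version of the `Move` message from the example above.
pub struct Move {
    pub entity: String,
    pub target_arena: String,
}

/// Stateless handler: everything it knows comes from the message and the
/// store. Returns the arena the entity was in before the move, if any.
pub fn handle_move(store: &mut dyn KeyValue, mv: &Move) -> Option<String> {
    let location_key = format!("location:{}", mv.entity);
    let previous = store.get(&location_key);
    store.set(&location_key, mv.target_arena.clone());
    previous
}
```

Because the handler keeps nothing between calls, any instance of the component can process any `Move`. The cost is an explicit round trip to the store for every read and write.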

There is an optimal time and place for both styles as each one carries with it a very powerful implied mental model. The former (to me, anyway) feels more like traditional monolithic OOP whereas the latter feels familiar to me as the type of code written for global-scale, stateless, highly resilient applications.

Orleans makes the proposition that the first style can be written without sacrificing the benefits of an underlying distributed platform. This blog post has mentioned a couple of the scaling tradeoffs involved in this bargain.

Deciding whether to use the stateful or the stateless mental model depends heavily on whether most of your logic operates on keyed entities or instead processes messages, events, or other streams of data. As a game developer, I'd probably prefer the stateful component/grain approach, but I prefer the stateless paradigm when building reactive systems that need to process the same kind of work in parallel.

No matter what you think of stateful vs. stateless, Orleans is a lot of fun to play with and I highly recommend exploring it if you've got some spare time. Where Orleans makes many things possible because it leverages the notion that everything is written in languages that target the Common Language Runtime (CLR), wasmCloud gains many of its powerful features by leveraging the size, language agnosticism, portability, speed, and security of WebAssembly components. While today it's not easy for wasmCloud/WebAssembly to leverage .NET, that should improve in the future.

If you're interested in experimenting with wasmCloud, I recommend following the wasmCloud tour to build a simple distributed application.


¹ Not to be confused with the Grain programming language or its silo package manager.
