My list of projects and technologies to get into the lab for learning and experimentation is never-ending. However, this holiday gave me just enough time to take another look at Microsoft Orleans, a technological kindred spirit with wasmCloud.
If you take a look at the Orleans documentation, you'll see that the first sentence is "Orleans is a cross-platform framework for building robust, scalable distributed applications." At the highest level, this matches up directly with the mission statement of wasmCloud. While both wasmCloud and Orleans are trying to solve the same problem and give developers a better way to build distributed applications, they vary in some fundamental ways in the implementation.
The unit of deployment within Orleans is called a grain, and the host runtimes responsible for scheduling grains are called silos. In its default mode, a grain acts like a distributed object instance. A grain is identified by the combination of its type and its unique key and, most importantly, a grain can only ever exist in one silo within a cluster. The Orleans runtime is responsible for using all kinds of smart technology to place grains within silos, and a grain can be located within a given silo in O(1) time through the use of a distributed hash map.
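To make the identity rule concrete, here is a minimal Rust sketch of such a lookup. It is not Orleans code: `GrainId`, `GrainDirectory`, and `locate_or_activate` are hypothetical names, and a single in-process `HashMap` stands in for the distributed hash map. It only shows that a (type, key) pair resolves to exactly one silo, and that a second request for the same pair does not create a second activation.

```rust
use std::collections::HashMap;

/// A grain's identity: its type plus its unique key.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct GrainId {
    grain_type: String,
    key: String,
}

impl GrainId {
    fn new(grain_type: &str, key: &str) -> Self {
        Self { grain_type: grain_type.to_string(), key: key.to_string() }
    }
}

/// Hypothetical directory: one local HashMap stands in for the
/// distributed hash map that a real cluster would use.
#[derive(Default)]
struct GrainDirectory {
    locations: HashMap<GrainId, String>,
}

impl GrainDirectory {
    /// Returns the silo hosting the grain. The grain is activated on
    /// `preferred_silo` only if it is not already active somewhere else.
    fn locate_or_activate(&mut self, id: GrainId, preferred_silo: &str) -> String {
        self.locations
            .entry(id)
            .or_insert_with(|| preferred_silo.to_string())
            .clone()
    }
}

fn main() {
    let mut directory = GrainDirectory::default();
    let first = directory.locate_or_activate(GrainId::new("User", "bob"), "silo-a");
    // A later caller prefers a different silo, but bob is already active.
    let second = directory.locate_or_activate(GrainId::new("User", "bob"), "silo-b");
    println!("first lookup: {first}, second lookup: {second}");
}
```

Because the type is part of the identity, a `Room` grain keyed `bob` would be a separate entry from the `User` grain keyed `bob`.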
This differs from wasmCloud in that a wasmCloud WebAssembly component (both wasmCloud and Orleans are inspired by the actor model) is a stateless thing that can (and should) exist in multiple host runtimes across a cluster (which wasmCloud calls a lattice).
Let's say you have a grain called User and some client somewhere has requested (and therefore activated) the User grain with the key of bob. Somewhere within an Orleans cluster, a .NET object instance is actually sitting in memory (along with clever proxies). If I want to get the bob User's first name, I would actually be able to request the FirstName property of the object after getting a grain reference. This way of programming stateful objects closely matches existing .NET developers' mental models of how the world works, and so transitioning from monolithic development to distributed via Orleans is relatively painless.
The scale tradeoff here is that in Orleans, multiple requests for the same key of an object will queue up behind a single-threaded execution model. In stateless wasmCloud, it doesn't matter which thing you're acting on, because the mental model is more like serverless: you activate compute on demand, pass in parameters, get results, and can then dispose of your compute. In Orleans, concurrent requests for the bob user's compute wait their turn behind that single thread, while in wasmCloud the number of concurrent requests supported is exactly the number of instances of the User component running in the lattice, regardless of whether there is a unique key at play. Neither of these approaches is objectively better or worse than the other: it boils down to stateless versus stateful.
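The queuing behavior on the stateful side can be sketched with nothing but the Rust standard library. This is an illustration of the general idea of one mailbox per activation, not how Orleans is implemented; `UserMsg`, `activate_user`, and `first_name` are made-up names. A single thread owns the state and drains a channel one message at a time, so requests for the same key can never overlap.

```rust
use std::sync::mpsc;
use std::thread;

/// Messages a hypothetical `User` activation understands.
enum UserMsg {
    Rename(String),
    GetFirstName(mpsc::Sender<String>),
}

/// Spawns one activation. A single thread owns the state and handles
/// its mailbox one message at a time, in arrival order.
fn activate_user(first_name: &str) -> mpsc::Sender<UserMsg> {
    let (tx, rx) = mpsc::channel::<UserMsg>();
    let mut name = first_name.to_string();
    thread::spawn(move || {
        for msg in rx {
            match msg {
                UserMsg::Rename(new_name) => name = new_name,
                UserMsg::GetFirstName(reply) => {
                    // The caller may have gone away; ignore a failed reply.
                    let _ = reply.send(name.clone());
                }
            }
        }
    });
    tx
}

/// Sends a query and blocks until the activation answers it.
fn first_name(user: &mpsc::Sender<UserMsg>) -> String {
    let (reply_tx, reply_rx) = mpsc::channel();
    user.send(UserMsg::GetFirstName(reply_tx)).expect("activation stopped");
    reply_rx.recv().expect("activation dropped the reply")
}

fn main() {
    let bob = activate_user("Bob");
    bob.send(UserMsg::Rename("Robert".to_string())).unwrap();
    println!("bob's first name is now {}", first_name(&bob));
}
```

The stateless model has no such mailbox per key: any running instance can take any request, so throughput grows with the number of instances rather than being bounded by one queue per entity.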
The next difference between the two involves the capability model, reducing coupling, and eliminating boilerplate. Where wasmCloud uses high-level contracts satisfied at runtime by capability providers, Orleans has a concept called grain services which serve a similar purpose. A grain service is a .NET class that implements a specific interface. If you have an existing investment in .NET code that you want to expose to grains, this works well. wasmCloud lets you expose functionality to WebAssembly components, but because of the sandboxing and portability we already get with WebAssembly, capability provider language choice doesn't matter.
For many people coming from an OOP background, stateful programming feels more natural while a lot of folks with a more functional programming background (or just a lot of distributed systems exposure) tend to gravitate toward the stateless model. wasmCloud gives components state by allowing them to access things like key-value stores or relational databases via the capability provider model. This distills down to the Orleans developer being able to "just store state as member variables" and not worry about how it works versus wasmCloud developers being more deliberate in their choices for state management.
While exploring Orleans, I found the Adventure Game sample which was a ton of fun and brought back a lot of fond memories from my days as a MUD developer. The adventure game sample illustrates very clearly what it looks like to build stateful components for a distributed application framework.
Take a look at some of this code from the sample's IRoomGrain implementation:
Task IRoomGrain.Enter(PlayerInfo player)
_players.RemoveAll(x => x.Key == player.Key);
The _players variable is initialized as follows:
private readonly List<PlayerInfo> _players = new();
Nowhere in this code is there any reference to the notion that this is a distributed instance that is scheduled by a runtime and started on a cluster node behind a networking proxy to allow for targeted invocations.
This is how distributed computing should work. Developers should focus on their business logic and leave the distribution, clustering, resilience, scale, and everything else to a platform that specializes in that sort of thing.
That said, I can actually think of different use cases where people might want to use different models. Sometimes deploying a horde of stateless units of compute makes sense, but other times maybe it's just more appropriate (like with the "Adventure" game) to treat instances of "things" as distributed, globally identified objects.
This got me wondering about whether wasmCloud could support both models 🤔. What would it look like if we could have regular, highly resilient, scalable, distributed and stateless components, and allow for host-pegged stateful instances to be available for calling?
In this Rust code, you can assume that some plumbing code has been auto-generated at compile time and that a function like get_component locates a stateful WebAssembly component by type and key:
// Alternatively: let the_arena = get_component::<KevinsArena>("arena12")?;
let the_arena: KevinsArena = get_component("arena12")?;
let player: Player = get_component("bob")?;
In the preceding code, the arena12 instance of the KevinsArena type might be sitting on host_1 within a cluster/lattice, while the bob player is sitting on host_2. Both of them were activated on demand by the client code above, and those instances could be deactivated automatically by scheduling rules. This is conceptually the opposite of the lambda/serverless/stateless paradigm.
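For the curious, here is one way the imagined plumbing could be shaped. None of this exists in wasmCloud today: the `StatefulComponent` trait, the `Lattice` struct, the `host_of` helper, and the round-robin placement are all assumptions made purely for illustration. The sketch shows how a generic `get_component` can pick the component type from the caller's type annotation and peg each (type, key) pair to one host on first use.

```rust
use std::collections::HashMap;

/// Hypothetical trait that generated plumbing might implement per type.
trait StatefulComponent: Sized {
    const TYPE_NAME: &'static str;
    /// Builds a handle to the instance identified by `key`.
    fn from_key(key: &str) -> Self;
}

#[derive(Debug)]
struct KevinsArena { key: String }
#[derive(Debug)]
struct Player { key: String }

impl StatefulComponent for KevinsArena {
    const TYPE_NAME: &'static str = "KevinsArena";
    fn from_key(key: &str) -> Self { Self { key: key.to_string() } }
}

impl StatefulComponent for Player {
    const TYPE_NAME: &'static str = "Player";
    fn from_key(key: &str) -> Self { Self { key: key.to_string() } }
}

/// Stand-in for the lattice: remembers which host each (type, key) is pegged to.
struct Lattice {
    hosts: Vec<String>,
    placements: HashMap<(&'static str, String), String>,
}

impl Lattice {
    fn new(hosts: &[&str]) -> Self {
        Self {
            hosts: hosts.iter().map(|h| h.to_string()).collect(),
            placements: HashMap::new(),
        }
    }

    /// Activates the instance on first use (naive round-robin placement,
    /// an assumption) and returns a typed handle to it.
    fn get_component<T: StatefulComponent>(&mut self, key: &str) -> Result<T, String> {
        if self.hosts.is_empty() {
            return Err("no hosts available".to_string());
        }
        let candidate = self.hosts[self.placements.len() % self.hosts.len()].clone();
        self.placements
            .entry((T::TYPE_NAME, key.to_string()))
            .or_insert(candidate);
        Ok(T::from_key(key))
    }

    /// Reports where an instance is pegged, if it has been activated.
    fn host_of<T: StatefulComponent>(&self, key: &str) -> Option<&str> {
        self.placements
            .get(&(T::TYPE_NAME, key.to_string()))
            .map(String::as_str)
    }
}

fn main() -> Result<(), String> {
    let mut lattice = Lattice::new(&["host_1", "host_2"]);
    let the_arena: KevinsArena = lattice.get_component("arena12")?;
    let player: Player = lattice.get_component("bob")?;
    println!("{:?} on {:?}", the_arena, lattice.host_of::<KevinsArena>("arena12"));
    println!("{:?} on {:?}", player, lattice.host_of::<Player>("bob"));
    Ok(())
}
```

A real scheduler would also need deactivation rules and a way to route calls to the pegged host; both are left out here.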
When we hide the details of how something works in the interest of making things easy, we can sometimes make it so easy that we don't see the consequences of our actions and decisions. For example, seemingly innocent calls can generate massive amounts of network traffic depending on node/silo/host allocation.
An innocent call to kevin.attack(bob)?; could unleash a barrage of remote calls. The code needs to notify every other player in the room that kevin has attacked bob; bob needs to be notified of the impending attack, and suitable code needs to react. The weapons and armor in the combatants' inventories are activated and invoked. As the two exchange blows and cast spells, more and more calls and casts spew from the firehose. If the room, its occupants, and the inventory of each occupant are all spread wide across a cluster, this will produce a massive amount of high-latency traffic. Orleans can optimize the location of object instances (for example, it can prefer to activate locally if the instance isn't running elsewhere in the cluster yet), but you can imagine how trying to optimize this object/grain allocation can be a nightmarish problem to solve, especially if the optimization rules have multiple conflicting priorities.
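A toy model makes the cost visible. The sketch below is an assumption-laden illustration, not a measurement: the `remote_calls` function, the entity names, and the placements are invented, and it counts only one call per target. It compares the same attack under two placements, one spread across hosts and one co-located.

```rust
use std::collections::HashMap;

/// Counts how many calls from `caller` land on a different host.
/// Assumes every entity involved appears in the placement map.
fn remote_calls(placement: &HashMap<&str, &str>, caller: &str, targets: &[&str]) -> usize {
    let caller_host = placement.get(caller);
    targets
        .iter()
        .filter(|target| placement.get(**target) != caller_host)
        .count()
}

fn main() {
    // Everything kevin.attack(bob) has to touch in this made-up scenario.
    let targets = ["bob", "arena12", "alice", "sword", "shield"];

    let spread: HashMap<&str, &str> = HashMap::from([
        ("kevin", "host_1"),
        ("bob", "host_2"),
        ("arena12", "host_1"),
        ("alice", "host_3"),
        ("sword", "host_1"),
        ("shield", "host_2"),
    ]);

    // The same entities, all pegged to one host.
    let colocated: HashMap<&str, &str> =
        spread.keys().map(|entity| (*entity, "host_1")).collect();

    println!("spread: {} remote calls", remote_calls(&spread, "kevin", &targets));
    println!("co-located: {} remote calls", remote_calls(&colocated, "kevin", &targets));
}
```

In a real fight each of those targets would fan out further (bob's armor, the other spectators), which is why placement quickly becomes the dominant cost.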
Today's wasmCloud stateless style looks radically different because there should be no discernible difference between any two instances of the same component. Everything is stateless. A wasmCloud approach to the above problem might look like this:
let arena_sender = ArenaSender::new();
The component that handles the arena messages would then have to communicate with a stateful store facilitated by a capability provider in order to query and make persistent changes.
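As a rough sketch of that receiving side: the `KeyValue` trait, the `MemoryStore` stand-in, the `Attack` message shape, the key format, and the default of 100 hit points are all assumptions for illustration, not the wasmCloud key-value contract. The point is only that the handler keeps nothing between calls; everything it knows arrives in the message or comes from the store.

```rust
use std::collections::HashMap;

/// Hypothetical key-value contract. In wasmCloud, a capability provider
/// would satisfy something like this at runtime.
trait KeyValue {
    fn get(&self, key: &str) -> Option<i64>;
    fn set(&mut self, key: &str, value: i64);
}

/// In-memory stand-in for a real store, used only for this sketch.
#[derive(Default)]
struct MemoryStore {
    values: HashMap<String, i64>,
}

impl KeyValue for MemoryStore {
    fn get(&self, key: &str) -> Option<i64> {
        self.values.get(key).copied()
    }
    fn set(&mut self, key: &str, value: i64) {
        self.values.insert(key.to_string(), value);
    }
}

/// Hypothetical message delivered to the arena-handling component.
struct Attack {
    arena: String,
    target: String,
    damage: i64,
}

/// Stateless handler: loads the target's hit points, applies the damage,
/// persists the result, and returns the remaining hit points.
fn handle_attack(store: &mut dyn KeyValue, attack: &Attack) -> i64 {
    let key = format!("{}:{}:hp", attack.arena, attack.target);
    // Assumption: a target with no stored value starts at 100 hit points.
    let current = store.get(&key).unwrap_or(100);
    let remaining = (current - attack.damage).max(0);
    store.set(&key, remaining);
    remaining
}

fn main() {
    let mut store = MemoryStore::default();
    let attack = Attack {
        arena: "arena12".to_string(),
        target: "bob".to_string(),
        damage: 15,
    };
    println!("bob has {} hp left", handle_attack(&mut store, &attack));
}
```

Any instance of the component on any host could run `handle_attack` for any arena, which is exactly what makes the instances interchangeable.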
There is an optimal time and place for both styles as each one carries with it a very powerful implied mental model. The former (to me, anyway) feels more like traditional monolithic OOP whereas the latter feels familiar to me as the type of code written for global-scale, stateless, highly resilient applications.
Orleans makes the proposition that the first style can be written without sacrificing the benefits of an underlying distributed platform. This blog post has mentioned a couple of the scaling tradeoffs involved in this bargain.
Deciding whether you want to use the stateful or the stateless mental model depends heavily on whether most of your logic operates on keyed entities or instead processes messages, events, or other streams of data. As a game developer, I'd probably prefer the stateful component/grain approach, but I prefer the stateless paradigm when building reactive systems that need to process the same kind of work in parallel.
No matter what you think of stateful vs. stateless, Orleans is a lot of fun to play with and I highly recommend exploring it if you've got some spare time. Where Orleans makes many things possible because it leverages the notion that everything is written in languages that target the Common Language Runtime (CLR), wasmCloud gains many of its powerful features by leveraging the size, language agnosticism, portability, speed, and security of WebAssembly components. While today it's not easy for wasmCloud/WebAssembly to leverage .NET, that should improve in the future.
If you're interested in experimenting with wasmCloud, I recommend trying out the Cosmonic application platform by signing up for our free developer preview.