Last updated at Fri, 03 Nov 2017 14:12:46 GMT

YARN stands for Yet Another Resource Negotiator. According to Hortonworks, it is “the architectural center of Hadoop.” YARN is the stack layer that allows multiple processing engines, with varying workloads (streaming, SQL, batch, machine learning etc.) to run on top of the Hadoop file system, HDFS.

Before YARN there was another resource negotiator, Mesos (hence the “yet another” in YARN).

Mesos was first described in a research paper, then became an Apache project, and is now being commercialized by Mesosphere, which is building out an ecosystem around it.

Is Mesos yet another cloud OS?


In fairness to Mesosphere, their marketing doesn’t stray into the hyperbole of calling Mesos a Cloud OS (maybe a data center OS or a distributed systems OS). That restraint makes sense, because the “Cloud OS” label would probably confuse people. However, they do talk about running your infrastructure “like one big computer” and running distributed applications “as if they were apps being launched on a laptop.” It seems clear this is a hugely ambitious endeavor by a company with the vision to match.

In the past I’ve compared the hype around Docker Inc. to the fuss in 2012 around the billion-dollar acquisition of Nicira by VMware.

Now it seems clear that Mesosphere is a better candidate to wear the Nicira Most Disruptive Startup crown.

If Mesos is for resource scheduling across a cluster – how does this lead to the OS analogy?

The answer starts with the description on the Apache Mesos site, where Mesos is described as a distributed systems kernel, drawing parallels to the Linux kernel (and, in the same way, the kernel tends to get confused with the whole OS, just as Linux does with GNU/Linux). Running on this kernel is a set of frameworks (daemons or services in the OS world) that allow people to build applications. These include Marathon (distributed init), Chronos (distributed cron), and Hadoop (distributed file system).

The kernel analogy makes more sense when you view Mesos as abstracting “CPU, memory, storage and other compute resources away from machines (physical or virtual).”

Mesos Architecture

The architecture is shown above, so I won’t repeat the description of how this works in terms of masters, slaves, schedulers etc. The diagram is lifted from the Apache project site, which has a pretty clear description (which you can see here).

The concept of resource offers is the key to the architecture’s efficiency.
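To make the idea of an offer concrete, here is a toy model of a single offer round. It is only a sketch under my own assumptions: `Offer`, `ToyFramework`, `on_offer` and `offer_round` are hypothetical names, not part of the Mesos API, and the real master batches and routes offers quite differently. Agents here are the slaves in the diagram.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    agent: str    # machine the free resources live on
    cpus: float
    mem_mb: int


class ToyFramework:
    """Hypothetical framework scheduler: accepts an offer only if one task fits."""

    def __init__(self, name, task_cpus, task_mem_mb):
        self.name = name
        self.task_cpus = task_cpus
        self.task_mem_mb = task_mem_mb

    def on_offer(self, offer):
        # The framework, not the master, decides whether to use the offer.
        return offer.cpus >= self.task_cpus and offer.mem_mb >= self.task_mem_mb


def offer_round(free, frameworks):
    """Offer each agent's free resources to every framework in turn."""
    launched = []
    for agent, res in free.items():
        for fw in frameworks:
            offer = Offer(agent, res["cpus"], res["mem_mb"])
            if fw.on_offer(offer):
                # Accepted: deduct what the task uses; the rest is offered on.
                res["cpus"] -= fw.task_cpus
                res["mem_mb"] -= fw.task_mem_mb
                launched.append((fw.name, agent))
    return launched


free = {
    "agent-1": {"cpus": 4.0, "mem_mb": 8192},
    "agent-2": {"cpus": 1.0, "mem_mb": 1024},
}
frameworks = [ToyFramework("batch", 2.0, 4096), ToyFramework("web", 1.0, 512)]
launched = offer_round(free, frameworks)
print(launched)
```

The point of the sketch is the division of labor: the master side only knows what is free, while each framework applies its own logic to accept or decline.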

The frameworks decide which of the offered resources to consume, while a plugin module in the master defines the allocation policy that governs the offers (fair sharing, Dominant Resource Fairness as per the DRF paper, etc.). This is what lets multiple frameworks compete for the same cluster resources.
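For a feel of what a Dominant Resource Fairness policy does, here is a minimal sketch of the idea: repeatedly give the next task to whichever framework currently holds the smallest dominant share, where the dominant share is the largest fraction of any one resource that the framework has been allocated. This is not the Mesos allocator module; the function names, framework names, cluster capacity and per-task demands are all illustrative assumptions of mine.

```python
from fractions import Fraction


def dominant_share(allocated, total):
    """Largest fraction of any single resource that a framework holds."""
    return max(Fraction(allocated[r], total[r]) for r in total)


def drf_allocate(total, demands):
    """Hand out tasks one at a time to the framework with the lowest dominant share.

    total:   resource name -> cluster capacity
    demands: framework name -> per-task demand (must be non-zero)
    Returns framework name -> number of tasks launched.
    """
    used = {r: 0 for r in total}
    allocated = {f: {r: 0 for r in total} for f in demands}
    tasks = {f: 0 for f in demands}

    while True:
        # Pick the framework with the smallest dominant share (ties: first listed).
        name = min(demands, key=lambda f: dominant_share(allocated[f], total))
        need = demands[name]
        # Stop as soon as that framework's next task no longer fits.
        if any(used[r] + need[r] > total[r] for r in total):
            return tasks
        for r in total:
            used[r] += need[r]
            allocated[name][r] += need[r]
        tasks[name] += 1


total = {"cpus": 9, "mem_gb": 18}
demands = {
    "framework_a": {"cpus": 1, "mem_gb": 4},  # memory-heavy tasks
    "framework_b": {"cpus": 3, "mem_gb": 1},  # CPU-heavy tasks
}
result = drf_allocate(total, demands)
print(result)  # {'framework_a': 3, 'framework_b': 2}
```

With these numbers both frameworks end up with a dominant share of two thirds: framework_a holds 12 of 18 GB of memory and framework_b holds 6 of 9 CPUs.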

Looking forward

In my next blog post I will focus on how to write a framework, including the role of the scheduler, executor, and how they are both implemented.

There is a ton we can cover in this area. In future posts I’ll dive into Google’s Kubernetes (with its promise of “elastic distributed microservices”) and why the Mesosphere team were so quick to ensure it was seen as being part of the Mesos ecosystem rather than a competing framework (embrace and extend maybe?).

I’d also like to explore why Mesosphere is working so closely with a customer like HubSpot but doesn’t seem to have embraced Singularity (at least not yet). Along the way we’ll move from the 20,000 ft view and get stuck in the weeds: we’ll explore isolation with LXC vs lmctfy, how to go about securing access to your cluster, how to write Mesos frameworks, and of course we need to look into how you make Singularity’s log watcher play well with Logentries!