Friday, May 20, 2011

On Noah - Part 4


This is the fourth part in a series on Noah. Part 1, Part 2 and Part 3 are available as well.

In Part 1 and 2 of this series I covered background on Zookeeper and discussed the similarities and differences between it and Noah. Part 3 was about the components underneath Noah that make it tick.

This post is about the "future" of Noah. Since I'm a fan of the Fourcast podcast, I thought it would be nice to lay out an immediate, medium and long term set of goals.

Immediate Future - the road to 1.0

In the most immediate future there are a few things that need to happen. These are in no specific order.

  • General
    • Better test coverage ESPECIALLY around the watch subsystem
    • Full code comment coverage
    • Chef cookbooks/Puppet manifests for doing a full install
    • "fatty" installers for a standalone server
    • Documentation around operational best practices
    • Documentation around clustering, redundancy and HA/DR (high availability/disaster recovery)
    • Documentation around integration best practices
    • Performance testing
  • Noah Server
    • Expiry flags and reaping for Ephemerals
    • Convert mime-type in Configurations to make sense
    • Untag and Unlink support
    • Refactor how you specify Redis connection information
    • Integrated metrics for monitoring (failed callbacks, expired ephemeral count, that kind of stuff)
  • Watcher callback daemon
    • Make the HTTP callback plugin more flexible
    • Finish binscript for the watcher daemon
  • Other
    • Finish Boat
    • Finish NoahLite LWRP for Chef (using Boat)
    • A few more HTTP-based callback plugins (Rundeck, Jenkins)

Now that doesn't look like a very cool list, but it's a lot of work for one person. I don't blame anyone for not getting excited about it. The goal now is to get a functional and stable application out the door that people can start using. Mind you, I think it's usable now (and I'm already using it in "production").

Obviously if anyone has something else they'd like to see on the list, let me know.

Medium Rare

So beyond that 1.0 release, what's on tap? Most of the work will probably occur around the watcher subsystem and the callback daemon. However there are a few key server changes I need to implement.

  • Server
    • Full ACL support on every object at every level
    • Token-based and SSH key-based credentialing
    • Optional versioning on every object at every level
    • Accountability/Audit trail
    • Implement a long-polling interface for in-band watchers
  • Watcher callback daemon
    • Decouple the callback daemon from the Ruby API of the server. Instead the daemon itself needs to be a full REST client of the Noah server
    • Break out the "official" callback daemon into a distinct package
  • Clients
    • Sinatra Helper

Also during this period, I want to spend time building up the ecosystem as a whole. You can see a general mindmap of that here.

Going into a bit more detail...

Tokens and keys

It's plainly clear that something which has the ability to make runtime environment changes needs to be secure. The first thing to roll off the line post-1.0 will be that functionality. Full ACL support for all entries will be enabled and can be set at any level in the namespace, just the same as Watches.

Versioning and Auditing

Again, for all entries and levels in the namespace, versioning and auditing will be allowed. The intention is that the number of revisions and audit entries will be configurable as well - not just an enable/disable bit.

In-band watches

While I've lamented the fact that watches were in-band only in Zookeeper, there's a real-world need for that model. Long-polling support is something I'd actually like to have by 1.0, but it likely won't happen. The intent is simply that when you call, say, /some/path/watch, you can pass an optional flag in the message stating that you want to watch that endpoint for a fixed amount of time for any changes. Optionally, a way to subscribe to all changes over long-polling for a fixed amount of time is cool too.
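To make that concrete, here's a rough sketch of what such a request might look like from a client. The long_poll flag and timeout parameter are invented for illustration; nothing like them exists in the API yet:

    require 'net/http'
    require 'json'

    # Hypothetical long-polling watch: block for up to 30 seconds
    # waiting for changes to /some/path. Both parameters are invented.
    uri = URI.parse("http://localhost:5678/some/path/watch")
    http = Net::HTTP.new(uri.host, uri.port)
    http.read_timeout = 35 # a little longer than the server-side window
    request = Net::HTTP::Post.new(uri.path)
    request.body = {"long_poll" => true, "timeout" => 30}.to_json
    response = http.request(request)
    puts response.body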

Agent changes

These two are pretty high on my list. As I said, there's a workable solution with minimal tech debt going into the 1.0 release, but long term this needs to be a distinct package. A few other ideas I'm kicking around involve allowing configurable filtering on WHICH callback types an agent will handle. The idea is that you can specify that one invocation only handles HTTP callbacks while another handles AMQP - see the sketch below.
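A minimal sketch of what that filtering might look like inside an agent, assuming each callback destination is a URL whose scheme identifies the callback type (the filter configuration here is invented):

    require 'uri'

    # Hypothetical per-agent filter: this invocation only handles HTTP(S)
    # callbacks; a second agent could be started with %w[amqp] instead.
    ALLOWED_SCHEMES = %w[http https]

    def handle_callback(destination, payload)
      scheme = URI.parse(destination).scheme
      return unless ALLOWED_SCHEMES.include?(scheme)
      # hand off to the real plugin for this scheme (not shown)
      puts "dispatching #{scheme} callback to #{destination}"
    end

    handle_callback("http://example.com/cb", '{"action":"update"}')
    handle_callback("amqp://broker/exchange", '{"action":"update"}') # filtered out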

Sinatra Helper

One idea I'd REALLY like to come to fruition is the Sinatra Helper. I envision it working something like this:

    require 'sinatra/base'

    class MyApp < Sinatra::Base
      register Noah::Sinatra

      noah_server "http://localhost:5678"
      noah_node_name "myself"
      noah_app_name "MyApp"
      noah_token "somerandomlongstring"

      dynamic_get :database_server
      dynamic_set :some_other_variable, "foobar"
      watch :this_other_node
    end

The idea is that the helper allows you to register your application very easily with Noah so that other components in your environment can know about it. As a byproduct, you get the ability to get/set certain configuration parameters entirely in Noah. The watch setting is kind of cool as well. If you decide to watch something this way, the helper will create a random (and yes, secure) route in your application that watch events can notify. In this way, your Sinatra application can be notified of any changes and will automatically "reconfigure" itself.

Obviously I'd love to see other implementations of this idea for other languages and frameworks.

Long term changes

There aren't so much specific list items here as general themes and ideas. While I list these as long term, I've already gotten an offer to help with some of them so they might actually get out sooner.

Making Noah itself distributed

This is something I'm VERY keen on getting accomplished and would really consider it the fruition of what Noah itself does. The idea is simply that multiple Noah servers are themselves clients of other Noah servers. I've got several ideas about how to accomplish this, but I got an interesting follow-up from someone on GitHub the other day. He asked what my plans were in this area and we had several lengthy emails back and forth, including an offer to work on this particular issue.

Obviously there are a whole host of issues to consider. Race conditions in ordered delivery of Watch callbacks (getting a status "down" after a status "up" when it's supposed to be the other way around) and eventual consistency spring to mind first.

The general architecture idea that was offered up is to use NATS as the mechanism for accomplishing this. In the same way that there would be AMQP callback support, there would be NATS support. Additional Noah servers would only need to know one other member to bootstrap and everything else happens using the natural flows within Noah.

The other part of that is how to handle the Redis part. The natural inclination is to use the upcoming Redis clustering but that's not something I want to do. I want each Noah server to actually include its OWN Redis instance "embedded" and not need to rely on any external mechanism for replication of the data. Again, the biggest validation of what Noah is designed to do is using only Noah itself to do it.

Move off Redis/Swappable persistence

If NATS says anything to me, it says "Why do you even need Redis?". If you recall, I went with Redis because it solved multiple problems. If I can find a persistence mechanism that I can use without any external service running, I'd love to use it.

ZeroMQ

If I were to end up moving off Redis, I'd need a cross-platform and cross-language way to handle the pubsub component. NATS would be the first idea, but NATS is Ruby-only (unless I've missed something). ZeroMQ appears to have broad language and platform support, so writing custom agents in the same vein as the Redis PUBSUB method should be feasible.
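For flavor, here's roughly what a subscriber agent could look like using ZeroMQ PUB/SUB via the ffi-rzmq gem. The endpoint and the "//noah" topic prefix are made up for the example:

    require 'ffi-rzmq'

    # Illustrative ZeroMQ subscriber playing the role a Redis PSUBSCRIBE
    # consumer plays today. Endpoint and topic prefix are invented.
    context = ZMQ::Context.new
    socket = context.socket(ZMQ::SUB)
    socket.connect("tcp://127.0.0.1:5556")
    socket.setsockopt(ZMQ::SUBSCRIBE, "//noah")

    loop do
      message = ''
      socket.recv_string(message)
      puts "change event: #{message}"
    end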

Nanite-style agents

This is more of a command-and-control topic, but a set of high-performance specialized agents on systems that can watch the PUBSUB backend or listen for callbacks would be awesome. This would allow you to really integrate Noah into your infrastructure beyond the application level. Use it to trigger a Puppet or Chef run, reboot instances or do whatever. This is really about bringing what I wanted to accomplish with Vogeler into Noah.

The PAXOS question

A lot of people have asked me about this. I'll state right now that I can only make it through about 20-30% of any reading about Paxos before my brain starts to melt. However, in the interest of proving myself the fool, I think it would be possible to implement some Paxos-like functionality on top of Noah. Remember that Noah is fundamentally about fully disconnected nodes. What better example of a network of unreliable processors than ones that never actually talk to each other? The problem is that the use case for doing it in Noah is fairly limited - perhaps too limited to be worth it.

The grand scheme is that Noah helps enable the construction of systems where you can say "This component is free to go off and operate in this way secure in the knowledge that if something it needs to know changes, someone will tell it". I did say "grand" didn't I? At some point, I may hit the limit of what I can do using only Ruby. Who knows.

Wrap up - Part 4

Again with the recap

  • Get to 1.0 with a stable and fixed set of functionality
  • Nurture the Noah ecosystem
  • Make it easy for people to integrate Noah into their applications
  • Get all meta and make Noah itself distributed using Noah
  • Minimize the dependencies even more
  • Build skynet

I'm not kidding on that last one. Ask me about Parrot AR drones and Noah sometime.

If you made it this far, I want to say thank you to anyone who read any or all of the parts. Please don't hesitate to contact me with any questions about the project.

Wednesday, May 18, 2011

On Noah - Part 3


This is the third part in a series on Noah. Part 1 and Part 2 are available as well.

In Part 1 and 2 of this series I covered background on Zookeeper and discussed the similarities and differences between it and Noah. This post is discussing the technology stack under Noah and the reasoning for it.

A little back story

I've told a few people this, but my original intention was to use Noah as a way to learn Erlang. However, that didn't work out. I needed to get a proof of concept out much quicker than the ramp-up time it would take to learn me some Erlang. I had this grandiose idea to slap mnesia, riak_core and webmachine into a tasty ball of Zookeeper clonage.

I am not a developer by trade. I don't have any formal education in computer science (or anything for that matter). The reason I mention this is to say that programming is hard work for me. This has two side effects:

  • It takes me considerably longer than a working developer to code what's in my head
  • I can only really learn a new language when I have an itch to scratch. A real world problem to model.

So in the interest of time, I fell back to a language I'm most comfortable with right now, Ruby.

Sinatra and Ruby

Noah isn't so much a web application as it is this 'API thing'. There's no proper front end and honestly, you guys don't want to see what my design-deficient mind would create. I like to joke that in the world of MVC, I stick to the M and C. Sure, APIs have views, but not in the "click the pretty button" sense.

I had been doing quite a bit of glue code at the office using Sinatra (and EventMachine), so I went with that. Sinatra is, if you use the sheer number of clones in other languages as a measure, a success for writing API-only applications. I also figured that if I wanted to slap something proper on the front, I could easily integrate it with Padrino.

But now I had to address the data storage issue.

Redis

Previously, as a way to learn Python at another company, I wrote an application called Vogeler. That application had a lot of moving parts - CouchDB for storage and RabbitMQ for messaging.

I knew from dealing with CouchDB on CentOS5 that I wasn't going to use THAT again. Much of it would have been overkill for Noah anyway. I realized I really needed nothing more than a key/value store. That really left me with either Riak or Redis. I love Riak but it wasn't the right fit in this case. I needed something with a smaller dependency footprint. Mind you Riak is VERY easy to install but managing Erlang applications is still a bit edgy for some folks. I needed something simpler.

I also realized early on that I needed some sort of basic queuing functionality. That really sealed Redis for me. Not only did it have zero external dependencies, but it also met the needs for queuing. I could use lists as dedicated direct queues and the built-in pubsub as a broadcast mechanism. Redis also has a fast atomic counter that could be used to approximate the ZK sequence primitive should I want to do that. All three are sketched below.
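To make those three building blocks concrete, here's a minimal sketch using the redis-rb gem; the key names are just examples, not what Noah actually uses:

    require 'redis'

    redis = Redis.new

    # Lists as a dedicated direct queue
    redis.lpush("noah:queue:demo", "a message")
    queue, message = redis.brpop("noah:queue:demo")

    # Built-in pubsub as a broadcast mechanism
    redis.publish("noah:changes", "something changed")

    # Atomic counter approximating the ZK sequence primitive
    sequence = redis.incr("noah:sequence:demo")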

Additionally, Redis has master/slave (not my first choice) support for limited scaling as well as redundancy. One of my original design goals was that Noah behave like a traditional web application. This is a model ops folks understand very well at this point.

EventMachine

When you think asynchronous in the Ruby world, there's really only one tool that comes to mind: EventMachine. Noah is designed for asynchronous networks and is itself asynchronous in its design. The callback agent itself uses EventMachine to process watches. As I said previously, this is simply an EM-friendly Redis driver that can do PSUBSCRIBE (using em-hiredis) paired with something to send watch messages (using em-http-request, since we only support HTTP by default).
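The skeleton of that daemon looks something like the following. I'm hedging on the exact em-hiredis pubsub interface (it has changed across versions), and the channel pattern and callback URL are placeholders:

    require 'em-hiredis'
    require 'em-http-request'

    # Sketch of the watcher-daemon pattern: PSUBSCRIBE to the Noah
    # namespace and relay each event to an HTTP endpoint. Channel
    # pattern and endpoint are placeholders; treat the em-hiredis
    # pubsub calls as an approximation of the real interface.
    EventMachine.run do
      pubsub = EM::Hiredis.connect.pubsub
      pubsub.psubscribe("//noah*")
      pubsub.on(:pmessage) do |pattern, channel, message|
        EventMachine::HttpRequest.new("http://localhost:9090/callback").post(:body => message)
      end
    end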

Ohm

Finally, I slapped Ohm on top as the abstraction layer for Redis access. Ohm, if you haven't used it, is simply one of, if not the, best Ruby libraries for working with Redis. It's easily extensible, very transparent and frankly, it just gets the hell out of your way. A good example of this is converting some result to a hash. By default, Ohm only returns the id of the record. Nothing more. It also makes it VERY easy to drop past the abstraction and operate on Redis directly. It even provides helpers to get the keys it uses to query Redis. A good example of this is in the Linking and Tagging code. The following is a method in the Tag model:

    def members=(member)
      self.key[:members].sadd(member.key)
      member.tag! self.name unless member.tags.member?(self)
    end

Because Links and Tags are a one-to-many across multiple models, I drop down to Redis and use sadd to add the object to a Redis set of objects sharing the same tag.

It also has a very handy feature that forms the core of how Watches are done. You can define hooks at any phase of Redis interaction - before and after saves, creates, updates and deletes. The entire Watch system is nothing more than these post hooks formatting the state of the object as JSON, adding metadata and PUBLISHing the message to Redis with the Noah namespace as the channel.
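In spirit, the hook looks something like this. The callback style follows the ohm-contrib convention (an assumption on my part), and the channel name and payload shape are illustrative rather than Noah's actual wire format:

    require 'ohm'
    require 'json'

    class Host < Ohm::Model
      attribute :name
      attribute :status

      # Post-save hook in the spirit of the Watch system. The hook name
      # assumes ohm-contrib-style callbacks; the channel and payload
      # below are illustrative, not Noah's real message format.
      def after_save
        payload = {:id => id, :name => name, :status => status,
                   :action => "save", :timestamp => Time.now.to_i}
        Ohm.redis.publish("//noah/hosts/#{name}", payload.to_json)
      end
    end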

Distribution vectors

I've used this phrase with a few people. Essentially, I want as many people as possible to be able to use the Noah server component. I've kept the Ruby dependencies to a minimum and I've made sure that every single one works on MRI 1.8.7 up to 1.9.2 as well as JRuby. I already distribute the most current release as a war that can be deployed to a container or run standalone. I want the lowest barrier to entry to get the broadest install base possible. When a new PaaS offering comes out, I pester the hell out of anyone I can find associated with it so I can get deploy instructions written for Noah. So far you can run it on Heroku (using the various hosted Redis providers), CloudFoundry and dotcloud.

I'm a bit more lax on the callback daemon. Because it can be written in any language that can talk to the Redis pubsub system and because it has "stricter" performance needs, I'm willing to make the requirements for the "official" daemon more stringent. It currently ONLY works on MRI (mainly due to the em-hiredis requirement).

Doing things differently

Some people have asked me why I didn't use technology A or technology B. I think I addressed that mostly above but I'll tackle a couple of key ones.

ZeroMQ

The main reason for not using 0mq was that I wasn't really aware of it. Were I to start over and still be using Ruby, I'd probably give it a good strong look. There would still be the question of the storage component though. There's still a possible place for it that I'll address in part four.

NATS

This was something I simply had no idea about until I started poking around the CloudFoundry code base. I can almost guarantee that NATS will be a part of Noah in the future. Expect much more information about that in part four.

MongoDB

You have got to be kidding me, right? I don't trust my data (or anyone else's for that matter) to a product that doesn't understand what durability means when we're talking about databases.

Insert favorite data store here

As I said, Redis was the best way to get multiple required functionality into a single product. Why does a data storage engine have a pubsub messaging subsystem built in? I don't know off the top of my head but I'll take it.

Wrap up - Part 3

So again, because I evidently like recaps, here's the take away:

  • The key components in Noah are Redis and Sinatra
  • Noah is written in Ruby because of time constraints in learning a new language
  • Noah strives for the server component to have the broadest possible set of distribution vectors
  • Ruby dependencies are kept to a minimum to ensure the previous point
  • The lightest possible abstractions (Ohm) are used
  • Stricter requirements exist for non-server components because there's flexibility in alternates
  • I really should learn me some Erlang
  • I'm not a fan of MongoDB

If you haven't guessed, I'm doing one part a night in this series. Tomorrow is part four which will cover the future plans for Noah. I'm also planning on a bonus part five to cover things that didn't really fit into the first four.

Tuesday, May 17, 2011

On Noah - Part 2


This is the second part in a series on Noah. Part 1 is available here.

In part one of this series, I went over a little background about ZooKeeper and how the basic Zookeeper concepts are implemented in Noah. In this post, I want to go over a little bit about a few things that Noah does differently.

Noah Primitives

As mentioned in the previous post, Noah has 5 essential data types, four of which are what I've interchangeably referred to as either Primitives or Opinionated models. The four primitives are Host, Service, Application and Configuration. The idea was to map some common use cases for Zookeeper and Noah onto a set of objects that users would find familiar.

You might detect a bit of Nagios inspiration in the first two.

  • Host - Analogous to a traditional host or server. The machine or instance running the operating system. Unique by name.
  • Service - Typically mapped to something like HTTP or HTTPS. Think of this as the listening port on a Host. Services must be bound to Hosts. Unique by service name and host name.
  • Application - Apache, your application (rails, php, java, whatever). There's a subtle difference here from Service. Unique by name.
  • Configuration - A distinct configuration element. Has a one-to-many relationship with Applications. Supports limited mime typing.

Hosts and Services have a unique attribute known as status. This is a required attribute and is one of up, down or pending. These primitives would work very well integrated into the OS init process. Since Noah is curl-friendly, you could add something globally to init scripts that updates Noah when your host is starting up or when some critical init script starts. If you were to imagine Noah primitives as part of the OSI model, these are analogous to Layers 2 and 3.

Applications and Configurations are intended to feel more like Layer 7 (again, using our OSI model analogy). The differentiation is that your application might be a Sinatra or Java application that has a set of Configurations associated with it. Interestingly enough, you might choose to have something like Tomcat act as both a Service AND an Application. The aspect of Tomcat as a Service is different than the Java applications running in the container or even Tomcat's own configurations (such as logging).

One thing I'm trying to pull off with Configurations is limited mime-type support. When creating a Configuration in Noah, you can assign a format attribute. Currently 3 formats or types are understood:

  • string
  • json
  • yaml

The idea is that, if you provide a type, we will serve that content back to you in that format when you request it (assuming you request it that way via HTTP headers). This should allow you to skip parsing the JSON representation of the whole object and instead use it directly. Right now this list is hardcoded. I have a task to convert this.
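A hedged example of what that request flow might look like; the Configuration path and Accept value are my guesses at the API shape, not gospel:

    require 'net/http'

    # Hypothetical: fetch a Configuration stored with format "yaml" and
    # ask for it rendered natively rather than wrapped in JSON. The
    # path and Accept header value are illustrative.
    uri = URI.parse("http://localhost:5678/configurations/myapp/database")
    request = Net::HTTP::Get.new(uri.path, "Accept" => "text/x-yaml")
    response = Net::HTTP.new(uri.host, uri.port).request(request)
    puts response.body # raw YAML, ready for YAML.load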

Hosts and Services make a great "canned" structure for building a monitoring system on top of Noah. Applications and Configurations are a lightweight configuration management system. Obviously there are more uses than that but it's a good way to look at it.

Ephemerals

Ephemerals, as mentioned previously, are closer to what Zookeeper provides. The way I like to describe Ephemerals to people is a '512 byte key/value store with triggers' (via Watch callbacks). If none of the Primitives fit your use case, the Ephemerals make a good place to start. Simply send some data in the body of your post to the url and the data is stored there. No attempt is made to understand or interpret the data. The hierarchy of objects in the Ephemeral namespace is completely arbitrary. Data living at /ephemerals/foo has no relationship with data living at /ephemerals/foo/bar.
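For example, something along these lines; I'm assuming PUT/GET semantics against the /ephemerals/foo path mentioned above, which may differ slightly from the actual routes:

    require 'net/http'

    # Store and read back an arbitrary blob at /ephemerals/foo. Noah
    # makes no attempt to interpret the data. The PUT verb here is an
    # assumption; check the API docs for the real routes.
    uri = URI.parse("http://localhost:5678/ephemerals/foo")
    http = Net::HTTP.new(uri.host, uri.port)

    put = Net::HTTP::Put.new(uri.path)
    put.body = "42"
    http.request(put)

    puts http.request(Net::HTTP::Get.new(uri.path)).body # => "42"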

Ephemerals are also not browseable except via Linking and Tagging.

Links and Tags are, as far as I can tell, unique to Noah compared to Zookeeper. Because we namespace against Primitives and Ephemerals, there existed the need to visualize objects under a custom hierarchy. Currently Links and Tags are the only way to visualize Ephemerals in a JSON format.

Tags are pretty standard across the internet by now. You might choose to tag a bunch of items as production or perhaps group a set of Hosts and Services as out-of-service. Tagging an item is a simple process in the API. Simply PUT the name of the tag(s) to the url of a distinctly named item with tag appended. For instance, the following JSON posted to /applications/my_kick_ass_app/tag will tag the Application my_kick_ass_app with the tags sinatra, production and foobar:

{"tags":["sinatra", "production", "foobar"]}

Links work similarly to Tags (including the act of linking) except that the top-level namespace is now replaced with the name of the Link. The top-level namespace in Noah for the purposes of Watches is //noah. By linking a group of objects together, you will be able to perform operations such as Watches in bulk (not yet implemented). For instance, if you wanted to be informed of all changes to your objects in Noah, you would create a Watch against //noah/*. This works fine for most people, but imagine you wanted a more multi-tenant-friendly system. By using Links, you can group ONLY the objects you care about and create the Watch against that Link. So //noah/* becomes //my_organization/* and only changes to items in that namespace will fire for that Watch.

The idea is also that other operations outside of setting Watches can be applied to the underlying object in the link as well. The name Link was inspired by the idea of symlinking.

Watches and Callbacks

In the first post, I mentioned that by nature of Noah being "disconnected", Watches were persistent as opposed to one-shot. Additionally, because of the pluggable nature of Noah Watches and because Noah has no opinion regarding the destination of a fired Watch, it becomes very easy to use Noah as a broadcast mechanism. You don't need to have watches for each interested party. Instead, you can create a callback plugin that could dump the messages on an ActiveMQ Fanout queue or AMQP broadcast exchange. You could even use multicast to notify multiple interested parties at once.

Again, the act of creating a watch and the destination for notifications is entirely disconnected from the final client that might use the information in that watch event.

Additionally, because of how changes are broadcast internally to Noah, you don't even have to use the "official" Watch method. All actions in Noah are published post-commit to a pubsub queue in Redis. Any language that supports Redis pubsub can attach directly to the queue and PSUBSCRIBE to the entire namespace or a subset. You can write your own engine for listening, filtering and notifying clients.

This is exactly how the Watcher daemon works. It attaches to the Redis pubsub queue, makes a few API calls for the current registered set of watches and then uses the watches to filter messages. When a new watch is created, that message is like any other change in Noah. The watcher daemon sees that and immediately adds it to its internal filter. This means that you can create a new watch, immediately change the watched object and the callback will be made.
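Here's a minimal sketch of that attach-and-filter pattern using the redis-rb gem; the channel pattern and the filter itself are examples, not what the official daemon registers:

    require 'redis'
    require 'json'

    # Attach directly to the Noah pubsub namespace and filter locally -
    # the same pattern the watcher daemon uses. Pattern and filter are
    # illustrative.
    Redis.new.psubscribe("//noah*") do |on|
      on.pmessage do |pattern, channel, message|
        next unless channel.include?("/hosts/") # example filter
        event = JSON.parse(message)
        puts "#{channel}: #{event.inspect}"
      end
    end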

Wrap up - Part Two

So to wrap up:

  • Noah has 5 basic "objects" in the system. Four of those are opinionated and come with specific contracts. The other is a "dumb" key/value store of sorts.
  • Noah provides Links and Tags as a way to perform logical grouping of these objects. Links replace the top-level hierarchy.
  • Watches are persistent. The act of creating a watch and notifying on watched objects is disconnected from the final recipient of the message. System A can register a watch on behalf of System B.
  • Watches are nothing more than a set of filters applied to a Redis pubsub queue listener. Any language that supports Redis and its pubsub queue can be a processor for watches.
  • You don't even have to register any Watches in Noah if you choose to attach and filter yourself.

Part three in this series will discuss the technology stack under Noah and the reasoning behind it. A bit of that was touched on in this post. Part four is the discussion about long-term goals and roadmaps.

Monday, May 16, 2011

On Noah - Part 1


This is the first part in a series of posts going over Noah.

As you may have heard (from my own mouth no less), I've got a smallish side project I've been working on called Noah.

It's a project I've been wanting to work on for a long time now and earlier this year I got off my ass and started hacking. The response has been nothing short of overwhelming. I've heard from so many people how they're excited for it and nothing could drive me harder to work on it than that feedback. To everyone who doesn't run away when I talk your ear off about it, thank you so much.

Since I never really wrote an "official" post about it, I thought this would be a good opportunity to talk about what it is, what my ideas are and where I'd like to see it go in the future.

So why Noah?

Fair warning: much of the following may duplicate information in the Noah wiki.

The inspiration for Noah came from a few places but the biggest inspiration is Apache Zookeeper. Zookeeper is one of those things that by virtue of its design is a BUNCH of different things. It's all about perspective. I'm going to (yet again) paste the description of Zookeeper straight from the project site:

ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.

Now that might be a bit confusing at first. Which is it? Is it a configuration management system? A naming system? It's all of them and, again, it's all about perspective.

Zookeeper, however, has a few problems for my standard use case.

  • Limited client library support
  • Requires persistent connections to the server for full benefit

By the first, I mean that the only official language bindings are C and Java. There's contributed Python support and Twitter maintains a Ruby library. However, all of these bindings are "native" and must be compiled. There are also command-line clients you can use for interacting - one in Java and two C flavors.

The second is more of a showstopper. Zookeeper uses the client connection to the server as in-band signaling. This is how watches (discussed in a moment) are communicated to clients. Persistent connections are simply not always an option. I can't deploy something to Heroku or AppEngine that requires that persistent connection. Even if I could, it would be cost-prohibitive and honestly wouldn't make sense.

Looking at the list of features I loved about ZK, I thought "How would I make that work in the disconnected world?". By that I mean what would it take to implement any or all of the Zookeeper functionality as a service that other applications could use?

From that thought process, I came up with Noah. The name is only a play on the concept of a zookeeper and holds no other real significance, other than irritating at least two people named Noah when I talk about the project.

So working through the feature list, I came up with a few things I REALLY wanted. I wanted Znodes, I wanted Watches and I wanted to do it all over HTTP so that I could have the broadest set of client support. JSON is really the de facto standard for web "messaging" at this point, so that's what I went with. Basically the goal was "If your language can make HTTP requests and parse JSON, you can write a Noah client".
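To show how low that bar is, here's a toy client sketch using nothing but Ruby's standard library; the /hosts/myhost endpoint is a hypothetical example:

    require 'net/http'
    require 'json'

    # A Noah "client" needs nothing more than HTTP and JSON. The
    # /hosts/myhost path is a hypothetical example endpoint.
    response = Net::HTTP.get_response(URI.parse("http://localhost:5678/hosts/myhost"))
    host = JSON.parse(response.body)
    puts host["status"]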

Znodes and Noah primitives

Zookeeper has a shared hierarchical namespace similar to a UNIX filesystem. Points in the hierarchy are called znodes. Essentially these are arbitrary paths where you can store bits of data - up to 1MB in size. These znodes are unique absolute paths. For instance:

    /
      /systems
        /foo
        /bar
      /networks
        /kansas
          /router-1
          /router-2

Each fully qualified path is a unique znode. Znodes can be ephemeral or persistent. Zookeeper also has some primitives that can be applied to znodes, such as 'sequence'.

When I originally started working on Noah, I created some base primitives so that I had a model to work with - something that would help me demonstrate some of the use cases:

  • Host
  • Service
  • Application
  • Configuration

These primitives were actual models in the Noah code base with a strict contract on them. As an example, Hosts must have a status and can have any number of Services associated with them. Services MUST be tied explicitly to a Host. Applications can have Configurations (or not) and Configurations can belong to any number of Applications or none. Additionally, I had another "data type" that I was simply calling Ephemerals. This is similar to the Zookeeper znode model. Originally I intended for Ephemerals to be just that - ephemeral. But I've backed off that plan. In Noah, Ephemerals can be either persistent or truly ephemeral (not yet implemented).

So now I had a data model to work with. A place to store information and flexibility to allow people to use the predefined primitives or the ephemerals for storing arbitrary bits of information.

Living the disconnected life

As I said, the model for my implementation was "disconnected". When thinking about how to implement Watches in a disconnected model, the only thing that made sense to me was a callback system. Clients would register an interest on an object in the system and when that object changed, they would get notified by the method of their choosing.

One thing about Watches in Zookeeper that annoys me is that they're one-shot deals. If you register a watch on a znode, once that watch is triggered, you have to REREGISTER the watch. First off, this creates, as documented by the ZK project, a window of opportunity where you could miss another change to that znode. Let's assume you aren't using a language where interacting with Zookeeper is a synchronous process:

  • Connect to ZK
  • Register watch on znode
  • Wait
  • Change happens
  • Watch fires
  • Process watch event
  • Reregister watch on znode

In between those last two steps, you risk missing activity on the znode. In the Noah world, watches are persistent. This makes sense for two reasons. The first is that the latency between a watch callback being fired and processed could be much higher than over the persistent connection in ZK. The window of missed messages is simply much greater. We could easily be talking hundreds of milliseconds of latency just to get the message and more to reregister the watch.

Secondly, the registration of Watches in Noah is, by nature of Noah's design and as a byproduct, disconnected from the consumer of those watches. This offers much greater flexibility in what watches can do. Let's look at a few examples.

First off, it's important to understand how Noah handles callbacks. The message format of a callback in Noah is simply a JSON representation of the changed state of an object and some metadata about the action taken (i.e. delete, create, update). Watches can be registered on distinct objects, on a given path (and thus all the children under that path) and further refined down to a given action. Out of the box, Noah ships with one callback handler - http. This means that when you register a watch on a path or object, you provide an HTTP endpoint where Noah can POST the aforementioned JSON message. What you do with it from there is up to you.
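On the receiving side, that can be a few lines of Sinatra. A toy receiver (the route name and payload fields are examples, not Noah's documented format):

    require 'sinatra'
    require 'json'

    # A toy receiver for Noah watch callbacks. The route name and the
    # payload structure are illustrative.
    post '/noah_callback' do
      event = JSON.parse(request.body.read)
      puts "watch event: #{event.inspect}"
      status 200
    end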

By virtue of the above, the callback system is also designed to be 'pluggable' for lack of a better word. While the out-of-the-box experience is an HTTP POST, you could easily write a callback handler that posted the message to an AMQP exchange or wrote the information to disk as a flat file. The only requirement is that you represent the callback location as a single string. The string will be parsed as a url and broken down into tokens that determine which plugin to call.
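The dispatch step could be as simple as keying off the URL scheme. A sketch, with an invented plugin registry; Noah's actual plugin loading may differ:

    require 'uri'

    # Illustrative plugin dispatch: the callback location string is
    # parsed as a URL and its scheme selects the handler.
    PLUGINS = {
      "http" => lambda { |url, msg| puts "would POST #{msg} to #{url}" },
      "amqp" => lambda { |url, msg| puts "would publish #{msg} to #{url}" },
      "file" => lambda { |url, msg| File.open(URI.parse(url).path, "a") { |f| f.puts(msg) } }
    }

    def fire_callback(location, message)
      scheme = URI.parse(location).scheme
      handler = PLUGINS[scheme] or raise "no callback plugin for #{scheme}"
      handler.call(location, message)
    end

    fire_callback("http://example.com/cb", '{"action":"create"}')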

So this system allows for you to distribute watches to multiple systems with a single callback. Interestingly enough, this same watch callback system forms the basis of how Noah servers will share changes with each other in the future.

Wrap up - Part 1

So wrapping up what I've discussed, here are the key takeaways:

  • Noah is a 'port' of specific Zookeeper functionality to a disconnected and asynchronous world
  • Noah uses HTTP and JSON as the interface to the server
  • Noah has both traditional ZK-style Ephemerals as well as opinionated Primitives
  • Noah uses a pluggable callback system to approximate the Watch functionality in Zookeeper
  • Clients can be written in any language that can speak HTTP and understand JSON (yes, even a shell script)

Part 2 and beyond

In part two of this series we'll discuss some of the additions to Noah that aren't a part of Zookeeper such as Tags and Links. Part 3 will cover the underlying technology which I am intentionally not discussing at this point. Part 4 will be a roadmap of my future plans for Noah.