Who needs two different message queues?

2019-05-25

The current state of Ramen in a distributed setting is that it kind of works as expected (i.e. at small scale), as long as the configuration files are synchronized somehow amongst all the sites.

The plan was to use a distributed file system for that, for instance ZooKeeper. But quite frankly, the complexity of maintaining ZooKeeper in exchange for a small-scale deployment is not a great deal. I also played with the idea of replacing all the configuration files with a single, network-accessible key-value store. Actually, the first prototype of Ramen worked this way.

The advantages of a single key-value store are many:

But there are some disadvantages as well:

This last one especially killed the idea and that's why the current version of Ramen uses plain and simple configuration files.

And I would still favor files if not for some crazy ideas that have landed in my mind recently: if we had a single network-accessible pub-sub kind of data store, we could use it to retrieve not only the configuration but also the time series out of Ramen's workers. So we could implement a graphical user interface that would also subscribe to it, subscribe to some worker output, and update the charts as new points are produced.
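To make the idea concrete, here is a minimal sketch of what the client side of such a GUI could keep in memory: one bounded window of points per subscribed series, updated each time a point is pushed. The class name, the method names, the series name and the default window size are all hypothetical; nothing here is Ramen's actual API.

```python
from collections import defaultdict, deque

WINDOW = 100  # hypothetical default: number of points kept per chart


class ChartBuffers:
    """Client-side state: one bounded window of points per subscribed series."""

    def __init__(self, window=WINDOW):
        # A deque with maxlen silently drops the oldest point when full.
        self.series = defaultdict(lambda: deque(maxlen=window))

    def on_point(self, name, timestamp, value):
        # Would be called once for every point pushed by the store.
        self.series[name].append((timestamp, value))

    def points(self, name):
        # What the chart for this series would draw.
        return list(self.series[name])


buffers = ChartBuffers(window=3)
for t, v in enumerate([0.5, 0.7, 0.6, 0.9]):
    buffers.on_point("cpu/load", t, v)  # "cpu/load" is a made-up series name
print(buffers.points("cpu/load"))  # [(1, 0.7), (2, 0.6), (3, 0.9)]
```

With this, the client never has to ask for the whole window again: it only needs the new points as they arrive.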

That's streaming from end to end, something that I find very desirable and that I wanted to do at some point.

Picture this typical situation: The devops team at ACME hosting corp is monitoring its server farm with a dashboard displaying 50 time series. This dashboard is displayed in 20 different clients (some devops desktops and some screens on the wall), each refreshing the display every minute. For the server, that's going to be 1000 queries every minute to retrieve the last 100 points for all these time series (99 of which are going to be the same as one minute before).

No amount of HTTP caching is going to convince me that it's not a stupid waste of resources. Stream values down to the clients and, instead of all those queries, you just have to push 1000 floating point numbers in total every minute, with no need for any IO at all on the server.
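The arithmetic behind those figures can be sketched as follows. The numbers come from the ACME scenario; the one assumption added is that each series produces exactly one new point per minute, which is what "99 of which are going to be the same" suggests. The function names are made up for the illustration.

```python
SERIES = 50    # time series on the dashboard
CLIENTS = 20   # desktops and wall screens showing it
WINDOW = 100   # points fetched per series at each refresh
NEW_POINTS_PER_MINUTE = 1  # assumption: one new point per series per minute


def polling_cost(series, clients, window):
    """Queries and points sent per minute when every client polls."""
    queries = series * clients
    points = queries * window
    return queries, points


def streaming_cost(series, clients, new_points):
    """Points pushed per minute when only new values are streamed."""
    return series * clients * new_points


queries, polled = polling_cost(SERIES, CLIENTS, WINDOW)
pushed = streaming_cost(SERIES, CLIENTS, NEW_POINTS_PER_MINUTE)
print(queries, polled, pushed)  # 1000 100000 1000
```

So polling answers 1000 queries and ships 100,000 points per minute, against 1000 pushed values and zero queries for streaming.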

For the most extreme cases with even more clients you could even do multicast.

So it looks like this key-value service would kill two birds with one stone: synchronizing the configuration for multi-site deployments and end-to-end streaming of time series.

Therefore I took my shopping basket under my arm and went to the market looking for a key-value store that would fulfill these requirements:

That's a lot to ask for, but at the same time I can't think of sharing a configuration tree amongst several writers without those features. With the many key-value stores that have popped up in recent years, surely there are still plenty fitting that description.

Well, I came back with an empty basket.

Some of the options I quickly envisaged and eventually turned away from:

At that point it started to feel like it would be quicker to just implement what I need than to keep looking for it. How hard could it be to slam together OpenSSL, a small in-memory hash table, and a custom pub-sub protocol?
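For a sense of scale, the hash table plus pub-sub part could start as small as the sketch below. It is only an illustration under heavy assumptions: everything lives in one process, there is no networking, no encryption and no authentication, and the class name, the prefix-based subscriptions and the key names are all invented for the example.

```python
class PubSubStore:
    """In-memory key-value store that notifies subscribers of changes."""

    def __init__(self):
        self._data = {}
        self._subscribers = []  # list of (key prefix, callback) pairs

    def subscribe(self, prefix, callback):
        self._subscribers.append((prefix, callback))
        # Replay the current state so a late subscriber starts in sync.
        for key, value in sorted(self._data.items()):
            if key.startswith(prefix):
                callback(key, value)

    def set(self, key, value):
        self._data[key] = value
        # Push the new value to every subscriber interested in this key.
        for prefix, callback in self._subscribers:
            if key.startswith(prefix):
                callback(key, value)

    def get(self, key, default=None):
        return self._data.get(key, default)


store = PubSubStore()
store.set("sites/paris/workers/count", 4)  # hypothetical configuration key

seen = []
store.subscribe("sites/paris/", lambda k, v: seen.append((k, v)))
store.set("sites/paris/workers/count", 5)
store.set("sites/london/workers/count", 2)  # not under the subscribed prefix
print(seen)
# [('sites/paris/workers/count', 4), ('sites/paris/workers/count', 5)]
```

The same `set` path would serve both purposes: a configuration change and a new point from a leaf worker are just writes that get pushed to whoever subscribed.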

But first, I had to come to terms with another idea that was periodically resurfacing: There is already a message queue within Ramen: the ringbuffers, which can now work across server boundaries. Does Ramen really need two message queues? Can't ringbuffers be used for the configuration and the streaming down to the clients?

Several reasons why they definitely can't:

Also, I'd rather have the ringbuffer design be dictated solely by the needs of the workers. The mechanism to pass data around from worker to worker has to be optimized for large and directed streams of arbitrary data, while distribution of the configuration parameters and even, to a lesser degree, of the last output values of a few leaf workers, represents only a small volume in comparison and has completely different needs in terms of authentication etc.

Just another case of "one size does not fit all".

Then, for a reason I can't remember, but maybe just because I've always had a soft spot for this library despite never having used it, I came to think about ZeroMQ.

Of course ZeroMQ is not a message queue (as the name implies). It is message agnostic; just a better socket library that does authentication, authorization (including the delegating part), and crypto, and seems very internet-worthy. On the ACME LAN it would even do multicast. It would perfectly fit all the "SHOULDs" but few of the "MUSTs".

Hopefully, the "MUSTs" are the easy parts to implement, and also the parts that I'm happy to have 100% control over.

Next time I might write about how I went shopping again, this time for a GUI toolkit.