Tansu is a distributed key/value and lock store

Tansu is a distributed key value store designed to maintain configuration and other data that must be highly available. It uses the Raft consensus algorithm for leader election and for distributing state amongst its members. By default, node discovery is via mDNS, and nodes sharing the same environment will automatically form a mesh.

Features

Tansu has a REST interface to set, get or delete the value represented by a key. It also provides an HTTP Server Sent Event stream of changes to the store.

Tansu provides a REST interface for simple Check and Set (CAS) operations.
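
As a sketch only (the parameter name below is an assumption, not taken from the Tansu documentation), a CAS update supplies the value we expect the key to currently hold alongside the new value, and the write succeeds only if they still match. ${RANDOM_IP} is the address of any node in the cluster, as set up in the Quick Start below:

# hypothetical CAS request: 'previous' is an assumed parameter name
curl \
  -X PUT \
  -i \
  -s \
  http://${RANDOM_IP}/api/keys/hello \
  -d value=world \
  -d previous=planet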

Tansu also provides test and set operations that can be used to operate locks, through a simple REST-based HTTP Server Sent Event stream interface.
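
As an illustrative sketch only (the lock URL is an assumption and may differ from the actual API), operating a lock amounts to opening a Server Sent Event stream against a named lock and holding the connection open; closing the connection releases the lock:

# hypothetical lock request: the /api/locks path is an assumption
curl \
  -i \
  -s \
  http://${RANDOM_IP}/api/locks/my-lock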

Quick Start

Tansu is packaged as a Docker container. Starting a local five node cluster is as simple as:

for i in {1..5}; do
    docker run \
        --name tansu-$(printf %03d $i) \
        -d shortishly/tansu;
done

Tansu uses mDNS by default to discover other nodes and automatically forms a cluster. API requests can be made to any discovered node; they are internally routed to the appropriate node depending on the request type.
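
For example, we can list the IP address of each node in the cluster; any of them will accept API requests:

for i in {1..5}; do
    docker inspect \
        --format='{{.NetworkSettings.IPAddress}}' \
        tansu-$(printf %03d $i);
done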

To demonstrate this, let's use the shell to randomly pick a node from our new cluster:

RANDOM_IP=$(docker inspect \
            --format='{{.NetworkSettings.IPAddress}}' \
            tansu-$(printf %03d $((1 + RANDOM % 5))))

All key/value API operations are under the ‘/api/keys/…’ URL. We can create a stream of changes to a key (or a hierarchy of keys) before that key exists as follows:

curl \
    -i \
    -s \
    "http://${RANDOM_IP}/api/keys/hello?stream=true&children=true"

The key space in Tansu is a directory structure separated with ‘/’ characters. Any change to the key ‘hello’ will be reported in the above stream, as will any change to a key in the hierarchy below ‘hello’.

Leaving the stream curl running, in another shell let’s assign the value “world” to the key “hello”:

curl \
    -X PUT \
    -i \
    -s \
    http://${RANDOM_IP}/api/keys/hello \
    -d value=world

Back in our stream, we will see a `create` notification:

id: 1
event: create
data: {
  "category":"user",
  "key":"/hello",
  "metadata":{
  "tansu":{
    "content_type":"text/plain",
    "created":1,
    "parent":"/",
    "updated":1}},
  "value":"world"}

Or we can assign a value to a key below ‘hello’:

curl \
  -X PUT \
  -i \
  -s \
  http://${RANDOM_IP}/api/keys/hello/joe \
  -d value=mike

The stream will now contain a `create` notification:

id: 2
event: create
data: {
  "category":"user",
  "key":"/hello/joe",
  "metadata":{
    "tansu":{
      "content_type":"text/plain",
      "created":2,
      "parent":"/hello",
      "updated":2}},
  "value":"mike"}

In the above case Tansu will assume that the value has the ‘text/plain’ content type (as the value came from a form URL-encoded body). Other content types (in particular JSON) are also supported:

curl \
  -X PUT \
  -H "Content-Type: application/json" \
  -i http://${RANDOM_IP}/api/keys/hello \
  --data-binary '{"stuff": true}'

The stream will now contain a `set` notification with the update:

id: 3
event: set
data: {
  "category":"user",
  "key":"/hello",
  "metadata":{
    "tansu":{
      "content_type":"application/json",
      "created":1,
      "parent":"/",
      "updated":3}},
  "previous":"world",
  "value":{"stuff":true}}

GET

The current value of a key can be obtained simply by issuing a GET on that key:

curl \
  -i \
  -s \
  http://${RANDOM_IP}/api/keys/hello

{"stuff": true}

DELETE

Similarly, a key is removed by issuing a DELETE request:

curl \
  -i \
  -X DELETE \
  http://${RANDOM_IP}/api/keys/hello

The stream will now contain a `delete` notification:

id: 5
event: delete
data: {
  "category":"user",
  "key":"/hello",
  "metadata":{
    "tansu":{
      "content_type":"application/json",
      "created":1,
      "parent":"/",
      "updated":5}},
  "value":{"stuff":true}}

TTL

A value can also be given a time to live by supplying a TTL header:

curl \
  -X PUT \
  -H "Content-Type: application/json" \
  -H "ttl: 10" \
  -i \
  http://${RANDOM_IP}/api/keys/hello \
  --data-binary '{"ephemeral": true}'

The event stream will contain details of the `create` together with a TTL
attribute:

id: 6
event: create
data: {
  "category":"user",
  "key":"/hello",
  "metadata":{
    "tansu":{
      "content_type":"application/json",
      "created":6,
      "parent":"/",
      "ttl":10,
      "updated":6}},
  "value":{"ephemeral":true}}

Ten seconds later, when the time to live has expired, the key is automatically removed:

id: 7
event: delete
data: {
  "category":"user",
  "key":"/hello",
  "metadata":{
    "tansu":{
      "content_type":"application/json",
      "created":6,
      "parent":"/",
      "ttl":0,
      "updated":7}},
  "value":{"ephemeral":true}}

A needle in a…

Just about everyone is using Docker. As a developer it radically simplifies how I package my code. However, in a tiered architecture, how is my presentation layer load balanced over my API layer? Out of the box, Docker linking is to a single container and doesn’t provide load balancing. I’d like to introduce Haystack, which provides automatic service discovery and load balancing of HTTP or WebSocket based services composed of Docker containers, from a single node to a Swarm.

High Level Architecture

Haystack does this by monitoring the event stream of a single node or Swarm, noticing when containers start or stop (gracefully or otherwise). On startup a container is registered in Haystack’s DNS, and any HTTP or WebSocket endpoints are automatically added to its load balancing proxy. Later, when that same container stops, Haystack automatically removes it from the load balancer and drops its entry from DNS. Containers are organised into groups, partitioning services into tiers which may then be load balanced over all nodes with an Overlay network.
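
To see the kind of events Haystack reacts to, you can watch the same stream yourself with docker events (Haystack consumes the equivalent stream via the Docker API):

# watch containers starting and stopping, as Haystack does
docker events \
  --filter 'event=start' \
  --filter 'event=die'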

Haystack is itself a Docker container which may be run on a single node, or on every node in a Swarm, providing load balancing that is local to the node or spans other nodes in the cluster. We can create a simple Haystack environment using Docker Machine as follows:

# create a new docker machine environment called "default"
docker-machine create --driver virtualbox default

# use that environment in this shell
eval $(docker-machine env default)

# start Haystack connecting it to the Docker daemon
docker run -e DOCKER_HOST=${DOCKER_HOST} \
  -e DOCKER_KEY="$(cat ${DOCKER_CERT_PATH}/key.pem)" \
  -e DOCKER_CERT="$(cat ${DOCKER_CERT_PATH}/cert.pem)" \
  -e SHELLY_AUTHORIZED_KEYS="$(cat ~/.ssh/authorized_keys)" \
  -e DNS_NAMESERVERS=8.8.8.8:8.8.4.4 \
  --name=haystack \
  --publish=53:53/udp \
  --publish=80:80 \
  --publish=8080:8080 \
  --publish=22022:22 \
  --detach \
  shortishly/haystack

Haystack has three blocks of environment configuration in the above command:

  • DOCKER_HOST, DOCKER_KEY and DOCKER_CERT enable Haystack to communicate with the Docker daemon using TLS.
  • The line containing SHELLY_AUTHORIZED_KEYS is optional. It copies your public keys into the Haystack container so that you can SSH directly into Haystack to perform maintenance or run tracing if things aren’t working as expected. If you’re familiar with the Erlang shell you can do so right now with ssh -p 22022 $(docker-machine ip default).
  • Finally, DNS_NAMESERVERS tells Haystack where to delegate any unresolved names. In the above example we’re using Google’s Public DNS; you can replace these entries with your own DNS if you wish. A quick check of this delegation is shown just after this list.
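
Since Haystack publishes port 53/udp on the host, we can check that delegation is working by resolving a public name through it:

# a name outside the haystack domain is delegated to the configured nameservers
nslookup www.google.com $(docker-machine ip default)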

Let’s start a couple of nginx servers to be our web tier, connecting them to the Haystack DNS service:

docker run \
  --dns=$(docker inspect --format='{{.NetworkSettings.IPAddress}}' haystack) \
  --detach nginx

docker run \
  --dns=$(docker inspect --format='{{.NetworkSettings.IPAddress}}' haystack) \
  --detach nginx

The --dns option tells each nginx container to resolve DNS using the Haystack container.

Haystack maintains a DNS SRV record for each endpoint exposed by a container. For example, starting an Nginx container will automatically populate an appropriate service record into Haystack’s DNS using the IP address and port exposed by that container. In addition, a DNS A record for that service is created which points to Haystack’s Load Balancer.

# nslookup -query=srv _http._tcp.nginx.services.haystack $(docker-machine ip default)

_http._tcp.nginx.services.haystack service = 100 100 80 c9pcjp7.containers.haystack.
_http._tcp.nginx.services.haystack	service = 100 100 80 c5162p7.containers.haystack.

# dig @$(docker-machine ip default) nginx.services.haystack a
nginx.services.haystack. 100	IN	A	172.17.0.2

Haystack has automatically created a DNS entry called nginx.services.haystack which is pointing to its own internal load balancing proxy. We can curl that address and the HTTP or WebSocket requests will be load balanced over the available Nginx instances:

docker run \
  --dns=$(docker inspect --format='{{.NetworkSettings.IPAddress}}' haystack) \
  --tty --interactive --rm fedora /bin/bash

[root@a268fc00929d /]# curl http://nginx.services.haystack
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>

The DNS hierarchy created by Haystack is as follows:

Haystack DNS Hierarchy

Services that are available to be load balanced are published under the services.haystack namespace (for example, nginx.services.haystack). Haystack internally uses SRV records for each service instance, which Munchausen (the load balancing proxy) also references. Each Docker container is also registered with a private Haystack name under the containers.haystack namespace.
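
Those private container names resolve in the same way; for example, taking one of the container records from the SRV response above (the generated name will differ in your environment):

nslookup c9pcjp7.containers.haystack $(docker-machine ip default)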