Automatic Load Balancing Of Spring Boot Micro Services In Docker With Haystack

In Spring Boot Micro Services In Docker we saw how to create a Docker container from a Spring Boot Micro Service. In this article we shall look at load balancing the service using Haystack. Haystack is a DNS based load balancer that integrates with the Docker API, automatically creating service groups as containers start and stop.

Haystack monitors the Docker API over a TCP socket. In this case the Docker API is listening on port 2375 on the Docker host.

docker run \
  -e DOCKER_HOST=tcp://172.16.1.218:2375 \
  --name haystack \
   --detach \
   shortishly/haystack
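
The command above assumes the Docker daemon is already reachable at tcp://172.16.1.218:2375. If yours is not, the daemon needs an additional TCP listener; a rough sketch (the exact mechanism depends on your distribution and Docker version, and on systemd hosts is usually configured via a drop-in unit rather than run by hand):

dockerd \
  -H unix:///var/run/docker.sock \
  -H tcp://0.0.0.0:2375

Note that an unauthenticated TCP listener like this should only be exposed on a trusted network.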

Build the gsa image by following the instructions in that article, then start 3 instances of our Spring Boot Micro Service:

docker run --name srv-001 -d gsa
docker run --name srv-002 -d gsa
docker run --name srv-003 -d gsa

We can confirm that we have 3 gsa services by running docker ps as follows:

$ docker ps -a --format="{{.Names}}"

srv-003
srv-002
srv-001
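
Because -a also lists stopped containers, you may prefer to narrow the listing down to just the gsa instances; a small sketch, assuming the srv- naming convention used above:

docker ps --filter "name=srv-" --format "{{.Names}}"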

Haystack automatically creates a service group srv.gsa.services.haystack in DNS. Starting a new gsa container will automatically add it to the service group. Stopping a gsa container will automatically remove it from the service group.

Start a busybox instance using Haystack’s embedded DNS:

docker run \
  --dns=$(docker inspect --format='{{.NetworkSettings.IPAddress}}' haystack) \
  --tty \
  --interactive \
  --rm busybox /bin/sh

Lookup srv.gsa.services.haystack in DNS:

nslookup srv.gsa.services.haystack

Server:    172.17.0.7
Address 1: 172.17.0.7 cb1i9a6.containers.haystack

Name:      srv.gsa.services.haystack
Address 1: 172.17.0.7 cb1i9a6.containers.haystack

Note that srv.gsa.services.haystack actually points to the Haystack container. This is because Haystack acts as a proxy for HTTP requests, automatically load balancing them randomly over the members of the service group. Issue a wget to the service name and the requests will be spread randomly over the members:

# wget -q -O /dev/stdout http://srv.gsa.services.haystack/hello-world
{"id":1,"content":"Hello, Stranger!"}

# wget -q -O /dev/stdout http://srv.gsa.services.haystack/hello-world
{"id":1,"content":"Hello, Stranger!"}

# wget -q -O /dev/stdout http://srv.gsa.services.haystack/hello-world
{"id":2,"content":"Hello, Stranger!"}

# wget -q -O /dev/stdout http://srv.gsa.services.haystack/hello-world
{"id":1,"content":"Hello, Stranger!"}

# wget -q -O /dev/stdout http://srv.gsa.services.haystack/hello-world
{"id":3,"content":"Hello, Stranger!"}

# wget -q -O /dev/stdout http://srv.gsa.services.haystack/hello-world
{"id":2,"content":"Hello, Stranger!"}

# wget -q -O /dev/stdout http://srv.gsa.services.haystack/hello-world
{"id":4,"content":"Hello, Stranger!"}

# wget -q -O /dev/stdout http://srv.gsa.services.haystack/hello-world
{"id":3,"content":"Hello, Stranger!"}

# wget -q -O /dev/stdout http://srv.gsa.services.haystack/hello-world
{"id":4,"content":"Hello, Stranger!"}

# wget -q -O /dev/stdout http://srv.gsa.services.haystack/hello-world
{"id":5,"content":"Hello, Stranger!"}

Congratulations! You now have a Spring Boot Micro Service Docker container that is automatically being load balanced by Haystack. You can add or remove further gsa services and Haystack will automatically update the service pool.
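
For example (a sketch, reusing the gsa image and naming from above), scaling the pool is just a matter of starting and stopping containers; Haystack should pick up each change without any further configuration:

docker run --name srv-004 -d gsa    # a fourth instance joins srv.gsa.services.haystack
docker stop srv-004                 # and is removed from the service group again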

Docker Service Discovery and Load Balancing with Haystack

Haystack provides service discovery and automatic HTTP load balancing in a Docker environment. It uses its own DNS to provide service discovery and dynamically updates its embedded load balancer as services are started or stopped.

Haystack solves two problems:

  • Service Discovery – Haystack automatically manages service registration in DNS by monitoring the Docker container lifecycle. It introspects container metadata and exposes the available services in its own DNS, providing a common DNS that other services can use for discovery.
  • Load Balancing – Haystack registers DNS service names that are load balanced over the available instances as a service is scaled up or down to meet demand. Each service name is registered in Haystack’s own DNS, and Haystack balances requests over the containers providing that service, from one instance to thousands and back again, dynamically as demand rises and falls.

This post walks through a demo of some of the capabilities that Haystack provides using a simple HTTP based micro service. The service accepts HTTP GET requests and responds with the hostname of the container and the HTTP path used in the request.

To start a number of services to demonstrate load balancing in Haystack:

for i in {1..5}; do
  docker run \
    --name demo-$(printf %03d $i) \
    -d shortishly/haystack_demo
done

Start Haystack – replace 172.16.1.218 with the address of your Docker Engine, which should already be listening on a TCP port.

docker run \
    -p 8080:80 \
    -e DOCKER_HOST=tcp://172.16.1.218:2375 \
    -d \
    --name haystack \
    shortishly/haystack

You should now have Haystack and 5 demo micro services running within Docker:

# docker ps --format="{{.Names}}"

haystack
demo-001
demo-002
demo-003
demo-004
demo-005

Now start a busybox that uses Haystack for DNS resolution as follows:

docker run \
  --dns=$(docker inspect --format='{{.NetworkSettings.IPAddress}}' haystack) \
  --tty \
  --interactive \
  --rm busybox /bin/sh

In the busybox shell:

wget \
   -q \
   -O /dev/stdout \
    http://demo.haystack_demo.services.haystack/this/is/a/demo

Haystack has registered demo.haystack_demo.services.haystack in its own DNS service, and is load balancing each request randomly to one of the demo-001, demo-002, demo-003, demo-004 or demo-005 containers.

If you make a number of wget requests to the same URL you will get responses from the different containers at random:

# wget -q -O /dev/stdout http://demo.haystack_demo.services.haystack/load/balancing
d617596e70da: /load/balancing
# wget -q -O /dev/stdout http://demo.haystack_demo.services.haystack/load/balancing
96c6c6f27f03: /load/balancing
# wget -q -O /dev/stdout http://demo.haystack_demo.services.haystack/load/balancing
296b5208edf9: /load/balancing
# wget -q -O /dev/stdout http://demo.haystack_demo.services.haystack/load/balancing
1166e110e70d: /load/balancing
# wget -q -O /dev/stdout http://demo.haystack_demo.services.haystack/load/balancing
9909343b937a: /load/balancing

You can verify which services are available in Haystack by curling /api/info:

curl -s http://$(docker inspect --format='{{.NetworkSettings.IPAddress}}' haystack)/api/info|python -m json.tool

Use your browser to visit the Haystack UI running on your Docker host:

http://YOUR_DOCKER_HOST:8080/ui

A needle in a…

Just about everyone is using Docker. As a developer, it radically simplifies how I package my code. However, in a tiered architecture, how is my presentation layer load balanced over my API layer? Out of the box, Docker linking is to a single container and doesn’t provide load balancing. I’d like to introduce Haystack, which provides automatic service discovery and load balancing of HTTP or WebSocket based services composed of Docker containers, from a single node to a Swarm.

High Level Architecture

Haystack does this by monitoring the event stream of a single node or Swarm, noticing when containers start or stop (gracefully or otherwise). On startup a container is registered in Haystack’s DNS, and any HTTP or WebSocket endpoints are automatically added to its load balancing proxy. Later, when that same container stops, Haystack automatically removes it from the load balancer and drops its entry from DNS. Containers are organised into groups, partitioning services into tiers, which may then be load balanced over all nodes with an overlay network.
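
You can watch the same container lifecycle events yourself with the Docker CLI; a quick sketch (the filters shown are illustrative, and the exact event names can vary between Docker versions):

docker events \
  --filter type=container \
  --filter event=start \
  --filter event=die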

Haystack is itself a Docker container which may be run on a single node, or on every node in a swarm, providing load balancing that is local to the node or spans other nodes in the cluster. We can create a simple Haystack environment using Docker Machine as follows:

# create a new docker machine environment called "default"
docker-machine create --driver virtualbox default

# use that environment in this shell
eval $(docker-machine env default)

# start Haystack connecting it to the Docker daemon
docker run -e DOCKER_HOST=${DOCKER_HOST} \
  -e DOCKER_KEY="$(cat ${DOCKER_CERT_PATH}/key.pem)" \
  -e DOCKER_CERT="$(cat ${DOCKER_CERT_PATH}/cert.pem)" \
  -e SHELLY_AUTHORIZED_KEYS="$(cat ~/.ssh/authorized_keys)" \
  -e DNS_NAMESERVERS=8.8.8.8:8.8.4.4 \
  --name=haystack \
  --publish=53:53/udp \
  --publish=80:80 \
  --publish=8080:8080 \
  --publish=22022:22 \
  --detach \
  shortishly/haystack

Haystack has 3 blocks of environment configuration in the above command:

  • The DOCKER_HOST, DOCKER_KEY and DOCKER_CERT variables enable Haystack to communicate with the Docker daemon using TLS.
  • The line containing SHELLY_AUTHORIZED_KEYS is optional. It copies your public keys into the Haystack container so that you can SSH directly into Haystack to perform maintenance or run tracing if things aren’t working as expected. If you’re familiar with the Erlang shell you can do so right now with ssh -p 22022 $(docker-machine ip default).
  • Finally, DNS_NAMESERVERS tells Haystack where to delegate any unresolved names. In the example above we’re using Google’s Public DNS; you can replace these entries with your own DNS if you wish (see the quick check after this list).
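
As a quick check of that delegation (a sketch, assuming the haystack container is running as above), a name outside the haystack domain should still resolve from a container that uses Haystack for DNS:

docker run \
  --dns=$(docker inspect --format='{{.NetworkSettings.IPAddress}}' haystack) \
  --rm busybox nslookup www.google.com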

Let’s start a couple of Nginx servers to be our web tier, connecting them to the Haystack DNS service:

docker run \
  --dns=$(docker inspect --format='{{.NetworkSettings.IPAddress}}' haystack) \
  --detach nginx

docker run \
  --dns=$(docker inspect --format='{{.NetworkSettings.IPAddress}}' haystack) \
  --detach nginx

The --dns option tells the Nginx Docker containers to resolve DNS using the Haystack container.
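
To see the effect (a quick sketch, assuming the most recently started container is one of the Nginx instances above), the container’s resolver configuration should now point at the Haystack IP:

docker exec $(docker ps -lq) cat /etc/resolv.conf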

Haystack maintains a DNS SRV record for each endpoint exposed by a container. For example, starting an Nginx container will automatically add an appropriate service record to Haystack’s DNS using the IP address and port exposed by that container. In addition, a DNS A record for that service is created which points to Haystack’s load balancer.

# nslookup -query=srv _http._tcp.nginx.services.haystack $(docker-machine ip default)

_http._tcp.nginx.services.haystack service = 100 100 80 c9pcjp7.containers.haystack.
_http._tcp.nginx.services.haystack	service = 100 100 80 c5162p7.containers.haystack.

# dig @$(docker-machine ip default) nginx.services.haystack a
nginx.services.haystack. 100	IN	A	172.17.0.2

Haystack has automatically created a DNS entry called nginx.services.haystack which points to its own internal load balancing proxy. We can curl that address and the HTTP or WebSocket requests will be load balanced over the available Nginx instances:

docker run \
  --dns=$(docker inspect --format='{{.NetworkSettings.IPAddress}}' haystack) \
  --tty --interactive --rm fedora /bin/bash

[root@a268fc00929d /]# curl http://nginx.services.haystack
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>

The DNS hierarchy created by Haystack is as follows:

Haystack DNS Hierarchy

Services that are available to be load balanced are published under the services.haystack namespace (for example, nginx.services.haystack). Haystack internally uses SRV records for each service instance, which Munchausen (the load balancing proxy) also references. Each docker container is also registered with a private Haystack name under the containers.haystack namespace.
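
To illustrate the containers.haystack namespace, you can resolve one of the per-container names that appeared in the SRV output above directly against Haystack’s DNS (a sketch; the container names on your system will differ):

nslookup c9pcjp7.containers.haystack $(docker-machine ip default)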