2018-01-07 14:54:03 +00:00
# Clustering / High Availability on Docker Swarm with Consul
This guide explains how to use Træfik in high availability mode in a Docker Swarm and with Let's Encrypt.
Why do we need Træfik in cluster mode? Running multiple instances should work out of the box?
If you want to use Let's Encrypt with Træfik, sharing configuration or TLS certificates between many Træfik instances, you need Træfik cluster/HA.
Ok, could we mount a shared volume used by all my instances? Yes, you can, but it will not work.
When you use Let's Encrypt, you need to store certificates, but not only.
When Træfik generates a new certificate, it configures a challenge and once Let's Encrypt will verify the ownership of the domain, it will ping back the challenge.
2018-06-11 14:26:03 +00:00
If the challenge is not known by other Træfik instances, the validation will fail.
2018-01-07 14:54:03 +00:00
2018-06-11 14:26:03 +00:00
For more information about the challenge: [Automatic Certificate Management Environment (ACME) ](https://github.com/ietf-wg-acme/acme/blob/master/draft-ietf-acme-acme.md#http-challenge )
2018-01-07 14:54:03 +00:00
## Prerequisites
You will need a working Docker Swarm cluster.
## Træfik configuration
In this guide, we will not use a TOML configuration file, but only command line flag.
With that, we can use the base image without mounting configuration file or building custom image.
What Træfik should do:
- Listen to 80 and 443
- Redirect HTTP traffic to HTTPS
- Generate SSL certificate when a domain is added
- Listen to Docker Swarm event
### EntryPoints configuration
TL;DR:
```shell
$ traefik \
2018-02-05 07:54:03 +00:00
--entrypoints='Name:http Address::80 Redirect.EntryPoint:https' \
--entrypoints='Name:https Address::443 TLS' \
2018-01-07 14:54:03 +00:00
--defaultentrypoints=http,https
```
To listen to different ports, we need to create an entry point for each.
2018-02-05 07:54:03 +00:00
The CLI syntax is `--entrypoints='Name:a_name Address:an_ip_or_empty:a_port options'` .
2018-01-07 14:54:03 +00:00
If you want to redirect traffic from one entry point to another, it's the option `Redirect.EntryPoint:entrypoint_name` .
By default, we don't want to configure all our services to listen on http and https, we add a default entry point configuration: `--defaultentrypoints=http,https` .
### Let's Encrypt configuration
TL;DR:
```shell
$ traefik \
--acme \
--acme.storage=/etc/traefik/acme/acme.json \
--acme.entryPoint=https \
2018-01-15 15:04:05 +00:00
--acme.httpChallenge.entryPoint=http \
2018-01-07 14:54:03 +00:00
--acme.email=contact@mydomain.ca
```
2018-01-15 15:04:05 +00:00
Let's Encrypt needs 4 parameters: an TLS entry point to listen to, a non-TLS entry point to allow HTTP challenges, a storage for certificates, and an email for the registration.
2018-01-07 14:54:03 +00:00
To enable Let's Encrypt support, you need to add `--acme` flag.
Now, Træfik needs to know where to store the certificates, we can choose between a key in a Key-Value store, or a file path: `--acme.storage=my/key` or `--acme.storage=/path/to/acme.json` .
2018-01-15 15:04:05 +00:00
The `acme.httpChallenge.entryPoint` flag enables the `HTTP-01` challenge and specifies the entryPoint to use during the challenges.
2018-01-07 14:54:03 +00:00
For your email and the entry point, it's `--acme.entryPoint` and `--acme.email` flags.
### Docker configuration
TL;DR:
```shell
$ traefik \
--docker \
2018-04-06 07:38:03 +00:00
--docker.swarmMode \
2018-01-07 14:54:03 +00:00
--docker.domain=mydomain.ca \
--docker.watch
```
2018-04-06 07:38:03 +00:00
To enable docker and swarm-mode support, you need to add `--docker` and `--docker.swarmMode` flags.
2018-01-07 14:54:03 +00:00
To watch docker events, add `--docker.watch` .
### Full docker-compose file
```yaml
version: "3"
services:
traefik:
image: traefik:1.5
command:
2018-01-11 08:46:03 +00:00
- "--api"
2018-02-09 17:16:04 +00:00
- "--entrypoints=Name:http Address::80 Redirect.EntryPoint:https"
- "--entrypoints=Name:https Address::443 TLS"
2018-01-07 14:54:03 +00:00
- "--defaultentrypoints=http,https"
- "--acme"
- "--acme.storage=/etc/traefik/acme/acme.json"
- "--acme.entryPoint=https"
2018-01-15 15:04:05 +00:00
- "--acme.httpChallenge.entryPoint=http"
2018-04-06 07:38:03 +00:00
- "--acme.onHostRule=true"
2018-01-07 14:54:03 +00:00
- "--acme.onDemand=false"
- "--acme.email=contact@mydomain.ca"
- "--docker"
2018-04-06 07:38:03 +00:00
- "--docker.swarmMode"
2018-01-07 14:54:03 +00:00
- "--docker.domain=mydomain.ca"
- "--docker.watch"
volumes:
- /var/run/docker.sock:/var/run/docker.sock
networks:
- webgateway
- traefik
ports:
- target: 80
published: 80
mode: host
- target: 443
published: 443
mode: host
- target: 8080
published: 8080
mode: host
deploy:
mode: global
placement:
constraints:
- node.role == manager
update_config:
parallelism: 1
delay: 10s
restart_policy:
condition: on-failure
networks:
webgateway:
driver: overlay
external: true
traefik:
driver: overlay
```
## Migrate configuration to Consul
We created a special Træfik command to help configuring your Key Value store from a Træfik TOML configuration file and/or CLI flags.
## Deploy a Træfik cluster
The best way we found is to have an initializer service.
This service will push the config to Consul via the `storeconfig` sub-command.
This service will retry until finishing without error because Consul may not be ready when the service tries to push the configuration.
The initializer in a docker-compose file will be:
```yaml
traefik_init:
image: traefik:1.5
command:
- "storeconfig"
2018-01-11 08:46:03 +00:00
- "--api"
2018-01-07 14:54:03 +00:00
[...]
- "--consul"
- "--consul.endpoint=consul:8500"
- "--consul.prefix=traefik"
networks:
- traefik
deploy:
restart_policy:
condition: on-failure
depends_on:
- consul
```
And now, the Træfik part will only have the Consul configuration.
```yaml
traefik:
image: traefik:1.5
depends_on:
- traefik_init
- consul
command:
- "--consul"
- "--consul.endpoint=consul:8500"
- "--consul.prefix=traefik"
[...]
```
!!! note
For Træfik < 1.5.0 add `acme.storage=traefik/acme/account` because Træfik is not reading it from Consul .
If you have some update to do, update the initializer service and re-deploy it.
The new configuration will be stored in Consul, and you need to restart the Træfik node: `docker service update --force traefik_traefik` .
## Full docker-compose file
```yaml
version: "3.4"
services:
traefik_init:
image: traefik:1.5
command:
- "storeconfig"
2018-01-11 08:46:03 +00:00
- "--api"
2018-02-09 17:16:04 +00:00
- "--entrypoints=Name:http Address::80 Redirect.EntryPoint:https"
- "--entrypoints=Name:https Address::443 TLS"
2018-01-07 14:54:03 +00:00
- "--defaultentrypoints=http,https"
- "--acme"
- "--acme.storage=traefik/acme/account"
- "--acme.entryPoint=https"
2018-01-15 15:04:05 +00:00
- "--acme.httpChallenge.entryPoint=http"
2018-04-06 07:38:03 +00:00
- "--acme.onHostRule=true"
2018-01-07 14:54:03 +00:00
- "--acme.onDemand=false"
2018-01-15 15:04:05 +00:00
- "--acme.email=foobar@example.com"
2018-01-07 14:54:03 +00:00
- "--docker"
2018-04-06 07:38:03 +00:00
- "--docker.swarmMode"
2018-01-15 15:04:05 +00:00
- "--docker.domain=example.com"
2018-01-07 14:54:03 +00:00
- "--docker.watch"
- "--consul"
- "--consul.endpoint=consul:8500"
- "--consul.prefix=traefik"
networks:
- traefik
deploy:
restart_policy:
condition: on-failure
depends_on:
- consul
traefik:
image: traefik:1.5
depends_on:
- traefik_init
- consul
command:
- "--consul"
- "--consul.endpoint=consul:8500"
- "--consul.prefix=traefik"
volumes:
- /var/run/docker.sock:/var/run/docker.sock
networks:
- webgateway
- traefik
ports:
- target: 80
published: 80
mode: host
- target: 443
published: 443
mode: host
- target: 8080
published: 8080
mode: host
deploy:
mode: global
placement:
constraints:
- node.role == manager
update_config:
parallelism: 1
delay: 10s
restart_policy:
condition: on-failure
consul:
image: consul
command: agent -server -bootstrap-expect=1
volumes:
- consul-data:/consul/data
environment:
- CONSUL_LOCAL_CONFIG={"datacenter":"us_east2","server":true}
- CONSUL_BIND_INTERFACE=eth0
- CONSUL_CLIENT_INTERFACE=eth0
deploy:
replicas: 1
placement:
constraints:
- node.role == manager
restart_policy:
condition: on-failure
networks:
- traefik
networks:
webgateway:
driver: overlay
external: true
traefik:
driver: overlay
volumes:
consul-data:
driver: [not local]
```