Swarm cluster operation procedures¶
Cluster initialisation¶
Note
In most cases, there is no need to initialise another cluster.
Before there is anything, a cluster should be initialised. Simply run the command below on a docker node to initialise a new cluster:
$ docker swarm init
Force a new cluster¶
If the quorum of the cluster is lost (and you are not able to bring the other manager nodes online again), you need to forcefully initialise a new cluster. This can be done on one of the remaining manager nodes using the following command:
$ docker swarm init --force-new-cluster
After this command is issued, a new cluster is created with only one manager (i.e. the one on which you issued the command). All remaining nodes become workers. You will have to add additional manager nodes manually.
Tip
Depending on the number of managers in the cluster, the required quorum (and thus the level of fault tolerance) is different. Check this page for more information.
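As a rough illustration of the tip above: swarm managers use Raft consensus, so a cluster of N managers needs a majority of floor(N/2)+1 managers online and tolerates the loss of floor((N-1)/2) managers. The sketch below computes both numbers. The function names are hypothetical helpers, not docker commands.

```shell
#!/bin/bash
# Hypothetical helpers (not docker commands): Raft majority arithmetic.

# minimum number of managers that must be online in a cluster of N managers
swarm_quorum() {
    echo $(( $1 / 2 + 1 ))
}

# number of managers that can be lost without losing the quorum
swarm_fault_tolerance() {
    echo $(( ($1 - 1) / 2 ))
}

# print the numbers for some typical cluster sizes
for n in 1 3 5 7; do
    echo "managers=${n} quorum=$(swarm_quorum "$n") tolerated_failures=$(swarm_fault_tolerance "$n")"
done
```

Note that an even number of managers does not improve fault tolerance: 4 managers tolerate one failure, just like 3.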
Node operation¶
System provisioning¶
The operating system and the docker engine on the node are provisioned using the DCCN linux-server kickstart. The following kickstart files are used:
- /mnt/install/kickstart-*/ks-*-dccn-dk.cfg: the main kickstart configuration file
- /mnt/install/kickstart-*/postkit-dccn-dk/script-selection: main script to trigger post-kickstart scripts
- /mnt/install/kickstart-*/setup-docker-*: the docker-specific post-kickstart scripts
Configure devicemapper to direct-lvm mode
By default, the devicemapper storage driver of docker runs in the loop-lvm mode, which is known to be suboptimal for performance. In a production environment, the direct-lvm mode is recommended. How to configure the devicemapper to use direct-lvm mode is described here.
Before configuring the direct-lvm mode for the devicemapper, make sure the directory /var/lib/docker is removed. Also make sure the physical volume, volume group and logical volumes are removed, e.g.
$ lvremove /dev/docker/thinpool
$ lvremove /dev/docker/thinpoolmeta
$ vgremove docker
$ pvremove /dev/sdb
Hereafter is a script summarizing all the steps. The script is also available at /mnt/install/kickstart-7/docker/docker-thinpool.sh.
#!/bin/bash

if [ $# -ne 1 ]; then
    echo "USAGE: $0 <device>"
    exit 1
fi

# get raw device path (e.g. /dev/sdb) from the command-line argument
device=$1

# check if the device is available
file -s "${device}" | grep 'cannot open'
if [ $? -eq 0 ]; then
    echo "device not found: ${device}"
    exit 1
fi

# install/update the LVM package
yum install -y lvm2

# create a physical volume on device
pvcreate "${device}"

# create a volume group called 'docker'
vgcreate docker "${device}"

# create logical volumes within the 'docker' volume group: one for data, one for metadata
# assign volume size with respect to the size of the volume group
lvcreate --wipesignatures y -n thinpool docker -l 95%VG
lvcreate --wipesignatures y -n thinpoolmeta docker -l 1%VG
lvconvert -y --zero n -c 512K --thinpool docker/thinpool --poolmetadata docker/thinpoolmeta

# update the lvm profile for volume autoextend
cat >/etc/lvm/profile/docker-thinpool.profile <<EOL
activation {
    thin_pool_autoextend_threshold=80
    thin_pool_autoextend_percent=20
}
EOL

# apply lvm profile
lvchange --metadataprofile docker-thinpool docker/thinpool

lvs -o+seg_monitor

# create daemon.json file to instruct docker to use the created logical volumes
cat >/etc/docker/daemon.json <<EOL
{
    "hosts": ["unix:///var/run/docker.sock", "tcp://0.0.0.0:2375"],
    "storage-driver": "devicemapper",
    "storage-opts": [
        "dm.thinpooldev=/dev/mapper/docker-thinpool",
        "dm.use_deferred_removal=true",
        "dm.use_deferred_deletion=true"
    ]
}
EOL

# remove legacy daemon configuration through docker.service.d to avoid conflict with daemon.json
if [ -f /etc/systemd/system/docker.service.d/swarm.conf ]; then
    mv /etc/systemd/system/docker.service.d/swarm.conf /etc/systemd/system/docker.service.d/swarm.conf.bk
fi

# reload daemon configuration
systemctl daemon-reload
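The script ends with a systemd reload, so the new daemon.json only takes effect once the docker daemon itself is (re)started. One possible way to confirm the result is to ask the daemon which storage driver it uses. The function name below is a hypothetical helper; the sketch assumes the docker CLI can reach the local daemon.

```shell
#!/bin/bash
# Sketch: confirm that the docker daemon uses the devicemapper storage driver.
# check_storage_driver is a hypothetical helper name.
check_storage_driver() {
    local driver
    driver=$(docker info --format '{{.Driver}}')
    if [ "$driver" = "devicemapper" ]; then
        echo "OK: storage driver is ${driver}"
    else
        echo "WARNING: storage driver is '${driver}', expected devicemapper"
        return 1
    fi
}

# Usage, after the docker daemon has been restarted:
#   systemctl restart docker
#   check_storage_driver
```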
Join the cluster¶
After the docker daemon is started, the node should be joined to the cluster. The command used to join the cluster can be retrieved from one of the manager nodes, using the command:
$ docker swarm join-token manager
Note
The example command above obtains the command for joining the cluster as a manager node. For joining the cluster as a worker, replace manager in the command with worker.
After the command is retrieved, it should be run on the node that is about to join the cluster.
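When the join needs to be scripted, the token alone can be printed with the -q (--quiet) option of docker swarm join-token, and the join command composed from it. The following is a minimal sketch: build_join_command is a hypothetical helper that only prints the command, and manager.example.org is a placeholder address.

```shell
#!/bin/bash
# Sketch: compose the join command from a token and a manager address.
# build_join_command is a hypothetical helper; it only prints the command.
build_join_command() {
    local token=$1
    local manager=$2          # hostname or IP address of a manager node
    local port=${3:-2377}     # 2377 is the default swarm management port
    echo "docker swarm join --token ${token} ${manager}:${port}"
}

# Usage on a manager node (manager.example.org is a placeholder):
#   token=$(docker swarm join-token -q worker)
#   build_join_command "$token" manager.example.org
```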
Set Node label¶
Node labels help group nodes by certain features. Currently, the nodes in production are labelled with function=production using the following command:
$ docker node update --label-add function=production <NodeName>
When deploying a service or stack, the label is used to decide on which nodes the service tasks are placed.
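If several nodes need the same label, the command above can be wrapped in a loop. The sketch below is one way to do it. The run and label_nodes helpers and the node names are hypothetical, and setting DRY_RUN=1 prints the docker commands instead of executing them.

```shell
#!/bin/bash
# Sketch: apply the same label to several nodes.
# With DRY_RUN=1 the docker commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# label_nodes <key=value> <NodeName>...
label_nodes() {
    local label=$1
    shift
    local node
    for node in "$@"; do
        run docker node update --label-add "$label" "$node"
    done
}

# Usage on a manager node (node1 and node2 are placeholders):
#   DRY_RUN=1 label_nodes function=production node1 node2
# The labels of a node can be checked afterwards with:
#   docker node inspect --format '{{ .Spec.Labels }}' node1
```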
Leave the cluster¶
Run the following command on the node that is about to leave the cluster.
$ docker swarm leave
If the node is a manager, the option -f (or --force) should also be used in the command.
Note
A node that leaves the cluster is NOT removed automatically from the node table. Instead, the node is marked as Down. If you want the node to be removed from the table, you should run the command docker node rm.
Tip
An alternative way to remove a node from the cluster directly is to run the docker node rm command on a manager node.
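To clean up the node table after nodes have left, the nodes marked as Down can be picked out of the node listing. The sketch below only prints the corresponding docker node rm commands so that they can be reviewed first. down_node_rm_commands is a hypothetical helper that expects one "<hostname> <status>" pair per input line.

```shell
#!/bin/bash
# Sketch: print "docker node rm" commands for nodes reported as Down.
# Input lines are expected as "<hostname> <status>", e.g. produced by:
#   docker node ls --format '{{.Hostname}} {{.Status}}'
down_node_rm_commands() {
    awk '$2 == "Down" { print "docker node rm " $1 }'
}

# Usage on a manager node; review the printed commands before running them:
#   docker node ls --format '{{.Hostname}} {{.Status}}' | down_node_rm_commands
```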
Promote and demote node¶
Nodes in the cluster can be promoted (from worker to manager) or demoted (from manager to worker). This is done by using the commands:
$ docker node promote <WorkerNodeName>
$ docker node demote <ManagerNodeName>
Monitor nodes¶
To list all nodes in the cluster, do
$ docker node ls
To inspect a node, do
$ docker node inspect <NodeName>
To list tasks running on a node, do
$ docker node ps <NodeName>
Service operation¶
In a swarm cluster, a service is created by deploying a container in the cluster. The container can be deployed as a single instance (i.e. task) or as multiple instances to achieve service failover and load balancing.
Start a service¶
To start a service in the cluster, one uses the docker service create command. Hereafter is an example for starting an nginx web service in the cluster using the container image docker-registry.dccn.nl:5000/nginx:1.0.0:
$ docker login docker-registry.dccn.nl:5000
$ docker service create \
    --name webapp-proxy \
    --replicas 2 \
    --publish 8080:80/tcp \
    --constraint "node.labels.function == production" \
    --mount "type=bind,source=/mnt/docker/webapp-proxy/conf,target=/etc/nginx/conf.d" \
    --with-registry-auth \
    docker-registry.dccn.nl:5000/nginx:1.0.0
The options used above are explained in the following table:
option | function |
---|---|
--name | set the service name to webapp-proxy |
--replicas | deploy 2 tasks in the cluster for failover and load balancing |
--publish | map the container's tcp port 80 to port 8080, and expose it to the world |
--constraint | restrict the tasks to run on nodes labelled with function = production |
--mount | mount host's /mnt/docker/webapp-proxy/conf to container's /etc/nginx/conf.d |
More options can be found here.
Remove a service¶
Use docker service rm <ServiceName> to remove a running service from the cluster. Removing a production service is unusual.
Tip
In most cases, you should consider updating the service rather than removing it.
Update a service¶
It is very common to update a production service. Typical situations in which you will need to update a service are:
- a new node is being added to the cluster, and you want to move a running service onto it, or
- a new container image is being provided (e.g. software update or configuration changes) and you want to update the service to this new version, or
- you want to create more tasks of the service in the cluster to distribute the load.
To update a service, one uses the command docker service update. The following example updates the webapp-proxy service to use a new version of the nginx image, docker-registry.dccn.nl:5000/nginx:1.2.0:
$ docker service update \
--image docker-registry.dccn.nl:5000/nginx:1.2.0 \
webapp-proxy
More options can be found here.
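The other two situations listed above (distributing load over more tasks, and moving tasks onto a newly added node) can be handled with the same command. The sketch below wraps them in two hypothetical helper functions. It relies on the --replicas and --force options of docker service update, and setting DRY_RUN=1 prints the commands instead of executing them.

```shell
#!/bin/bash
# Sketch: other common updates of a service.
# With DRY_RUN=1 the docker commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# scale_service <ServiceName> <NumberOfTasks>
scale_service() {
    run docker service update --replicas "$2" "$1"
}

# rebalance_service <ServiceName>
# Forces a re-deployment of the tasks without changing the service definition,
# which gives the scheduler a chance to place tasks on newly added nodes.
rebalance_service() {
    run docker service update --force "$1"
}

# Usage:
#   DRY_RUN=1 scale_service webapp-proxy 4
#   DRY_RUN=1 rebalance_service webapp-proxy
```

A forced update restarts the tasks of the service, so it is best done at a quiet moment.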
Monitor services¶
To list all running services:
$ docker service ls
To list tasks of a service:
$ docker service ps <ServiceName>
To inspect a service:
$ docker service inspect <ServiceName>
To retrieve logs written to STDOUT/STDERR by the service process, one could do:
$ docker service logs [-f] <ServiceName>
where the option -f is used to follow the output.
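For a quick health check, the service listing can be filtered for services that run fewer tasks than requested. The sketch below assumes input lines of the form "<name> <running>/<desired>", which is what the Name and Replicas columns of docker service ls provide. under_replicated_services is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: print the names of services running fewer tasks than desired.
# Input lines are expected as "<name> <running>/<desired>", e.g. produced by:
#   docker service ls --format '{{.Name}} {{.Replicas}}'
under_replicated_services() {
    awk '{ split($2, r, "/"); if (r[1] + 0 < r[2] + 0) print $1 }'
}

# Usage on a manager node:
#   docker service ls --format '{{.Name}} {{.Replicas}}' | under_replicated_services
```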
Stack operation¶
A stack is usually defined as a group of related services. The definition is described using the docker-compose version 3 specification.
Here is an example of defining the three services of the DCCN data-stager.
Using the docker stack command you can manage multiple services in one consistent manner.
Deploy (update) a stack¶
Assuming the docker-compose file is called docker-compose.yml, the services defined in it can be launched in the swarm cluster with:
$ docker login docker-registry.dccn.nl:5000
$ docker stack deploy -c docker-compose.yml --with-registry-auth <StackName>
When there is an update in the stack description file (e.g. docker-compose.yml), one can use the same command to apply changes on the running stack.
Note
Every stack is created with its own overlay network in swarm, and the services of the stack are organised within that network. The name of the network is <StackName>_default.
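To give an idea of what a stack description file may contain, the sketch below writes a minimal docker-compose version 3 file that roughly mirrors the webapp-proxy service from the "Start a service" section. It is a hypothetical illustration, not the definition of any DCCN stack. The service name proxy and the helper write_example_stack are assumptions.

```shell
#!/bin/bash
# Sketch: write a minimal stack description file (hypothetical, for illustration).
# write_example_stack [<file>]   (default: docker-compose.yml)
write_example_stack() {
    local file=${1:-docker-compose.yml}
    cat >"$file" <<'EOL'
version: "3"
services:
  proxy:
    image: docker-registry.dccn.nl:5000/nginx:1.0.0
    ports:
      - "8080:80"
    volumes:
      - /mnt/docker/webapp-proxy/conf:/etc/nginx/conf.d
    deploy:
      replicas: 2
      placement:
        constraints:
          - node.labels.function == production
EOL
}

# Usage:
#   write_example_stack docker-compose.yml
#   docker stack deploy -c docker-compose.yml --with-registry-auth <StackName>
```

When deployed as a stack, the service appears in the cluster as <StackName>_proxy.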
Remove a stack¶
Use the following command to remove a stack from the cluster:
$ docker stack rm <StackName>
Monitor stacks¶
To list all running stacks:
$ docker stack ls
To list all services in a stack:
$ docker stack services <StackName>
To list all tasks of the services in a stack:
$ docker stack ps <StackName>
Emergency shutdown¶
Note
The emergency shutdown should take place before the network and the central storage are down.
- log in to one manager
- demote the other managers
- remove running stacks and services
- shut down all workers
- shut down the manager
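The steps above could be scripted roughly as follows. This is a sketch under assumptions, not a tested procedure: it assumes it runs on the manager that stays, that the workers can be reached as root via SSH by their hostname, and that shutdown -h now is the desired way to power off. The helpers run and emergency_shutdown are hypothetical. With DRY_RUN=1 the state-changing commands are only printed.

```shell
#!/bin/bash
# Sketch of the emergency shutdown steps, to be run on the manager that stays.
# With DRY_RUN=1 the state-changing commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

emergency_shutdown() {
    local node stack service

    # demote all managers except the one this script runs on
    for node in $(docker node ls --filter role=manager \
                      --format '{{.Hostname}} {{.Self}}' |
                  awk '$2 != "true" { print $1 }'); do
        run docker node demote "$node"
    done

    # remove running stacks, then any remaining standalone services
    for stack in $(docker stack ls --format '{{.Name}}'); do
        run docker stack rm "$stack"
    done
    for service in $(docker service ls -q); do
        run docker service rm "$service"
    done

    # shut down all workers (ASSUMPTION: root SSH access by hostname)
    for node in $(docker node ls --filter role=worker --format '{{.Hostname}}'); do
        run ssh "root@${node}" shutdown -h now
    done

    # finally shut down this manager
    run shutdown -h now
}

# Usage: review the plan first, then run it for real
#   DRY_RUN=1 emergency_shutdown
#   emergency_shutdown
```

In a dry run the demoted managers are not yet listed as workers, so the printed plan shows fewer SSH shutdowns than a real run would perform.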
Reboot from shutdown¶
Note
During several network outages in 2017 and 2018, the cluster nodes were not reachable and required a hard reboot (i.e. pushing the power button). In those cases, the emergency shutdown procedure was not followed. Interestingly, the cluster recovered automatically once a sufficient number of manager nodes came back online. All services were also re-deployed immediately without any human intervention.
One thing to note is that if a network outage makes the NFS mount at /mnt/docker inaccessible, one may need to reboot the machines once the network connectivity is recovered, as they can be unresponsive due to the hanging NFS connections.
- boot the manager node (the last one that was shut down)
- boot the other nodes
- promote nodes until the desired number of managers is reached
- deploy the docker-registry stack first
$ cd /mnt/docker/scripts/microservices/registry/
$ sudo ./start.sh
Note
The docker-registry stack should be made available first, as other services/stacks will need to pull container images from it.
- deploy other stacks and services
Disaster recovery¶
Hopefully there is no need to go through it!
For the moment, we are not backing up the state of the swarm cluster. Given that the container data is stored (and backed up) on the central storage, the impact of losing a cluster is not dramatic: as long as the container data is available, all services can be restarted on a fresh new cluster.
Nevertheless, here are the official instructions for disaster recovery.