Using Ansible and Docker for home servers

I run two Raspberry Pi servers on my home network: one connected to hard drives to provide network-attached storage, and another to monitor services, energy, and water usage. In setting up the servers, I wanted to keep all of the configuration under version control so I could easily roll back mistakes. Repeatability was also important in case I need to swap out one of the servers due to hardware failure. Finally, I wanted a system that was easy to pick up, as I tend to rotate between interests and might come back to the servers with no memory of how they work.

I ultimately chose Ansible for running tasks on each server, bringing in Docker for any long-running services. For instance, I install ZFS on the storage server with Ansible, but run a service that monitors SMART statistics using Docker. While the two somewhat-competing systems add a bit more cognitive overhead to the setup, there are significant benefits from using both. (I tried and failed at using Nix for this. I could not get the SD card flasher to cross-compile a ZFS-capable Linux kernel from a Fedora laptop in any reasonable amount of time, so I gave up. It was glorious when it was working, though, and I hope to come back to it someday.)

Rationale

Ansible isn’t perfect: the use of YAML for everything is a choice and its tasks aren’t idempotent in practice. It’s also not a configuration language (it’s a task runner), so changes may need to be rolled back manually if they start up a service or install a package. The work people have done with it, however, has a strong sense of pragmatism and I usually don’t need to care what language it’s written in. (I write, and review, enough Python at my day job, thank you very much.) There are a lot of overlapping features, so, as with many organically grown software packages, you need to use the correct subset of them to avoid pitfalls.

Additionally, Docker paired with docker-compose files has enough mindshare and momentum that most services come with some kind of Dockerfile and possibly an image pre-built on Docker Hub or GitHub Container Registry. I’ve found it better to use containers from the creators of the service than to rely on third-party volunteer efforts for packaging software.

The “Docker-native” approach to orchestration would be using Kubernetes and Helm, but I’ve read that they aren’t designed for a heterogeneous network with long-term storage like I have. There are ways to make it work, but it feels more appropriate for a cluster environment, where each node is interchangeable. (As I add more one-off services to my system that do small tasks, I may add k3s to my setup.) I’m trying to go with the grain for the system integration here; I’ve spent enough of my youth trying to make software do things it wasn’t designed to do.

Ansible

That’s the rationale for these choices, but how do you actually use Ansible to set up servers? The Ansible documentation has a lot of extra details that don’t matter and most of it is reference-style anyway. Actually solving problems with these systems is challenging thanks to the degrading quality of search results and the SEO blog spam that dominates buzzword-heavy technologies. (I would be amazed if this note ranks anywhere in the top-dozen pages of results, unfortunately.) You need some familiarity with YAML for the following descriptions to make sense, and apologies if this is your first exposure to the language.

Ansible needs to know what servers you want to work with, using an “inventory” file. An inventory file named inventory.yml could have the following contents:

group-a:
  hosts:
    host-a:
      ansible_host: 192.168.1.10

group-b:
  hosts:
    host-b:
      ansible_host: 192.168.1.11

group-c:
  children:
    group-a:
    group-b:

This sets up a hierarchy to group servers together, but I didn’t end up needing to use that for my simple server setup. You must provide the inventory to commands like ansible-playbook, either with the --inventory option or by adding inventory = <file> to the [defaults] section of a file named ansible.cfg in the directory that Ansible is run from.
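
To illustrate what the group hierarchy buys you, here is a minimal sketch of a play that targets the parent group. The file name ping.yml and the play and task names are made up for this example; only the group and host names come from the inventory above.

```yaml
# ping.yml (hypothetical file name)
# Targeting group-c reaches every host in its child groups,
# which for the inventory above means host-a and host-b.
- hosts: group-c
  name: Checking connectivity
  tasks:
    - name: Pinging each host
      # ansible.builtin.ping only verifies that Ansible can log in
      # and run Python on the host; it changes nothing.
      ansible.builtin.ping:
```

It could be run with ansible-playbook --inventory inventory.yml ping.yml, or without the option if ansible.cfg already names the inventory.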

Playbooks, tasks, and roles

Ansible runs “playbooks” with the ansible-playbook command. Playbooks are YAML files that describe actions to take on groups or hosts in your inventory. There’s no real naming scheme for them, but I chose to name my “set everything up and hope it’s idempotent” one main.yml. Mine looks sort of like this:

- hosts: all
  name: Updating apt Cache
  tasks:
    - name: Updating apt Cache
      become: true
      ansible.builtin.apt:
        update_cache: true
        cache_valid_time: 3600
      when:
        - ansible_facts.os_family == "Debian"

- hosts: host-a
  name: Setting up host-a
  roles:
    - role: docker
    - role: pihole

Each item in the top-level array is a set of sequential tasks or roles to apply to hosts. A task is the atomic unit of change in Ansible: it could be a command to run, an apt package to install, or a directory to synchronize with the host. There are a lot of built-in templates for tasks, but it’s apparently possible to write your own template. I never had to do that, though, and it sounds like it could involve Python, which I’m trying to avoid here. Tasks have the following structure:

- name: <some-human-readable-description-of-whats-happening>
  when: <a-condition-for-whether-the-task-should-run>
  <template-name>:
    <template-variable>: <value>

There are other possibilities, like become (to run the task as root), but those are the main ones at the top-level task structure. The when field is there because tasks are supposed to be idempotent, that is, running them multiple times doesn’t duplicate an effect. Running a task when its end result is already in place should do nothing. This is really hard to achieve in practice but is a laudable goal. Some of the naming schemes are odd because of this, too. I don’t use when very much but I probably should go back and do another pass on my tasks to add it.
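
As a concrete instance of that structure, here is a minimal sketch of a single task. The package choice (smartmontools) is only an illustration and not necessarily something from the setup described here.

```yaml
- name: Installing smartmontools
  # become runs the task as root, which apt needs.
  become: true
  # Skip the task entirely on hosts that are not Debian-based.
  when: ansible_facts.os_family == "Debian"
  ansible.builtin.apt:
    name: smartmontools
    # "present" only installs when the package is missing, so a
    # second run reports "ok" instead of "changed".
    state: present
```

Modules like this one carry most of the idempotency themselves; when is more useful for skipping tasks that do not apply to a host at all.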

I only used half a dozen task templates (Ansible calls these “modules”) to set up the servers.

While it’s perfectly fine for playbooks to be just straight lists of tasks, and I started out doing that, at some point it’s easier to maintain when tasks can be split into multiple files. There’s an ansible.builtin.include_tasks template which I was using to great effect, but eventually my templates were spread across different directories and it got hard to manage. (For an example of how this can get out of hand, see the internet-pi project, which shares many of the goals of my setup. The repository has a top-level tasks directory that the main.yml playbook references using ansible.builtin.import_tasks, and every service’s template file is in the templates directory. I find it a lot nicer to colocate the same service’s files under a single directory.)
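
For reference, splitting tasks into files without roles looks roughly like the sketch below. The file names under tasks/ are hypothetical; the two modules are the ones mentioned above.

```yaml
- hosts: host-a
  name: Setting up host-a
  tasks:
    # import_tasks is static: the file is read when the playbook
    # is parsed, as if its tasks were written inline here.
    - name: Importing Docker tasks
      ansible.builtin.import_tasks: tasks/docker.yml

    # include_tasks is dynamic: the file is read only when
    # execution reaches this point.
    - name: Including Pi-hole tasks
      ansible.builtin.include_tasks: tasks/pihole.yml
```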

That’s where “roles” come in: they package up related tasks and resources to achieve some result (like “get Grafana running”). My Grafana role has the following directory structure:

ansible/roles/grafana/tasks/main.yml
ansible/roles/grafana/files/dashboards/services.json
ansible/roles/grafana/templates/docker-compose.yml.j2

The templates/docker-compose.yml.j2 file is a template that can be referenced in the src field by just its filename in an ansible.builtin.template module. tasks/main.yml holds the sequence of tasks that should be carried out by hosts that adopt this role. And files/* is a list of related files that can be copied to the host or referenced in tasks. All of the configuration and the tasks that reference it are under the same directory.
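
A minimal sketch of what such a tasks/main.yml could contain is shown below. The destination paths under ~/grafana are assumptions made up for this example; only the src values correspond to the directory listing above.

```yaml
# ansible/roles/grafana/tasks/main.yml (hypothetical contents)
- name: Creating Grafana directories
  ansible.builtin.file:
    path: ~/grafana/dashboards
    state: directory
    mode: "0755"

- name: Copying Grafana Docker Compose
  ansible.builtin.template:
    # A bare filename is looked up in the role's templates/ directory.
    src: docker-compose.yml.j2
    dest: ~/grafana/docker-compose.yml

- name: Copying dashboards
  ansible.builtin.copy:
    # A relative path is looked up in the role's files/ directory.
    src: dashboards/services.json
    dest: ~/grafana/dashboards/services.json
```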

Variables are passed to roles from the playbook like this:

- hosts: host-a
  name: Setting up host-a
  roles:
    - role: pihole
      vars:
        pihole_port: 12345
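
Inside the role, the variable is then available to any task or template through Jinja2 substitution. The task below is a hypothetical sketch that only prints the value; a real role would more likely use it in a template such as a compose file. A role can also give the variable a fallback value in its defaults/main.yml.

```yaml
# roles/pihole/tasks/main.yml (hypothetical contents)
- name: Showing the configured port
  ansible.builtin.debug:
    # Renders as "Pi-hole will listen on port 12345" with the
    # playbook above.
    msg: "Pi-hole will listen on port {{ pihole_port }}"
```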

Dealing with secrets

Ansible ships with a system called Vault that can decrypt secrets that are stored as cipher text in variables. There’s a single password for each vault that’s used to encrypt other secrets. Otherwise there’s no setup for a “vault”, because it only needs a password. The ansible-vault command reads the secret on stdin(4) and prints out the string to use as the variable. Here’s an example:

ansible-vault encrypt_string --ask-vault-pass --name <name-of-secret>

This produces a string starting with !vault | followed by the encrypted secret. Ansible will automatically decrypt it and substitute it in playbooks when they’re run. I’m not sure the --name argument does anything here (as far as I can tell it only labels strings passed as arguments, and --stdin-name is the equivalent for a secret read from standard input), and it’s a little annoying to need to type in the vault password each time. Ansible even allows the vault password to be supplied by another program. In my case I use 1Password to hold all of my secrets, so I added a script that just runs:

exec op item get 'Ansible Vault' --fields password

This pulls the password out of an entry called “Ansible Vault”. By setting the script as the vault_password_file in ansible.cfg where I run Ansible, I don’t have to type my vault password in each time I want to run a playbook or encrypt a secret. Putting it all together in my primary playbook, the roles typically look like this:

- hosts: my-pi
  name: Serving DNS
  roles:
    - role: pihole
      vars:
        pihole_password: !vault |
          $ANSIBLE_VAULT;1.1;AES256
          <cipher-text>

This makes the pihole_password secret available to use in the role. It’s also possible to store these in a group_vars or host_vars directory at the top level, but I didn’t feel that was necessary.
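
For comparison, the host_vars alternative would look something like the sketch below: a file named after the host, placed in a host_vars directory next to the inventory or playbook. The file path is an assumption for illustration.

```yaml
# host_vars/my-pi.yml (hypothetical file)
# Variables in this file apply only to the host named my-pi, so
# the play no longer needs a vars block for the role.
pihole_password: !vault |
  $ANSIBLE_VAULT;1.1;AES256
  <cipher-text>
```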

Ansible tips

Docker

Most Docker guides describe the command line interface for running containers, but I exclusively used docker-compose. I’d much rather have the settings I apply to each container stored in a file than in my shell history. The compose file format is straightforward and is usually meant to bundle multiple Docker containers together in the same file, even though I typically specify a single service per file. You typically only need to do a few things with a container.

It’s probably best to just look at the most complex compose file I used in my home server:

{{ ansible_managed | comment }}

version: "3.5"
volumes:
  vmagentdata: {}
  vmdata: {}
networks:
  vm_net:
    name: vm_net
services:
  vmagent:
    container_name: vmagent
    image: victoriametrics/vmagent:v1.80.0
    depends_on:
      - "<other-service>"
    ports:
      - "{{ prometheus_port }}:8429"
    extra_hosts:
      - "<this-host-name>:host-gateway"
      - "<other-host-name>:<ip>"
    volumes:
      - vmagentdata:/vmagentdata
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./file_sd:/etc/prometheus/file_sd
    command:
      - "--promscrape.config=/etc/prometheus/prometheus.yml"
      - "--remoteWrite.url=http://victoriametrics:8428/api/v1/write"
    networks:
      - vm_net
    restart: always

This is stored as docker-compose.yml.j2 in the templates directory of a VictoriaMetrics role. There’s a second service below this to actually run the metrics database: this snippet is just the Prometheus scraper. When the ansible.builtin.template module puts this on the device, it replaces {{ prometheus_port }} with a variable value I’ve defined in my playbook. Here are the Ansible tasks that do that:

- name: Copying VictoriaMetrics Docker Compose
  ansible.builtin.template:
    src: docker-compose.yml.j2
    dest: ~/victoriametrics/docker-compose.yml
  become: false

- name: Starting VictoriaMetrics
  community.docker.docker_compose:
    project_src: "~/victoriametrics/"
    build: false
    restarted: true
  become: false

I’m not sure if using ~/victoriametrics as the main directory for the container is a good idea, but it seems to be working for now.
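
One caveat for anyone copying these tasks: the community.docker.docker_compose module was deprecated and later removed from the community.docker collection in favor of community.docker.docker_compose_v2, which drives the docker compose CLI plugin. Below is a rough equivalent of the “Starting VictoriaMetrics” task, assuming a collection version that ships the newer module; check the option names against the version you have installed.

```yaml
- name: Starting VictoriaMetrics
  community.docker.docker_compose_v2:
    project_src: "~/victoriametrics/"
    # "never" skips image builds, like build: false in the old module.
    build: never
    # "restarted" replaces the old restarted: true flag.
    state: restarted
  become: false
```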

Docker troubleshooting

Summary

Hopefully this note helped you put together your own home server in a way that’s easy to manage. I struggled for a long time with which configuration system to use, and eventually just did the simplest thing I could find that still let me check the steps into version control. If you have any thoughts, don’t hesitate to reach out and email me at hello@mattwidmann.net.