Rapidly Build & Test Ansible Roles with Molecule + Docker



Written by Percy Grunwald

— Last Updated August 2, 2020

What is Ansible Molecule?

Molecule is a tool originally developed by retr0h and now maintained by Ansible/Red Hat that automates testing of Ansible roles. At its heart, Molecule is designed to automate all parts of role testing:

  1. Launching and preparing instances on which to test your role, with a number of different “drivers” for different infrastructure sources (e.g. Docker, Vagrant, EC2, etc.)
  2. Running your role against all the instances
  3. Testing that your role runs successfully and that the intended changes occurred on each instance
  4. Tearing down any infrastructure created in step 1 to leave everything clean

[Figure: Ansible Molecule test sequence]

Instead of a highly manual process, testing a role against any number of distros is as simple as running molecule test.
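
As a rough mental model of the four steps above, the sequence resembles a setup, run, check and teardown loop. The sketch below is a toy illustration in Python, not Molecule's actual code; every function and parameter name in it is hypothetical.

```python
# Toy model of a create -> converge -> verify -> destroy cycle.
# All names here are hypothetical; this is not Molecule's implementation.

def run_test_cycle(instances, create, converge, verify, destroy):
    """Run each step against every instance and always clean up afterwards."""
    log = []
    created = []
    try:
        for name in instances:
            create(name)
            created.append(name)
            log.append(("create", name))
        for name in created:
            converge(name)
            log.append(("converge", name))
        for name in created:
            verify(name)
            log.append(("verify", name))
    finally:
        # Teardown runs even if one of the steps above raised an exception.
        for name in created:
            destroy(name)
            log.append(("destroy", name))
    return log


def noop(name):
    """Placeholder step that does nothing."""


if __name__ == "__main__":
    for step, name in run_test_cycle(["centos7"], noop, noop, noop, noop):
        print(step, name)
```

The try/finally is the part worth noticing: whatever happens while the role runs, the instances that were created are torn down again.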

But what about developing roles in the first place?

When you’re developing roles, you often want the infrastructure to hang around after your playbook finishes running so you can determine why something failed, or so you can run the role again quickly after making changes. It turns out Molecule excels for this type of workflow as well.

In the following sections I’ll build a super simple redis role to show how you can use Molecule for both development and testing.

Video tutorial

If you prefer watching to reading, here’s a full video tutorial from the TopTechSkills YouTube channel covering many of the points and examples from this article. Feel free to comment on this article or the video if you have any questions.


Installing Molecule and docker-py

Assuming you already have Ansible and Docker installed and working on your system, getting up and running with Molecule is very quick and easy.

Molecule is available as a Python package and can be installed with pip. Since this example will be using Docker as the infrastructure source, the Docker SDK for Python is required as well. It is published on PyPI as docker (formerly docker-py):

$ pip install molecule docker

Initializing a new role with Molecule

You can create a new role by running molecule init role:

$ molecule init role --driver-name docker --role-name ansible-role-redis
--> Initializing new role ansible-role-redis...
Initialized role in /Users/percy/Code/ansible-role-redis successfully.

Molecule will create a new directory and fill it with Ansible and Molecule boilerplate:

ansible-role-redis
├── README.md
├── defaults
│   └── main.yml
├── handlers
│   └── main.yml
├── meta
│   └── main.yml
├── molecule
│   └── default
│       ├── Dockerfile.j2
│       ├── INSTALL.rst
│       ├── molecule.yml
│       ├── playbook.yml
│       └── tests
│           └── test_default.py
├── tasks
│   └── main.yml
└── vars
    └── main.yml

The Ansible role files are all empty and the molecule directory is full of the defaults, but it’s already possible to run all the molecule commands.

Testing the empty role

If you run molecule test right now, you should see output similar to this:

$ molecule test
--> Validating schema ...
Validation completed successfully.
--> Test matrix
    
└── default
    ├── lint
    ├── destroy
    ├── dependency
    ├── syntax
    ├── create
    ├── prepare
    ├── converge
    ├── idempotence
    ├── side_effect
    ├── verify
    └── destroy
...

The “test matrix” (or “test sequence”) shows everything Molecule will do to test the role:

  1. lint – run yamllint and ansible-lint on YAML files, and flake8 on the Python test files
  2. destroy – make sure that any infrastructure from previous tests is gone
  3. dependency – (optional) download any dependencies from Ansible Galaxy
  4. syntax – run ansible-playbook --syntax-check on the molecule/default/playbook.yml file
  5. create – create the instances using the configured driver (docker, ec2, vagrant, etc.)
  6. prepare – (optional) run a playbook to prepare the instances after create has finished
  7. converge – run molecule/default/playbook.yml on the infrastructure
  8. idempotence – run the playbook again to check that nothing is marked as changed
  9. side_effect – (optional) run a playbook that has side effects on the instance
  10. verify – run tests on the instances (testinfra is the default)
  11. destroy – tear down the infrastructure and clean up
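
To make the idempotence step more concrete: it amounts to a second converge run in which no host reports any changes. The sketch below checks that condition against Ansible PLAY RECAP lines. It only illustrates the idea and is not how Molecule implements the check; the function names are hypothetical.

```python
import re

# Matches recap lines such as:
#   instance                   : ok=1    changed=0    unreachable=0    failed=0
RECAP_LINE = re.compile(
    r"^\s*(?P<host>\S+)\s*:\s*ok=\d+\s+changed=(?P<changed>\d+)"
)


def changed_counts(recap_text):
    """Return {host: changed_count} parsed from PLAY RECAP lines."""
    counts = {}
    for line in recap_text.splitlines():
        match = RECAP_LINE.match(line)
        if match:
            counts[match.group("host")] = int(match.group("changed"))
    return counts


def is_idempotent(recap_text):
    """True when at least one host was found and every host reports changed=0."""
    counts = changed_counts(recap_text)
    return bool(counts) and all(count == 0 for count in counts.values())


if __name__ == "__main__":
    second_run = "instance  : ok=1    changed=0    unreachable=0    failed=0"
    print(is_idempotent(second_run))
```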

The converge task is the most important part of the test matrix and you can see what this task is doing by looking at molecule/default/playbook.yml:

# molecule/default/playbook.yml
---
- name: Converge
  hosts: all
  roles:
    - role: ansible-role-redis

molecule test may take a few minutes to run the first time because it needs to pull and build a Docker image, but once that’s done everything else should finish quickly without any errors:

...
--> Action: 'converge'
    
    PLAY [Converge] **************************************************************
    
    TASK [Gathering Facts] *******************************************************
    ok: [instance]
    
    PLAY RECAP *******************************************************************
    instance                   : ok=1    changed=0    unreachable=0    failed=0
...
--> Executing Testinfra tests found in .../molecule/default/tests/...
    ============================= test session starts ============================
...
    ========================== 1 passed in 10.46 seconds =========================
Verifier completed successfully.
...

Role development workflow

It’s great that molecule test automates so much, but it seems better suited to testing a role once it’s complete rather than actually developing the role. You generally want your role development workflow to allow for the fastest possible iterations, ideally by re-running the converge task without needing to create/destroy the instances every time.

To enable a development workflow like this, all you need to do is switch to running molecule converge instead of molecule test:

$ molecule converge
...
--> Test matrix
    
└── default
    ├── dependency
    ├── create
    ├── prepare
    └── converge
...

As you can see in the output above, the test matrix for molecule converge skips most of the tasks and does not bother with destroy, which will leave the instances running after applying the converge playbook. Although create and prepare tasks are included, Molecule automatically skips them if it detects that there are existing instances.
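
The difference between the two sequences can be spelled out by comparing the step lists from the two "Test matrix" outputs. The snippet below is a small illustration built from that output; the constant and function names are hypothetical.

```python
# Step lists copied from the "Test matrix" output of each command.
TEST_SEQUENCE = [
    "lint", "destroy", "dependency", "syntax", "create", "prepare",
    "converge", "idempotence", "side_effect", "verify", "destroy",
]
CONVERGE_SEQUENCE = ["dependency", "create", "prepare", "converge"]


def skipped_steps(full, partial):
    """Return the steps in `full` that `partial` omits, in order, without repeats."""
    skipped = []
    for step in full:
        if step not in partial and step not in skipped:
            skipped.append(step)
    return skipped


if __name__ == "__main__":
    print(skipped_steps(TEST_SEQUENCE, CONVERGE_SEQUENCE))
```

Everything that molecule converge leaves out is either a check (lint, syntax, idempotence, side_effect, verify) or the teardown (destroy), which is why it suits quick iterations.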

In practice, this means that subsequent runs of molecule converge only take as long as the converge task:

# first run (16.62s)
$ /usr/bin/time -p molecule converge
... 
real        16.62
user         5.59
sys          2.08
# second run (5.44s)
$ /usr/bin/time -p molecule converge
...
--> Action: 'create'
Skipping, instances already created.
...
real         5.44
user         2.16
sys          0.79

The role development workflow becomes:

  1. Make some changes to the role
  2. Run molecule converge

This two-step loop gives extremely fast feedback and speeds up development a great deal. Let’s try this workflow now by making a change to the role and checking the results with molecule converge.

Make a change to the role

The role is totally empty at the moment, so I’ll add a task to tasks/main.yml that prints the distribution and version of the OS:

# tasks/main.yml
---
- name: print the distribution and version
  debug: msg="{{ ansible_distribution }} {{ ansible_distribution_version }}"

Check the results with molecule converge

Running molecule converge now gives me feedback on the change within a few seconds:

$ /usr/bin/time -p molecule converge
...
--> Scenario: 'default'
--> Action: 'converge'
    
    PLAY [Converge] **************************************************************
    
    TASK [Gathering Facts] *******************************************************
    ok: [instance]
    
    TASK [ansible-role-redis : print the distribution and version] ***************
    ok: [instance] => {
        "msg": "CentOS 7.6.1810"
    }
    
    PLAY RECAP *******************************************************************
    instance                   : ok=2    changed=0    unreachable=0    failed=0
    
    
real         6.94
user         2.52
sys          1.23

If at any time you want to start with fresh instances, it’s simply a matter of running molecule destroy before running molecule converge again.

Testing a role against multiple platforms

The output of the debug message above shows that the single instance is running CentOS 7, which is set by platforms in molecule/default/molecule.yml:

# molecule/default/molecule.yml
...
platforms:
  - name: instance
    image: centos:7
...

If you’re developing roles for public consumption (e.g. for Ansible Galaxy) or you manage infrastructure running different operating systems, you typically want to make your roles as cross-compatible as possible. The best way to do this is to develop and test your role against the operating systems your role should explicitly support.

Molecule makes this extremely easy: all you need to do is add more items to platforms in molecule/default/molecule.yml. In addition to CentOS 7, I want my role to support Ubuntu 18.04 (Bionic) and Debian 9 (Stretch), so I’ll go ahead and add the official base images to platforms:

# molecule/default/molecule.yml
...
platforms:
  - name: centos7
    image: centos:7
  - name: ubuntu1804
    image: ubuntu:18.04
  - name: debian9
    image: debian:9
...

Since there have been changes to platforms, make sure to run molecule destroy to reset the infrastructure before running molecule converge again:

$ molecule destroy 
...

$ molecule converge
...
--> Scenario: 'default'
--> Action: 'converge'
    
    PLAY [Converge] ****************************************************************
    
    TASK [Gathering Facts] *********************************************************
    ok: [ubuntu1804]
    ok: [centos7]
    ok: [debian9]
    
    TASK [ansible-role-redis : print the distribution and version] *****************
    ok: [centos7] => {
        "msg": "CentOS 7.6.1810"
    }
    ok: [ubuntu1804] => {
        "msg": "Ubuntu 18.04"
    }
    ok: [debian9] => {
        "msg": "Debian 9.7"
    }
    
    PLAY RECAP *********************************************************************
    centos7                    : ok=2    changed=0    unreachable=0    failed=0
    debian9                    : ok=2    changed=0    unreachable=0    failed=0
    ubuntu1804                 : ok=2    changed=0    unreachable=0    failed=0

Molecule automatically spins up the new platforms and applies the role without any additional configuration (!).

Developing the redis role

The first thing the role needs to do is install redis, which has some nice distro-specific setup to put our multi-platform configuration to the test:

# tasks/main.yml
---
- name: install redis on RedHat-based distros
  block:
    - name: ensure epel repo is installed (RedHat)
      yum:
        name: epel-release
        state: present
        update_cache: true
    - name: ensure redis is installed (RedHat)
      yum:
        name: redis
        state: present
        update_cache: true
  when: ansible_os_family == 'RedHat'

- name: install redis on Debian-based distros
  block:
    - name: ensure redis is installed (Debian)
      apt:
        name: redis-server
        state: present
        update_cache: true
    - name: disable ipv6 binding (Debian)
      lineinfile:
        path: /etc/redis/redis.conf
        regex: '^bind'
        line: bind 127.0.0.1
  when: ansible_os_family == 'Debian'

Running molecule converge will apply the role to all the instances:

$ molecule converge
...
--> Action: 'converge'
    
    PLAY [Converge] **************************************************************
    
    TASK [Gathering Facts] *******************************************************
    ok: [ubuntu1804]
    ok: [centos7]
    ok: [debian9]
    
    TASK [ansible-role-redis : ensure epel repo is installed (RedHat)] ***********
    skipping: [ubuntu1804]
    skipping: [debian9]
    changed: [centos7]
    
    TASK [ansible-role-redis : ensure redis is installed (RedHat)] ***************
    skipping: [ubuntu1804]
    skipping: [debian9]
    changed: [centos7]
    
    TASK [ansible-role-redis : ensure redis is installed (Debian)] ***************
    skipping: [centos7]
    changed: [debian9]
    changed: [ubuntu1804]

    TASK [ansible-role-redis : disable ipv6 binding (Debian)] ********************
    skipping: [centos7]
    ok: [debian9]
    changed: [ubuntu1804]
    
    PLAY RECAP *******************************************************************
    centos7                    : ok=3    changed=2    unreachable=0    failed=0
    debian9                    : ok=3    changed=1    unreachable=0    failed=0
    ubuntu1804                 : ok=3    changed=2    unreachable=0    failed=0

Writing the first test

The role appears to be working as expected, but how would you confirm that the package was actually installed on the system? One way would be to get a shell on the instance and manually check that the package was installed, but that’s not ideal.

Molecule enables you to automatically run unit tests against your instances, so why not write a unit test to confirm that redis was installed successfully?

Molecule uses testinfra by default for unit testing the infrastructure, and even gives you a boilerplate file to get you started:

# molecule/default/tests/test_default.py
import os

import testinfra.utils.ansible_runner

testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
    os.environ['MOLECULE_INVENTORY_FILE']).get_hosts('all')


def test_hosts_file(host):
    f = host.file('/etc/hosts')

    assert f.exists
    assert f.user == 'root'
    assert f.group == 'root'

The boilerplate file has a single function to verify that an /etc/hosts file exists and that the owner/group is root. You can run this test file against all the instances by running molecule verify:

$ molecule verify
...
--> Action: 'verify'
--> Executing Testinfra tests found in .../molecule/default/tests/...
    ...                                               
    tests/test_default.py ...                                                [100%]
    ...
    ========================== 3 passed in 30.58 seconds ===========================
Verifier completed successfully.

You can replace the test_hosts_file function in molecule/default/tests/test_default.py with a function to test that the correct redis package was installed on all the instances:

# molecule/default/tests/test_default.py
...
def test_redis_installed(host):
    redis_package_name = _get_redis_package_name(host.system_info.distribution)
    redis_package = host.package(redis_package_name)
    assert redis_package.is_installed


def _get_redis_package_name(host_distro):
    return {
        "ubuntu": "redis-server",
        "debian": "redis-server",
        "centos": "redis"
    }.get(host_distro, "redis")

Re-run molecule verify to execute the tests:

$ molecule verify
    ...
    ========================== 3 passed in 27.35 seconds ===========================
Verifier completed successfully.

Not the fastest tests in the world, but definitely faster than logging into each instance and manually testing that the package was actually installed. Testing in this way also allows for automated testing with CI/CD.

Dealing with services

I would like the role to make sure that the redis service is running and also starts on boot. I can achieve both of these with a single task using the service module:

# tasks/main.yml
---
- name: install redis on RedHat-based distros
...

- name: install redis on Debian-based distros
...

- name: ensure redis service is started and enabled
  service:
    name: redis
    state: started
    enabled: true

Unfortunately this task fails with the current setup because the official base images for CentOS, Ubuntu and Debian are not designed for running system services under an init system such as systemd:

$ molecule converge
    ...
    TASK [ansible-role-redis : ensure redis service is started and enabled] ********
    fatal: [ubuntu1804]: FAILED! => {"changed": false, "msg": "Could not find the requested service redis: "}
    fatal: [debian9]: FAILED! => {"changed": false, "msg": "Could not find the requested service redis: "}
    fatal: [centos7]: FAILED! => {"changed": false, "msg": "Could not find the requested service redis: "}

You will need service-enabled Docker images to test anything service-related. Luckily there are some awesome Docker images maintained by Jeff Geerling (@geerlingguy) that are set up specifically for this sort of Ansible/Molecule testing. There are images for multiple versions of CentOS, Ubuntu, Debian and Fedora.

I highly recommend using Jeff’s images and I have forked them in order to make some minor improvements. My forks are basically identical to Jeff’s except that they:

  1. Apply a fix to solve high CPU usage zombie processes when using Molecule
  2. Include an ansible user so that you can test playbooks as a non-root, sudo-enabled user

My forks are available here and for this tutorial I’ll make use of 3 to match the previous platforms:

Note: these images are only intended for testing Ansible roles.

Update platforms to use pre-built, service-enabled images

You can make use of the pre-built, service-enabled images by updating the platforms in molecule/default/molecule.yml as shown:

# molecule/default/molecule.yml
...
platforms:
  - name: centos7
    image: "percygrunwald/docker-centos7-ansible:latest"
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: true
  - name: ubuntu1804
    image: "percygrunwald/docker-ubuntu1804-ansible:latest"
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: true
  - name: debian9
    image: "percygrunwald/docker-debian9-ansible:latest"
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: true
...

Setting pre_build_image: true instructs Molecule to pull images directly from Docker Hub instead of building an image with the boilerplate molecule/default/Dockerfile.j2. The other settings (command, volumes, privileged) are all required for running services in the container.
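
The three entries differ only in their name and image; every other setting is repeated. As a sketch of that structure, the Python below builds the same data from a list of names. It is purely illustrative: Molecule reads molecule.yml directly, and the COMMON_SETTINGS constant and platform_entry helper are hypothetical names, not part of Molecule.

```python
import copy

# Settings shared by every service-enabled platform shown in molecule.yml.
COMMON_SETTINGS = {
    "command": "",
    "volumes": ["/sys/fs/cgroup:/sys/fs/cgroup:ro"],
    "privileged": True,
    "pre_build_image": True,
}


def platform_entry(name):
    """Build one platforms item for the given instance name."""
    return {
        "name": name,
        # Image naming follows the pattern of the three images used above.
        "image": f"percygrunwald/docker-{name}-ansible:latest",
        # Deep copy so that entries do not share the same volumes list.
        **copy.deepcopy(COMMON_SETTINGS),
    }


if __name__ == "__main__":
    for name in ["centos7", "ubuntu1804", "debian9"]:
        print(platform_entry(name))
```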

After changing platforms, make sure to run molecule destroy before molecule converge so that new instances are created with the new images:

$ molecule destroy
...

$ molecule converge
...
    TASK [ansible-role-redis : ensure redis service is started and enabled] ********
    changed: [centos7]
    changed: [ubuntu1804]
    changed: [debian9]
    
    PLAY RECAP *********************************************************************
    centos7                    : ok=4    changed=2    unreachable=0    failed=0
    debian9                    : ok=4    changed=2    unreachable=0    failed=0
    ubuntu1804                 : ok=4    changed=3    unreachable=0    failed=0

Test that the redis service is running and enabled

I’ll add a test to molecule/default/tests/test_default.py to confirm the redis service is running and will start on boot:

# molecule/default/tests/test_default.py
...
def test_redis_installed(host):
...


def test_redis_service_started_enabled(host):
    redis_service_name = _get_redis_package_name(host.system_info.distribution)
    redis_service = host.service(redis_service_name)
    assert redis_service.is_running
    assert redis_service.is_enabled


def _get_redis_package_name(host_distro):
...

Re-running the tests with molecule verify shows 6 tests passing (2 for each instance):

$ molecule verify
...
    ========================== 6 passed in 46.28 seconds ===========================
Verifier completed successfully.

Testing with a non-root user

Up until now, the converge task has been running as root because that’s the default Docker user for the official base images as well as Jeff’s derivatives of them.

This is obviously not ideal: we shouldn’t assume that Ansible is running playbooks as root or with become: true. Most cloud providers’ machine images don’t allow logging in as root by default, and it’s generally best to follow the principle of least privilege by applying become: true only to tasks that require it.

As I mentioned earlier, my forks of Jeff’s images include a non-root user that can sudo, which mirrors the default configuration for machine images on most cloud providers I’m familiar with. For example, the default user for the Ubuntu 18.04 AMI on AWS is ubuntu: a non-root user that can sudo.

The non-root user included on my images is called ansible, so I can modify molecule/default/playbook.yml to use this user instead of root:

# molecule/default/playbook.yml
---
- name: Converge
  hosts: all
  vars:
    ansible_user: ansible
  roles:
    - role: ansible-role-redis

If I run molecule converge on fresh instances with this change, the role will fail because managing packages and services both require root permissions:

...
    TASK [ansible-role-redis : ensure redis is installed (RedHat)] *****************
    skipping: [ubuntu1804]
    skipping: [debian9]
    fatal: [centos7]: FAILED! => {"msg": "...You need to be root to perform this command."...}
    
    TASK [ansible-role-redis : ensure redis is installed (Debian)] *****************
    fatal: [ubuntu1804]: FAILED! => {...}
    fatal: [debian9]: FAILED! => {...}
...

I can add become: true to all the blocks and tasks in tasks/main.yml to apply the required permissions:

# tasks/main.yml
---
- name: install redis on RedHat-based distros
  block:
    ...
  when: ansible_os_family == 'RedHat'
  become: true

- name: install redis on Debian-based distros
  block:
    ...
  when: ansible_os_family == 'Debian'
  become: true

- name: ensure redis service is started and enabled
  service:
    ...
  become: true

molecule converge now finishes without any errors:

$ molecule converge
...
    PLAY RECAP *********************************************************************
    centos7                    : ok=4    changed=2    unreachable=0    failed=0
    debian9                    : ok=4    changed=2    unreachable=0    failed=0
    ubuntu1804                 : ok=4    changed=3    unreachable=0    failed=0

Test the whole role from scratch with molecule test

I’m happy with the current state of the role, so I would like to run everything from scratch to confirm that the role will succeed on fresh infrastructure. This is the perfect use case for molecule test, which will tear down all infrastructure and recreate it before running through all other steps in the full test matrix:

$ molecule test
...
--> Test matrix
└── default
    ├── lint
    ├── destroy
    ├── dependency
    ├── syntax
    ├── create
    ├── prepare
    ├── converge
    ├── idempotence
    ├── side_effect
    ├── verify
    └── destroy
...
--> Action: 'verify'
--> Executing Testinfra tests found in .../molecule/default/tests/...
    ...
    ========================== 6 passed in 48.93 seconds ===========================
Verifier completed successfully.
...
--> Action: 'destroy'
...

molecule test finishes with a destroy task, so you will be left with a clean slate. With molecule test finishing successfully, I’m fairly confident that this Ansible role will run as expected on CentOS 7, Ubuntu 18.04 and Debian 9 hosts.
