How to Speed Up Your Ansible Playbooks Over 600%


Written by Percy Grunwald

— Last Updated August 2, 2020

Running Ansible playbooks against many hosts can be painfully slow, especially if they are in a region far away from you. I’m always looking for ways to speed up my playbooks and I’ve been using SSH pipelining in Ansible for quite some time. I feel like pipelining has improved the speed of my playbooks, but I’ve never actually tested it.

Recently a tool called Mitogen crossed my path and it claims to give a 1.25x - 7x speedup in Ansible – a bold claim I was immediately skeptical of. Could Mitogen really give a 7x speed increase, and is it any faster than good ol’ pipelining? This was a perfect opportunity to put Mitogen’s claims to the test and finally see how big an improvement pipelining actually gives.

Ansible SSH Pipelining vs Mitogen

tl;dr – skip straight to the benchmark setup or benchmark results.

Video tutorial

If you prefer watching to reading, here’s a full video tutorial from the TopTechSkills YouTube channel covering many of the points and examples from this article. Feel free to comment on this article or the video if you have any questions.

What is pipelining?

The Ansible docs describe pipelining like this:

Pipelining … reduces the number of network operations required to execute a module … by executing many Ansible modules without actual file transfer. This can result in a very significant performance improvement when enabled.

By default, Ansible executes each task by copying a file onto the remote host, executing it, and reporting the results back to the control machine. As I understand it, with pipelining enabled, Ansible can send commands directly to STDIN through a persistent SSH connection, which is much faster than the default process.

How to enable pipelining

You can enable pipelining by adding pipelining = True to the [ssh_connection] section of your ansible.cfg, or by setting either the ANSIBLE_PIPELINING or the ANSIBLE_SSH_PIPELINING environment variable.

# ansible.cfg
...
[ssh_connection]
pipelining = True

You’ll also need to make sure that requiretty is disabled in /etc/sudoers on the remote host, or become won’t work with pipelining enabled.

What is Mitogen?

From Mitogen’s docs:

[Mitogen] updates Ansible’s slow and wasteful shell-centric implementation with pure-Python equivalents, invoked via highly efficient remote procedure calls to persistent interpreters tunnelled over SSH.

That’s a mouthful, but the next lines are very reassuring:

No changes are required to target hosts. The extension is considered stable and real-world use is encouraged.

Mitogen claims to give a 1.25x to 7x speedup along with various other good things like reduced CPU usage and fewer writes to remote file systems.

How to enable Mitogen for Ansible

Enabling Mitogen for Ansible is as simple as downloading and extracting the plugin, then adding 2 lines to the [defaults] section of your ansible.cfg:

# ansible.cfg

[defaults]
strategy_plugins = /path/to/mitogen-0.2.5/ansible_mitogen/plugins/strategy
strategy = mitogen_linear

I’ve tested these instructions on both CentOS 7 and Ubuntu 18.04 (Bionic) images on EC2, and everything works out of the box.

Benchmark setup

Mitogen and pipelining both claim to give “very significant” speed improvements with minimal changes to existing Ansible code, but how much faster are they, and how do they compare to each other?

What I measured

I wanted to benchmark the impact pipelining and Mitogen had on the following items:

  • Initial run – The first time a playbook runs on a remote host generally takes the longest, as packages are installed and new files are written.
  • Subsequent run – After a playbook has already been run on a host, any subsequent runs of the same playbook are faster due to Ansible’s idempotence checking.
  • Gathering Facts – This task is run before playbook tasks, in order to collect information about the remote host, such as ansible_os_family. This task can take a few seconds depending on the ping of the host, so I was interested to see if pipelining or Mitogen had any effect on this.

I tested each item three times on each host for each configuration and averaged the results.
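As a rough illustration of the averaging step, here is a minimal Python sketch. It assumes the speedup is computed as the mean Default time divided by the mean time of the configuration under test, which the article does not state explicitly. The function name and the timings are placeholders, not the measured results.

```python
from statistics import mean


def speedup(baseline_runs, variant_runs):
    """Ratio of mean baseline time to mean variant time (both in seconds)."""
    return mean(baseline_runs) / mean(variant_runs)


# Placeholder timings in seconds for three runs each.
# These are NOT the article's measurements.
default_runs = [120.0, 126.0, 114.0]
pipelining_runs = [60.0, 63.0, 57.0]

print(f"{speedup(default_runs, pipelining_runs):.2f}x")  # prints 2.00x
```

Averaging before dividing keeps a single slow run from dominating the ratio.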

How I measured

The best way I’ve found to time the execution of Ansible playbooks is by enabling the profile_tasks callback. This callback is included with Ansible and all you need to do to enable it is add callback_whitelist = profile_tasks to the [defaults] section of your ansible.cfg:

# ansible.cfg

[defaults]
callback_whitelist = profile_tasks

With the addition of this line, you’ll get timing information for each task as it’s executing:

TASK [geerlingguy.java : Ensure Java is installed.]
Thursday 21 February 2019  14:20:31 +0800 (0:00:00.259)       0:00:05.408
ok: [ubuntu_1804_sydney]

And also a nice sorted list of tasks at the end of the run:

Thursday 21 February 2019  14:20:43 +0800 (0:00:00.065)       0:00:16.918 ***** 
=============================================================================== 
Gathering Facts --------------------------------------------------------- 3.73s
geerlingguy.java : Ensure Java is installed. ---------------------------- 1.10s
geerlingguy.docker : Install Docker. ------------------------------------ 0.82s
geerlingguy.nginx : Ensure nginx is installed. -------------------------- 0.76s
...
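If you want to collect these numbers across several runs, the summary rows are easy to parse. The following is a minimal Python sketch, assuming the summary keeps the "task name, run of dashes, seconds" layout shown above. The regular expression and the function name are illustrative and not part of Ansible.

```python
import re

# Matches one summary row: task name, a run of dashes, then seconds.
SUMMARY_ROW = re.compile(r"^(?P<task>.+?) -{2,} (?P<seconds>\d+(?:\.\d+)?)s$")


def parse_summary(text):
    """Return (task, seconds) pairs for every summary row found in text."""
    rows = []
    for line in text.splitlines():
        match = SUMMARY_ROW.match(line.strip())
        if match:
            rows.append((match.group("task"), float(match.group("seconds"))))
    return rows


# Sample rows copied from the output above.
sample = """\
Gathering Facts --------------------------------------------------------- 3.73s
geerlingguy.java : Ensure Java is installed. ---------------------------- 1.10s
geerlingguy.docker : Install Docker. ------------------------------------ 0.82s
geerlingguy.nginx : Ensure nginx is installed. -------------------------- 0.76s
"""

for task, seconds in parse_summary(sample):
    print(f"{seconds:6.2f}s  {task}")
```

Lines that are not summary rows, such as the timestamp header and the row of equals signs, are simply skipped.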

The test playbook

I benchmarked each configuration using a very simple playbook that just applies the 3 most downloaded roles on Ansible Galaxy:

# playbook.yml
---
- name: run tasks on all hosts
  hosts: all
  become: true
  roles:
    - geerlingguy.java
    - geerlingguy.nginx
    - geerlingguy.docker

Not the most realistic playbook, but the runtime is significant enough to check for non-trivial speedup.

Ansible configurations to test

I tested 3 different Ansible configurations on each host:

  1. Default – Vanilla Ansible config with no optimizations
  2. Pipelining – Default + pipelining enabled
  3. Mitogen – Default + Mitogen enabled
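To make the three configurations concrete, here is a minimal Python sketch that generates an ansible.cfg for each one using the standard library's configparser. It assumes profile_tasks was enabled in every configuration, and the Mitogen path is the placeholder from the example earlier in the article. The function name build_config is hypothetical.

```python
import configparser
import io

# Placeholder path copied from the article's example; point it at wherever
# Mitogen is actually extracted on your control machine.
MITOGEN_STRATEGY_DIR = "/path/to/mitogen-0.2.5/ansible_mitogen/plugins/strategy"


def build_config(variant):
    """Return ansible.cfg text for 'default', 'pipelining', or 'mitogen'."""
    cfg = configparser.ConfigParser(interpolation=None)
    # Assumption: every variant enables per-task timing.
    cfg["defaults"] = {"callback_whitelist": "profile_tasks"}
    if variant == "pipelining":
        cfg["ssh_connection"] = {"pipelining": "True"}
    elif variant == "mitogen":
        cfg["defaults"]["strategy_plugins"] = MITOGEN_STRATEGY_DIR
        cfg["defaults"]["strategy"] = "mitogen_linear"
    elif variant != "default":
        raise ValueError(f"unknown variant: {variant!r}")
    buffer = io.StringIO()
    cfg.write(buffer)
    return buffer.getvalue()


for name in ("default", "pipelining", "mitogen"):
    print(f"# --- {name} ---")
    print(build_config(name))
```

Keeping everything except the one setting under test identical makes it easier to attribute any timing difference to that setting.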

The remote hosts

To keep things as simple as possible, I tested the playbook against a single EC2 t2.micro instance running the CentOS 7 base image in 3 different regions:

  • Sydney, Australia (ap-southeast-2, the closest region to me)
  • Tokyo, Japan (ap-northeast-1)
  • North Virginia, USA (us-east-1)

Based on my experience managing hosts in different regions, I know how much ping affects the speed of Ansible playbooks. Running playbooks on hosts with pings of greater than around 150ms can be painfully slow, so I was curious to see how much improvement pipelining or Mitogen could give for hosts at a range of pings.

Here are my average pings to those AWS regions from (sunny) Perth, Western Australia:

Benchmark results

Initial run

The first time a playbook runs on a remote host generally takes the longest as packages are installed and files are written.

Adding pipelining speeds things up significantly, at 47s to 108s faster than Default. Pipelining also made the difference between the regions much smaller. The average times for ap-northeast-1 and us-east-1 are essentially the same with pipelining turned on.

The results for Mitogen are even more interesting. Mitogen is faster than Pipelining in all regions, at 59s to 140s faster than Default. Strangely, the regions further away from me are now even faster than my closest region! In fact, us-east-1 is now the fastest of the group. This is a really strange result, but it’s due to yum install tasks running faster in us-east-1 and ap-northeast-1 compared to ap-southeast-2.

Pipelining is 1.56x to 2.09x faster than Default, which is really significant considering that for most hosts it’s just a one-line configuration change on the control machine.

But the winner by a large margin is Mitogen with 1.81x to 3.09x speedup compared to Default, which makes it 16% to 48% faster than Pipelining.
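The 16% to 48% figure follows from dividing the two speedups. Here is a minimal Python sketch of that arithmetic, assuming the low ends of both ranges belong to the same region and likewise for the high ends. The function name is illustrative.

```python
def relative_gain_percent(faster_speedup, slower_speedup):
    """Percent by which one configuration beats another, given each
    configuration's speedup over the same baseline."""
    return (faster_speedup / slower_speedup - 1) * 100


low = relative_gain_percent(1.81, 1.56)   # low end of each range
high = relative_gain_percent(3.09, 2.09)  # high end of each range
print(f"Mitogen vs Pipelining: {low:.0f}% to {high:.0f}% faster")
# prints: Mitogen vs Pipelining: 16% to 48% faster
```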

Subsequent run

After a playbook has already been run on a host, any subsequent runs of the same playbook are faster due to Ansible’s idempotence checking. Subsequent runs can still be painfully slow if the ping to the host is large.

Pipelining and Mitogen configurations both have massive reductions in time relative to Default. Pipelining is 22s to 95s faster than Default, while Mitogen is 29s to 115s faster.

The absolute values for Mitogen are worth noting: on the closest instance, the playbook finishes in only 9 seconds and only takes 18 seconds on the furthest instance. For any non-trivial playbook, these sorts of times are unheard of!

Pipelining gives a 2.36x to 3.52x speedup, but it’s absolutely blown out of the water by Mitogen, which surpasses its own promise of 7x by giving a 4.20x to 7.33x speedup! At the top end, this represents a 633% speed increase compared to Default.
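For readers wondering how a speedup factor maps to a percentage, here is a minimal Python sketch of the conversion. It assumes "speed increase" means the gain over the baseline, so a 1x speedup is a 0% increase. The function name is illustrative.

```python
def speedup_to_percent_increase(speedup):
    """Convert a speedup factor (e.g. 2x) into a percent increase (100%)."""
    return (speedup - 1) * 100


for factor in (4.20, 7.33):
    print(f"{factor:.2f}x -> {speedup_to_percent_increase(factor):.0f}% increase")
# prints:
# 4.20x -> 320% increase
# 7.33x -> 633% increase
```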

For subsequent runs, Mitogen is 1.78x to 2.08x faster than Pipelining, which is really amazing.

Gathering facts

This task is run before playbook tasks, in order to collect information about the remote host, such as ansible_os_family. This task can take a few seconds depending on the ping of the host, so I was interested to see if pipelining or Mitogen had any effect on this.

Adding Mitogen appears to have a negligible effect on the Gathering Facts task compared to the Default configuration, while Pipelining completes the task a few seconds faster. These results aren’t that amazing, but it’s interesting that Mitogen has basically no effect.

Conclusion

It’s clear from the benchmark results that using pipelining or Mitogen can have a huge effect on the speed of your playbooks. Pipelining gave a 1.56x to 2.09x speedup for the initial run and a 2.36x to 3.52x speedup for subsequent runs. Those are pretty awesome results for such a small configuration change, but Mitogen is much faster. Amazingly, Mitogen gave a 1.81x to 3.09x speedup for the initial run and a 4.20x to 7.33x speedup for subsequent runs, which makes it about twice as fast as pipelining on subsequent runs.

Time savings on this scale represent a big improvement to the DevOps experience. My biggest criticism of DevOps in general compared to programming is that the feedback cycles are so much slower. I think the addition of Mitogen to Ansible could genuinely make DevOps much more enjoyable for a lot of people.

My personal recommendation is that if your infrastructure allows it, you should definitely give Mitogen a try. The speed increase over both the Default and Pipelining configurations is simply huge and I can’t see any real downside to Mitogen for most normal™️ projects.
