DRY Up Your Vagrant Files

- jnorton - Element 84

For the last couple of weeks I have been learning to use Vagrant in support of some testing we are doing of persistence solutions like MongoDB and Riak. If you are not familiar with Vagrant and you use virtual machines in your development, you really should check it out. Vagrant combined with Packer makes it really easy to spin up and provision virtual machines (locally or in the cloud). This post is not an introduction to Vagrant, however, so from here on out I’m going to assume you are familiar with it. Onward.

Today I found myself creating a Vagrant file to launch a Riak cluster on Amazon Web Services (AWS) Elastic Compute Cloud (EC2). This involves defining each box in the cluster so Vagrant can bring them up and down, which is a lot easier than using the AWS console. My first attempt looked like this:

setup_cluster_script = File.read("aws-setup_cluster.sh")
join_cluster_script = File.read("join_cluster.sh")

VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.vm.define "aws1" do |aws1|

    aws1.vm.box = "aws-ubuntu-12.04-riak"

    # The url from where the box will be fetched if it
    # doesn't already exist on the user's system.
    aws1.vm.box_url = AWS_BOX_URL

    aws1.vm.provider :aws do |aws, override|
      
      aws.access_key_id = ACCESS_KEY_ID
      aws.secret_access_key = SECRET_ACCESS_KEY

      aws.keypair_name = "ubuntu"

      #aws.ami = "ami-91da88f8"
      aws.instance_type = "m1.large"
      aws.region = "us-east-1"
      #aws.availability_zone = "us-east-1a"
      aws.private_ip_address = "10.0.0.101"
      aws.subnet_id = "subnet-25df8544" 
      aws.elastic_ip = true

      override.ssh.username = "ubuntu"
      override.ssh.private_key_path = UBUNTU_SSH_KEY_PATH

      
    end

    aws1.vm.provision "shell", inline: setup_cluster_script.gsub("IP_ADDRESS", "10.0.0.101")


  end

  config.vm.define "aws2" do |aws2|
    aws2.vm.box = "aws-ubuntu-12.04-riak"

    # The url from where the 'config.vm.box' box will be fetched if it
    # doesn't already exist on the user's system.
    aws2.vm.box_url = AWS_BOX_URL

    aws2.vm.provider :aws do |aws, override|
      aws.access_key_id = ACCESS_KEY_ID
      aws.secret_access_key = SECRET_ACCESS_KEY

      aws.keypair_name = "ubuntu"

      #aws.ami = "ami-91da88f8"
      aws.instance_type = "m1.large"
      aws.region = "us-east-1"
      #aws.availability_zone = "us-east-1a"
      aws.private_ip_address = "10.0.0.102"
      aws.subnet_id = "subnet-25df8544" 
      aws.elastic_ip = true

      override.ssh.username = "ubuntu"
      override.ssh.private_key_path = UBUNTU_SSH_KEY_PATH
    end

    aws2.vm.provision "shell", inline: setup_cluster_script.gsub("IP_ADDRESS", "10.0.0.102")
    aws2.vm.provision "shell", inline: join_cluster_script.gsub("IP_ADDRESS", "10.0.0.102")
  end

  config.vm.define "aws3" do |aws3|
    aws3.vm.box = "aws-ubuntu-12.04-riak"

    # The url from where the 'config.vm.box' box will be fetched if it
    # doesn't already exist on the user's system.
    aws3.vm.box_url = AWS_BOX_URL

    aws3.vm.provider :aws do |aws, override|
      aws.access_key_id = ACCESS_KEY_ID
      aws.secret_access_key = SECRET_ACCESS_KEY

      aws.keypair_name = "ubuntu"

      #aws.ami = "ami-91da88f8"
      aws.instance_type = "m1.large"
      aws.region = "us-east-1"
      #aws.availability_zone = "us-east-1a"
      aws.private_ip_address = "10.0.0.103"
      aws.subnet_id = "subnet-25df8544" 
      aws.elastic_ip = true

      override.ssh.username = "ubuntu"
      override.ssh.private_key_path = UBUNTU_SSH_KEY_PATH
    end

    aws3.vm.provision "shell", inline: setup_cluster_script.gsub("IP_ADDRESS", "10.0.0.103")
    aws3.vm.provision "shell", inline: join_cluster_script.gsub("IP_ADDRESS", "10.0.0.103")

  end

end

This defines three virtual machines, aws1, aws2, and aws3, which all reside in the same virtual private cloud (VPC) on EC2. aws1 is the first node in my Riak cluster; aws2 and aws3 are joined to the cluster by the join_cluster.sh script during provisioning.
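The provisioning lines rely on a simple templating trick: each script contains an IP_ADDRESS placeholder that gsub swaps for the machine's address before the script is handed to the shell provisioner. Here is a minimal sketch of that step. The script contents below are a made-up stand-in (the real aws-setup_cluster.sh is not shown in this post), and render_script is a hypothetical helper name.

```ruby
# Hypothetical stand-in for the contents of a provisioning script.
# The real aws-setup_cluster.sh is not shown in this post.
SETUP_TEMPLATE = <<~SCRIPT
  #!/bin/bash
  echo "configuring node at IP_ADDRESS"
SCRIPT

# Hypothetical helper: replace every IP_ADDRESS placeholder with the
# address of the machine being provisioned.
def render_script(template, ip)
  template.gsub("IP_ADDRESS", ip)
end

puts render_script(SETUP_TEMPLATE, "10.0.0.101")
```

Because gsub returns a new string, the template itself is left untouched and can be reused for every machine.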

This Vagrant file works fine. I can spin up and shut down (or destroy) EC2 machines like this:

vagrant up --provider aws aws1

or, even cooler, spin up multiple boxes in parallel:

vagrant up --provider aws /aws[1-5]/

All in all, Vagrant makes it really easy to spend money on AWS. So I started happily adding more machine definitions to the Vagrant file, letting me spin up all the machines I wanted.

It didn’t take long, however, for me to realize that my Vagrant file was getting big, and furthermore, that all those machine definitions looked very similar to one another. I thought to myself, “If this were code I would really want to find a way to DRY it up. But wait! Vagrant files are really just Ruby files! Maybe I can find a way.”

So I replaced all my machine definitions with this:

setup_cluster_script = File.read("aws-setup_cluster.sh")
join_cluster_script = File.read("join_cluster.sh")

NUM_BOXES = 5

VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|

  for i in 1..NUM_BOXES
    config.vm.define "aws#{i}" do |aws2|
      aws2.vm.box = AWS_BOX_URL

      # The url from where the 'config.vm.box' box will be fetched if it
      # doesn't already exist on the user's system.
      aws2.vm.box_url = AWS_BOX_URL

      aws2.vm.provider :aws do |aws, override|
        aws.access_key_id = ACCESS_KEY_ID
        aws.secret_access_key = SECRET_ACCESS_KEY

        aws.keypair_name = "ubuntu"

        #aws.ami = "ami-91da88f8"
        aws.instance_type = "m1.large"
        aws.region = "us-east-1"
        #aws.availability_zone = "us-east-1a"
        aws.private_ip_address = "10.0.0.10#{i}"
        aws.subnet_id = "subnet-25df8544" 
        aws.elastic_ip = true

        aws.tags = {"Name" => "riak-node-#{i}"}

        override.ssh.username = "ubuntu"
        override.ssh.private_key_path = UBUNTU_SSH_KEY_PATH
      end

      aws2.vm.provision "shell", inline: setup_cluster_script.gsub("IP_ADDRESS", "10.0.0.10#{i}")
      aws2.vm.provision "shell", inline: join_cluster_script.gsub("IP_ADDRESS", "10.0.0.10#{i}") if i > 1

    end
  end

end

Feeling very clever, I fired up my first machine with

vagrant up --provider aws aws1

and it looked like it worked! I got the following from Vagrant:

[aws1] Launching an instance with the following settings...
[aws1]  -- Type: m1.large
[aws1]  -- AMI: ami-49f1af20
[aws1]  -- Region: us-east-1
[aws1]  -- Keypair: ubuntu
[aws1]  -- Subnet ID: subnet-25df8544
[aws1]  -- Private IP: 10.0.0.105
[aws1]  -- Elastic IP: true
[aws1]  -- Block Device Mapping: []
[aws1]  -- Terminate On Shutdown: false
[aws1] Waiting for instance to become "ready"...

and so forth.

But wait, what was with the IP address? 10.0.0.105? That’s not right; it should be 10.0.0.101! What the heck?

At this point I was about ready to give up and move on, but I talked with my colleague Jason Gilman and he convinced me that we should take another look. So we cleaned things up a bit and even poked around with pry, but no dice. Then a Google search led to an example Vagrant file using the same technique. We noticed that the example looked just like ours except for two differences – it called to_sym on the string after config.vm.define and it used an each iteration instead of a for loop.

We tried the call to to_sym, but that changed nothing. Once again I was about ready to give up (sensing a pattern here), but Jason went ahead and replaced the for loop with the each iteration.

Sure enough, it worked! But why, and why had the for loop resulted in aws1 getting the wrong IP address? What was different between the for loop and the each iteration?

Then I remembered something I had read a long time ago, but apparently forgotten, written by James Gray. He gives a great explanation of the evils of for loops in Ruby. And it exactly explains why my first attempt failed.

In a nutshell, the difference between this

for i in 1..NUM_BOXES
end

and this

(1..NUM_BOXES).each do |i|
end

is that the index in the for loop is scoped outside the loop and reused on each iteration, whereas the index in the each iteration is not. Essentially, the for loop assigns to a single local variable that lives in the enclosing scope, whereas the each method creates a new local variable, i, on every iteration, scoped inside the do block, and then assigns the current value to it.
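A quick way to see the scoping difference is to check what is left over once each loop has finished. This is just a minimal sketch; the two method names are made up for illustration.

```ruby
# With `for`, the loop variable belongs to the enclosing scope.
def index_after_for(n)
  for i in 1..n
  end
  i # still visible here, holding the last value the loop assigned
end

# With `each`, the block parameter exists only inside the block.
def index_visible_after_each?(n)
  (1..n).each do |i|
  end
  !defined?(i).nil? # `i` is not a local variable out here
end

puts index_after_for(5)           # prints 5
puts index_visible_after_each?(5) # prints false
```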

It is important to look at what the call to config.vm.define is actually doing. It essentially takes two arguments: a string naming the defined machine and a block to call when that machine is initialized. The important thing to realize is that the block is not executed inside the loop or iteration; rather, it forms a closure that is evaluated later, after the loop has already finished, when we call vagrant up. In the case of the for loop, each of the closures binds to the same index, since i is scoped outside the loop. In the each iteration, each of the closures is bound to a separate variable, since i is now scoped inside the iteration.
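To make the deferred evaluation concrete, here is a small sketch that mimics the behavior described above. FakeConfig, with its define and configure methods, is a hypothetical stand-in I made up for illustration; it is not Vagrant's real implementation, it only stores each block and runs it later.

```ruby
# FakeConfig is a hypothetical stand-in, NOT Vagrant's real code.
# It only imitates "store the block now, run it later".
class FakeConfig
  def initialize
    @definitions = {}
  end

  # Remember the block under the machine name without running it.
  def define(name, &block)
    @definitions[name] = block
  end

  # Run the stored block, long after the defining loop has finished.
  def configure(name)
    @definitions.fetch(name).call
  end
end

def ips_with_for(n)
  config = FakeConfig.new
  for i in 1..n
    config.define("aws#{i}") { "10.0.0.10#{i}" } # every block shares one i
  end
  (1..n).map { |k| config.configure("aws#{k}") }
end

def ips_with_each(n)
  config = FakeConfig.new
  (1..n).each do |i|
    config.define("aws#{i}") { "10.0.0.10#{i}" } # each block gets its own i
  end
  (1..n).map { |k| config.configure("aws#{k}") }
end

p ips_with_for(3)  # ["10.0.0.103", "10.0.0.103", "10.0.0.103"]
p ips_with_each(3) # ["10.0.0.101", "10.0.0.102", "10.0.0.103"]
```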

So now we can see why my original code was assigning the wrong IP address to aws1. Since all the closures formed by my loop were binding to the same index, they all got the same value for i (the last value assigned to it) when they were evaluated. So "10.0.0.10#{i}" becomes "10.0.0.105". We can also see why the each iteration works; since each closure binds to its own i variable and that variable is never changed after being assigned, each closure has the correct value for i when "10.0.0.10#{i}" is interpolated. So each machine gets a separate IP address and things work.

You may wonder why the original code worked at all, why the strings interpolated in the loop for the machine names, e.g., "aws#{i}", didn’t all end up with the same value the way the IP addresses did. The reason for this is that the string interpolation in that case is not part of the closure; it is evaluated inside the loop and uses the current value of i.
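The same effect shows up without any Vagrant at all. In this sketch (the method name is made up for illustration), the name string is built immediately inside the for loop, while the lambda body is only interpolated when it is finally called.

```ruby
def names_and_ips_with_for(n)
  pairs = []
  for i in 1..n
    name = "aws#{i}"                  # interpolated right now, inside the loop
    ip_block = -> { "10.0.0.10#{i}" } # interpolated later, when called
    pairs << [name, ip_block]
  end
  # By the time the lambdas run, i holds the last value of the loop.
  pairs.map { |machine, block| [machine, block.call] }
end

p names_and_ips_with_for(3)
# [["aws1", "10.0.0.103"], ["aws2", "10.0.0.103"], ["aws3", "10.0.0.103"]]
```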

Here is the final version of our code (with Jason’s cleanup):

VAGRANTFILE_API_VERSION = "2"

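NUM_BOXES = 5
IP_OFFSET = 10

# Provisioning scripts, read once and templated per machine below.
setup_cluster_script = File.read("aws-setup_cluster.sh")
join_cluster_script = File.read("join_cluster.sh")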

def ip_from_num(i)
  "10.0.0.#{100+i+IP_OFFSET}"
end

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|

  (1..NUM_BOXES).each do |i|
    is_main = (i == 1)

    config.vm.define "aws#{i}".to_sym do |aws2|
      aws2.vm.box = AWS_BOX_URL

      # The url from where the 'config.vm.box' box will be fetched if it
      # doesn't already exist on the user's system.
      aws2.vm.box_url = AWS_BOX_URL

      aws2.vm.provider :aws do |aws, override|
        aws.access_key_id = ACCESS_KEY_ID
        aws.secret_access_key = SECRET_ACCESS_KEY

        aws.keypair_name = "ubuntu"

        aws.instance_type = "m1.large"
        aws.region = "us-east-1"
        aws.private_ip_address = ip_from_num(i)
        aws.subnet_id = "subnet-25df8544"
        aws.elastic_ip = true

        aws.tags = {"Name" => "riak-node-#{i}"}

        override.ssh.username = "ubuntu"
        override.ssh.private_key_path = UBUNTU_SSH_KEY_PATH
      end

      aws2.vm.provision "shell", inline: setup_cluster_script.gsub("IP_ADDRESS", ip_from_num(i))
      unless is_main
        aws2.vm.provision "shell", inline: join_cluster_script.gsub("IP_ADDRESS", ip_from_num(i)).gsub("MAIN_IP", ip_from_num(1))
      end
    end
  end

end

Much better, much DRYer, much easier to read and maintain.

So the takeaway from all of this is that you can and should apply the DRY principle to your Vagrant files just as you would to any of your other code. Just be sure you understand what those machine definition blocks are really doing.