Test dummies on sale!

Someone in the CFEngine community said that configuration management is a big hammer: you can manage a zillion systems with ease. Or wreck them, with the same ease. Deploying configuration changes is a risky business. It takes responsibility. And testing, lots of testing.

The biggest shops in town have wonderful continuous integration systems, where every commit spawns a number of virtual machines where the changes are tested and won’t make it into the master branch unless they work correctly. In smaller shops most of the testing phase is done by hand, and it’s absolutely key to have the ability to destroy or create VMs in a snap, well configured and with all the necessary software already on board. We are that kind of small shop, we made that kind of system and we did it on the cheap. If you want to buy yourself one, just read on.

The system at a glance

Our “testing factory” consists of the following pieces:

  • a Dell PowerEdge M610 blade server that was being taken out of production, beefed up with 24GB RAM and two 250GB hard disks. The server has a RAID controller on board that we used to set up a RAID 0, so that the operating system sees a single (virtual) disk of about 500GB;
  • a set of IP addresses and names reserved in the DNS;
  • Debian GNU/Linux “wheezy” with KVM as the OS;
  • LVM to manage the storage;
  • virsh, libvirt, virt-manager, golden images, CFEngine policies and LVM snapshots to provision the VMs;
  • GNU make and a makefile to glue the pieces together.

That’s all. No Vagrant, no Docker. Very nice tools, no doubt, but we didn’t need anything more than what was available on our OS. Let’s get into the details.

The hardware

First and foremost, we took one old server that was no longer in production (a Dell PowerEdge M610) and beefed it up with unused RAM and two 250GB hard disks, so that it now has 24GB RAM. Then we built a RAID 0 on the two disks so that we have a single logical disk totalling 500GB of raw storage managed via LVM. 8.5GB of storage are reserved for the host system, 10GB for the ISO images of the distributions we want to install and, as of today, 40GB are used for the “golden images”. In total, the storage available to LVM is ~465GB, of which ~415GB are available for building virtual machines. The host system runs a plain standard Debian GNU/Linux 7.x “wheezy” with KVM.

Our test machines are usually equipped with 512MB RAM and 8GB storage each. If we want to leave 4GB of RAM available to the host system, that leaves us 20GB, enough to allocate up to 40 VMs, which is more than enough for our size. Considering the available storage and still allocating 8GB disk per machine, we could fit up to 50 machines by reducing the RAM for each VM to 384MB.
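
The arithmetic behind those figures can be sketched in a few lines of shell. This is only a back-of-the-envelope check using the numbers quoted above; the function names are made up for the example and are not part of our tooling.

```shell
#!/bin/sh
# Capacity estimate for the test host, using the figures quoted in the text.
# Function names are illustrative.

# vms_by_ram TOTAL_GB HOST_RESERVED_GB VM_RAM_MB
# How many VMs fit in the RAM left over after the host's share.
vms_by_ram() {
    echo $(( ($1 - $2) * 1024 / $3 ))
}

# vms_by_disk AVAILABLE_GB VM_DISK_GB
# How many VM disks fit in the storage available for VMs.
vms_by_disk() {
    echo $(( $1 / $2 ))
}

# min A B: the tighter of the two limits wins.
min() {
    if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi
}

echo "512MB VMs, RAM limit:  $(vms_by_ram 24 4 512)"
echo "384MB VMs, RAM limit:  $(vms_by_ram 24 4 384)"
echo "8GB disks, disk limit: $(vms_by_disk 415 8)"
echo "384MB VMs, overall:    $(min "$(vms_by_ram 24 4 384)" "$(vms_by_disk 415 8)")"
```

With 384MB per VM the RAM would allow 53 machines, but the storage caps us at 51, hence the round “up to 50”.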

You’ll probably notice that the server is not redundant and if, for example, one of the disks breaks, the whole test environment will be lost. This is known and acceptable, since no production code will run there and all VMs could be destroyed at any point in time.

The IP address pool

We have reserved a pool of 41 IP addresses. The first one is used when preparing a golden image, while each of the others is assigned to a VM. Since the purpose of these VMs is to test CFEngine policies, it is no surprise that the hostnames associated with the addresses are cf-test-vXX, where XX is the numeric ID of the VM.
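
To make the naming scheme concrete, here is a minimal sketch that derives the hostname and the address from a VM ID. It assumes the layout used in fix_files.cf in the appendix (v01 maps to 192.168.1.101, and so on up to v40); the function names are hypothetical helpers, not something the makefile uses.

```shell
#!/bin/sh
# Map a VM ID such as "v07" to its hostname and IP address.
# Assumption: addresses follow the table in fix_files.cf (vNN -> 192.168.1.1NN).

# vm_hostname ID
vm_hostname() {
    echo "cf-test-$1"
}

# vm_ip ID
vm_ip() {
    id=${1#v}      # drop the leading "v"
    id=${id#0}     # drop one leading zero so "09" is not read as octal
    echo "192.168.1.$((100 + id))"
}

for vm in v01 v09 v40; do
    echo "$vm -> $(vm_hostname "$vm") $(vm_ip "$vm")"
done
```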

Golden images and LVM snapshots

The golden images are nothing more than VMs installed with the operating systems we want to clone for testing. As of today, we have golden images for Debian Linux 5.x “lenny”, 6.x “squeeze”, 7.x “wheezy”, plus a “jessie” and an Ubuntu 12.04 LTS. We create them by using the standard tools for KVM and libvirt, assign them the IP reserved for the golden images and configure them. Finally, the network is un-configured, the machine is shut down and the image is ready to be cloned. You’ll find the details of the process to create a golden image at the end of this post.

VM provisioning

Provisioning a VM by hand

To provision a virtual machine we need to perform a number of operations:

  • create a snapshot of a golden image;
  • configure a new VM in KVM to use the snapshot as filesystem by using the virt-clone command;
  • mount the VM filesystem and copy the policies into it;
  • run the CFEngine agent chrooted in the VM to configure the VM;
  • finalize the configuration by setting up SSH and CFEngine keys;
  • unmount the filesystem, run the VM via virsh and return its VNC display ID in case we want to access the console.

The first step, snapshotting a golden image, requires the ID of the golden image to snapshot. Configuring a VM requires the ID of the VM itself so that the name and IP address can be set. These two parameters are the absolute minimum and that’s what we used. You may want to make more things parametric and gain flexibility at the expense of simplicity; your mileage may vary.

Let’s get our hands dirty: assuming that we are provisioning cf-test-v01 as a Debian wheezy machine, we’ll start with cloning the golden image:

lvcreate --snapshot /dev/vd/golden-wheezy-root --size 8GB --name v01-root --permission rw

This snapshot statically allocates 8GB of storage. Internally, LVM will save in the snapshot only the differences between the golden image and the new volume. If the size of the snapshot is the same as the cloned volume, it won’t really fill up unless you really throw all those data in it. If you plan to have very short-lived machines, you could use smaller snapshots and allocate more machines. Again, YMMV.

Now that you have a new filesystem for the VM, you need to ask KVM to build a VM on it:

virt-clone --original=golden-wheezy --file=/dev/vd/v01-root --preserve-data --name=v01

Here we go. That’s done, too. Now you have a new machine, called v01 that is an exact copy of the machine golden-wheezy. But hold on, you’re not done yet.

If you ran that machine, it would be… well, we said that, the same as golden-wheezy: same name, same IP address… you could change its configuration by hand, but why? We have CFEngine! Let’s peek into the clone: the following command will try to reload the loop module so as to ensure that, once we attach the volume, a loop device is created for each partition therein:

modprobe -r loop && modprobe loop max_part=63

Now it’s reasonably safe to link /dev/vd/v01-root to /dev/loop0:

losetup /dev/loop0 /dev/vd/v01-root

If this operation was successful, we now have a device /dev/loop0pN (where N is an integer) for each partition inside /dev/vd/v01-root. The one we are interested in is the first. We create a mount point for it and mount it:

mkdir /mnt/v01
mount /dev/loop0p1 /mnt/v01

All going well, we now have the root partition of the VM mounted under /mnt/v01. We now have all the pieces in place to start configuring the VM. First thing, we’ll copy the policies in /var/cfengine/masterfiles on the VM’s filesystem:

rsync -avi --delete policy/ /mnt/v01/var/cfengine/masterfiles

The target directory could be anything. I chose /var/cfengine/masterfiles over /var/cfengine/inputs so that the policy doesn’t run again by accident when we boot the VM but it’s still under CFEngine’s work directory that is supposed to contain policies.

Since we make the golden images ourselves, I assume CFEngine is already properly installed there, so it’s safe to chroot and run the agent. To tell the agent which machine we are configuring we use a class with the same name, v01:

chroot /mnt/v01 /var/cfengine/bin/cf-agent -If /var/cfengine/masterfiles/promises.cf -Dv01

(The content of the policies will be shown later on in this post, read on).

The process is run entirely in the VM’s root partition, the machine is configured and, when the agent exits, we are out of the chroot.

You may think we are done here, but actually we aren’t. We still have the SSH host keys of golden-wheezy on board, which will make SSH suspicious when you start connecting to your VMs. You could re-generate the keys each time you provision the VM and that will work, but then SSH will complain once again, this time because when you re-create a VM and connect to it, SSH will find that the identification of the machine has changed. Our solution was to create the keys for the VM and cache them, so that next time we re-provision the same VM the keys will stay the same.

mkdir -p keycache/v01
ssh-keygen -q -f keycache/v01/ssh_host_dsa_key -N '' -t dsa 
ssh-keygen -q -f keycache/v01/ssh_host_rsa_key -N '' -t rsa 
ssh-keygen -q -f keycache/v01/ssh_host_ecdsa_key -N '' -t ecdsa 
cp -v keycache/v01/* /mnt/v01/etc/ssh

Next time we provision v01 we’ll need to just copy the files from keycache/v01.
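
The “generate once, then reuse” logic can be sketched as follows. This is a dry-run helper, written for illustration only: it prints the ssh-keygen commands still needed to complete the cache for a VM instead of running them, and the function name is hypothetical (the makefile in the appendix gets the same effect through file targets).

```shell
#!/bin/sh
# missing_ssh_keys CACHE_DIR NAME
# Print the ssh-keygen commands needed to complete the key cache for NAME.
# Keys that are already cached produce no output, so they are never replaced.
missing_ssh_keys() {
    cache_dir=$1/$2
    for type in dsa rsa ecdsa; do
        key=$cache_dir/ssh_host_${type}_key
        if [ ! -f "$key" ]; then
            echo "ssh-keygen -q -f $key -N '' -t $type"
        fi
    done
}

# Dry run for v01: lists one command per key type not yet in keycache/v01
missing_ssh_keys keycache v01
```

Once keycache/v01 exists, piping the output to sh would actually create the missing keys; on later runs the function prints nothing and the cached keys are simply copied over.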

A similar issue involves CFEngine’s own keys: all machines cloned from a golden image will have the same keys as the golden image, which will be either a source of confusion for the hubs if you trust keys, or will prevent proper communication between CFEngine clients and the policy hub. As before, we have a key cache for CFEngine keys, too:

mkdir -p ppkeyscache/v01
cf-key -f ppkeyscache/v01/localhost
cp -v ppkeyscache/v01/* /mnt/v01/var/cfengine/ppkeys
chmod 400 /mnt/v01/var/cfengine/ppkeys/localhost.*

It is now safe to unmount the VM’s filesystem

umount /mnt/v01
rmdir /mnt/v01

and shut down the loop association:

losetup -d /dev/loop0

The machine is now configured and ready to boot, let’s go:

virsh start v01

If we wanted to supervise the boot process, we could use virsh again to obtain the VM’s VNC display ID and connect to it:

virsh vncdisplay v01

Decommissioning a VM by hand

Destroying a VM is even simpler and happens in only three steps:

  • kill the VM by using virsh destroy and virsh undefine
  • remove the associated devices by using the dmsetup command
  • remove the snapshot by using lvremove

To destroy the VM we use virsh destroy and virsh undefine:

virsh destroy v01 
virsh undefine v01

Then we need to “unplug” the devices associated with the VM. They are those listed by the command dmsetup info -c whose name begins with vd-v01. For each of them, we need to issue a dmsetup remove command, and wait a little while for things to quiesce. A good old UNIX pipe can do the work for us:

dmsetup info -c | awk '/^vd-v01/ { print $1 }' | sort | while read DEVICE
do
    dmsetup remove "$DEVICE"
    sleep 1
done
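
If you want to check what the filter would select before removing anything, the selection step can be tried on its own. The sketch below is an illustration, not part of our makefile: vm_devices is a made-up helper, and the sample input only mimics the first column of dmsetup info -c. Note that device-mapper doubles the hyphens that belong to an LVM volume name, so the snapshot v01-root in volume group vd shows up as vd-v01--root, together with its copy-on-write companion vd-v01--root-cow.

```shell
#!/bin/sh
# vm_devices NAME
# Read "dmsetup info -c" style output on stdin and print, sorted, the names
# of the devices that belong to the VM called NAME in volume group "vd".
vm_devices() {
    # The trailing "-" keeps v01 from also matching a longer name
    # that merely starts with the same characters.
    awk -v prefix="vd-$1-" 'index($1, prefix) == 1 { print $1 }' | sort
}

# Illustrative input only: names modelled on the volumes used in this post,
# with every column but the first left out.
vm_devices v01 <<'EOF'
Name
vd-golden--wheezy--root
vd-v02--root
vd-v01--root-cow
vd-v01--root
EOF
```

On the real host you would feed it with dmsetup info -c | vm_devices v01 and pipe the result into the removal loop.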

When all the devices are removed successfully, you can remove the snapshot

lvremove /dev/vd/v01-root

and that’s it.

Automating the provisioning procedure

If you review the processes above, you’ll notice that each is a chain of commands, some of which are so critical that the process must stop if the command fails, and where some commands should run or not depending on the presence of some files. That sounds like a perfect task for the make tool. In fact, we built a makefile to glue the pieces together. To provision the machine cf-test-v01 with wheezy, the whole bunch of commands shown above is streamlined into one single command line:

make create DIST=wheezy NAME=v01

and when we want to get rid of it, it’s again one command:

make destroy NAME=v01

How long do the above operations take? Depending on the load of the host machine, it takes about 10-15 seconds to boot a new VM, and about 5 seconds to destroy it. We currently have 29 test machines running. If we wanted to refresh the environment by destroying and re-creating them one by one, that would take between 7 and 10 minutes. Not bad at all for an environment built on the cheap!
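
For the curious, here is a sketch of what such a refresh could look like, together with the arithmetic behind the estimate. It assumes, for simplicity, that all machines are rebuilt from the same distribution and are numbered from v01 upwards; the function names are invented for the example. The plan is only printed, nothing is destroyed or created.

```shell
#!/bin/sh
# refresh_plan DIST COUNT
# Print the make commands that would destroy and re-create VMs v01..vCOUNT,
# one by one (assumption: every VM is cloned from the same DIST).
refresh_plan() {
    i=1
    while [ "$i" -le "$2" ]; do
        name=$(printf 'v%02d' "$i")
        echo "make destroy NAME=$name"
        echo "make create DIST=$1 NAME=$name"
        i=$((i + 1))
    done
}

# refresh_seconds COUNT CREATE_SECONDS DESTROY_SECONDS
refresh_seconds() {
    echo $(( $1 * ($2 + $3) ))
}

refresh_plan wheezy 3
echo "29 VMs, best case:  $(refresh_seconds 29 10 5) seconds"
echo "29 VMs, worst case: $(refresh_seconds 29 15 5) seconds"
```

That gives 435 to 580 seconds, which is where the “between 7 and 10 minutes” comes from.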

Conclusion

If you are a small shop and you have some scrap hardware that no one really uses any more, you may take advantage of our recipe to build a test environment for yourself. Just grab the details from the appendix below and enjoy!

Appendix: the code

The makefile

KEY_CACHE_DIR=keycache
KEY_FILES=$(KEY_CACHE_DIR)/$(NAME) $(KEY_CACHE_DIR)/$(NAME)/ssh_host_dsa_key $(KEY_CACHE_DIR)/$(NAME)/ssh_host_rsa_key $(KEY_CACHE_DIR)/$(NAME)/ssh_host_ecdsa_key $(KEY_CACHE_DIR)/$(NAME)/ssh_host_ed25519_key
PPKEYS_CACHE_DIR=ppkeyscache
PPKEYS_FILES=$(PPKEYS_CACHE_DIR)/$(NAME) $(PPKEYS_CACHE_DIR)/$(NAME)/localhost.pub $(PPKEYS_CACHE_DIR)/$(NAME)/localhost.priv

.SILENT: usage

.PHONY: _namecheck _distcheck

########################################################################
# Main targets
usage:
	echo "Usage:" ;
	echo "	make create  DIST=dist_name NAME=vm_name"
	echo "	make destroy NAME=vm_name"
	echo
	echo -n "Available distros: "
	lvs | perl -alne 'print $$1 if $$F[0] =~ m{^golden-(.+)-root$$}' | xargs
	echo


create: _namecheck _distcheck lvcreate clone initvm start vncdisplay

destroy: _namecheck destroyvm removedevice lvremove

key: _namecheck $(KEY_FILES)

ppkeys: _namecheck $(PPKEYS_FILES)

send:
	rsync -avi --delete --no-owner --no-group --no-perms --exclude .git . root@cf-test:vm/

########################################################################
# Subtargets
lvcreate: _namecheck _distcheck
	lvcreate --snapshot /dev/vd/golden-$(DIST)-root --size 8GB --name $(NAME)-root --permission rw

clone: _namecheck _distcheck
	virt-clone --original=golden-$(DIST) --file=/dev/vd/$(NAME)-root --preserve-data --name=$(NAME)

start: _namecheck
	virsh start $(NAME)

initvm: _namecheck loop_setup mount_loop update_policy configure_vm deploy_keys umount_loop loop_shutdown

vncdisplay: _namecheck
	virsh vncdisplay $(NAME)

destroyvm: _namecheck
	-virsh destroy $(NAME)
	virsh undefine $(NAME)

removedevice: _namecheck
	sleep 1
	dmsetup info -c | awk '/^vd-$(NAME)/ { print $$1 }' | sort| \
	while read DEVICE ; do dmsetup remove $$DEVICE ; sleep 1 ; done

lvremove: _namecheck
	sleep 1
	lvremove /dev/vd/$(NAME)-root

loop_setup: _namecheck /dev/loop0 /dev/vd/$(NAME)-root
	modprobe -r loop && modprobe loop max_part=63
	losetup /dev/loop0 /dev/vd/$(NAME)-root

mount_loop: _namecheck /dev/loop0p1
	-mkdir /mnt/$(NAME)
	mount /dev/loop0p1 /mnt/$(NAME)

update_policy: _namecheck /mnt/$(NAME)/var/cfengine/masterfiles
	rsync -avi --delete policy/ /mnt/$(NAME)/var/cfengine/masterfiles

configure_vm: _namecheck /mnt/$(NAME)
	chroot /mnt/$(NAME) /var/cfengine/bin/cf-agent -If /var/cfengine/masterfiles/promises.cf -D$(NAME)

deploy_keys: key ppkeys
	cp -v $(KEY_CACHE_DIR)/$(NAME)/* /mnt/$(NAME)/etc/ssh
	cp -v $(PPKEYS_CACHE_DIR)/$(NAME)/* /mnt/$(NAME)/var/cfengine/ppkeys
	chmod 400 /mnt/$(NAME)/var/cfengine/ppkeys/localhost.*
	mkdir -m 700 -p /mnt/$(NAME)/root/.ssh
	cat rootkeys/* > /mnt/$(NAME)/root/.ssh/authorized_keys
	chmod 400 /mnt/$(NAME)/root/.ssh/authorized_keys

umount_loop: _namecheck /mnt/$(NAME)
	umount /mnt/$(NAME)
	-rmdir /mnt/$(NAME)

loop_shutdown: /dev/loop0
	losetup -d /dev/loop0

# Not putting a prerequisite here, to avoid make running this target
# over and over. Not having any prerequisites to check, make will
# be content if the file already exists, and that will be all.
# _namecheck is delegated to the key target, so not a big deal,
# and if you call any of these targets directly without specifying
# the NAME, you really deserve the result!
$(KEY_CACHE_DIR)/$(NAME):
	mkdir -p $(KEY_CACHE_DIR)/$(NAME)

$(KEY_CACHE_DIR)/$(NAME)/ssh_host_dsa_key:
	-ssh-keygen -q -f $(KEY_CACHE_DIR)/$(NAME)/ssh_host_dsa_key -N '' -t dsa

$(KEY_CACHE_DIR)/$(NAME)/ssh_host_rsa_key:
	-ssh-keygen -q -f $(KEY_CACHE_DIR)/$(NAME)/ssh_host_rsa_key -N '' -t rsa

$(KEY_CACHE_DIR)/$(NAME)/ssh_host_ecdsa_key:
	-ssh-keygen -q -f $(KEY_CACHE_DIR)/$(NAME)/ssh_host_ecdsa_key -N '' -t ecdsa

$(KEY_CACHE_DIR)/$(NAME)/ssh_host_ed25519_key:
	-ssh-keygen -q -f $(KEY_CACHE_DIR)/$(NAME)/ssh_host_ed25519_key -N '' -t ed25519

$(PPKEYS_CACHE_DIR)/$(NAME):
	mkdir -p $(PPKEYS_CACHE_DIR)/$(NAME)

$(PPKEYS_CACHE_DIR)/$(NAME)/localhost.pub $(PPKEYS_CACHE_DIR)/$(NAME)/localhost.priv:
	cf-key -f $(PPKEYS_CACHE_DIR)/$(NAME)/localhost




_namecheck:
	@if [ -z "$(NAME)" ] ; then \
	echo "Please specify a name/description for this device" ; \
		exit 2 ; \
	fi

_distcheck:
	@if [ -z "$(DIST)" ] ; then \
	echo "Please specify a distribution codename for this VM" ; \
		exit 2 ; \
	fi

The policies

The policies below use bundles and bodies from the CFEngine standard library.

promises.cf

############################################################################### 
# 
#   promises.cf - Basic Policy for Community 
# 
############################################################################### 

body common control 
{ 
  bundlesequence => { "get_sysinfo","global","main" }; 

  inputs => { 
	      "cfengine_stdlib.cf", 
	      "fix_files.cf", 
	      "upgrade_system.cf", 
	    }; 

  version => "Automatic setup for Hare"; 
} 

############################################################################### 

bundle common global { 
  classes: 
    debian:: 
      "debian_pre_wheezy" expression => classmatch("debian_[123456]") ; 
      "debian_wheezy_or_newer"   not => "debian_pre_wheezy" ; 

    sysinfo_available:: 
      "running_in_vm" expression => regcmp("^Bochs.*$", 
					   "$(sysinfo.dmi_sys_vendor)") ; 

} 


bundle agent get_sysinfo { 
  vars: 
      "module" string => "$(sys.workdir)/masterfiles/sysinfo" ; 

  files: 
      "$(module)" 
	  comment => "Ensure the module is executable", 
	  perms   => m("0755"), 
	  classes => if_ok("run_sysinfo") ; 

  commands: 
    run_sysinfo:: 
      "$(module)" module => "yes" ; 


  reports: 
    !run_sysinfo:: 
} 

bundle agent main 
{ 
  methods: 
    sysinfo_available:: 
      "fix_files" usebundle => fix_files ; 

    sysinfo_available.running_in_vm:: 
      "upgrade"   usebundle => upgrade_system ; 


  reports: 
    !sysinfo_available:: 
      "System information not available, won't do anything" ; 

    sysinfo_available.!running_in_vm:: 
      "Not running in VM, not upgrading the system" ; 
} 

############################################################################### 
# This part is for cf-agent 
# 
# Settings describing the details of the fixed behavioural promises made by 
# cf-agent. 
############################################################################### 

body agent control 

{ 
      # Global default for time that must elapse before promise will be 
      # rechecked. 
      # Don't keep any promises. 

    any:: 

      # This should normally be set to an interval like 1-5 mins 
      # We set it to one initially to avoid confusion. 

	  ifelapsed => "1"; 

      # Do not send IP/name during server connection if address 
      # resolution is broken. 
      # Comment it out if you do NOT have a problem with DNS 

      # skipidentify => "true"; 

      # The following works together with cfengine runcontrol 
      # (services/cfengine.cf) 
	  abortclasses => { "skip_run" } ; 


      # Environment variables based on Distro 
    debian:: 
	  environment => { 
			   "DEBIAN_FRONTEND=noninteractive", 
			   # "APT_LISTBUGS_FRONTEND=none", 
			   # "APT_LISTCHANGES_FRONTEND=none", 
          }; 

}

fix_files.cf

bundle agent fix_files { 
  vars: 
      # association vm name => ip 
      "ip[v01]" string => "192.168.1.101" ; 
      "ip[v02]" string => "192.168.1.102" ; 
      "ip[v03]" string => "192.168.1.103" ; 
      "ip[v04]" string => "192.168.1.104" ; 
      "ip[v05]" string => "192.168.1.105" ; 
      "ip[v06]" string => "192.168.1.106" ; 
      "ip[v07]" string => "192.168.1.107" ; 
      "ip[v08]" string => "192.168.1.108" ; 
      "ip[v09]" string => "192.168.1.109" ; 
      "ip[v10]" string => "192.168.1.110" ; 
      "ip[v11]" string => "192.168.1.111" ; 
      "ip[v12]" string => "192.168.1.112" ; 
      "ip[v13]" string => "192.168.1.113" ; 
      "ip[v14]" string => "192.168.1.114" ; 
      "ip[v15]" string => "192.168.1.115" ; 
      "ip[v16]" string => "192.168.1.116" ; 
      "ip[v17]" string => "192.168.1.117" ; 
      "ip[v18]" string => "192.168.1.118" ; 
      "ip[v19]" string => "192.168.1.119" ; 
      "ip[v20]" string => "192.168.1.120" ; 
      "ip[v21]" string => "192.168.1.121" ; 
      "ip[v22]" string => "192.168.1.122" ; 
      "ip[v23]" string => "192.168.1.123" ; 
      "ip[v24]" string => "192.168.1.124" ; 
      "ip[v25]" string => "192.168.1.125" ; 
      "ip[v26]" string => "192.168.1.126" ; 
      "ip[v27]" string => "192.168.1.127" ; 
      "ip[v28]" string => "192.168.1.128" ; 
      "ip[v29]" string => "192.168.1.129" ; 
      "ip[v30]" string => "192.168.1.130" ; 
      "ip[v31]" string => "192.168.1.131" ; 
      "ip[v32]" string => "192.168.1.132" ; 
      "ip[v33]" string => "192.168.1.133" ; 
      "ip[v34]" string => "192.168.1.134" ; 
      "ip[v35]" string => "192.168.1.135" ; 
      "ip[v36]" string => "192.168.1.136" ; 
      "ip[v37]" string => "192.168.1.137" ; 
      "ip[v38]" string => "192.168.1.138" ; 
      "ip[v39]" string => "192.168.1.139" ; 
      "ip[v40]" string => "192.168.1.140" ; 


      "names" 
	  slist   => getindices("ip"), 
	  comment => "All VMs' names" ; 

      "ipclass[$(names)]" 
	  string  => canonify("$(ip[$(names)])"), 
	  comment => "class names for the IP of each VM" ; 

      "shortname" 
	  comment    => "set a short name if $(names) is defined", 
	  string     => "$(names)", 
	  ifvarclass => "$(names)" ; 

    has_shortname:: 
      "hostname" string => "cf-test-$(shortname)" ; 
      "fullname" string => "$(hostname).example.com" ; 
      "myip"     string => "$(ip[$(shortname)])" ; 

  classes: 
    any:: 
      "after_first_step" 
	  expression => "has_shortname", 
	  comment    => "this class will be defined only the pass after has_shortname is defined" ; 

      "$(names)"      expression => "$(ipclass[$(names)])" ; 
      "has_shortname" expression => isvariable("shortname") ; 
 

  methods: 
    after_first_step:: 
      "interfaces" 
	  usebundle => fix_interfaces("$(fix_files.myip)"), 
	  comment   => "This will set up /etc/network/interfaces" ; 

      "hostname" 
	  usebundle => fix_hostname, 
	  comment   => "This will fix /etc/hostname" ; 

      "mailname" 
	  usebundle => fix_mailname, 
	  comment   => "This will fix /etc/mailname" ; 

      "hosts" 
	  usebundle => fix_hosts, 
	  comment   => "This will fix /etc/hosts" ; 

      "postfix" 
	  usebundle => fix_postfix, 
	  comment   => "This will fix /etc/postfix/main.cf" ; 

      "net_rules" 
	  usebundle => fix_net_rules, 
	  comment   => "This will fix  /etc/udev/rules.d/70-persistent-net.rules" ; 

      "fix_rc_local" 
          usebundle => fix_rc_local, 
	  comment   => "Update rc.local to ping the gateway" ; 

      "cfengine" 
	  usebundle => disable_cfe3, 
	  comment   => "This will disable cfengine3 at boot" ; 



  commands: 
    running_in_vm.restart_postfix:: 
      "/etc/init.d/postfix restart" ; 

  reports: 
    after_first_step:: 
      "Identified this machine as $(shortname)" ; 
      "Hostname set to $(hostname)" ; 
      "FQDN set to $(fullname)" ; 

    running_in_vm.restart_postfix:: 
      "postfix configuration fixed, service restarted" ; 

    !running_in_vm.restart_postfix:: 
      "postfix configuration fixed, but not running in a VM: service NOT restarted" ; 

} 


######################################################################## 
# Ensuring correct information into /etc/network/interfaces 
bundle agent fix_interfaces(ip) { 
  vars: 
      "file" string => "/etc/network/interfaces" ; 

  files: 
      "$(file)" 
	  comment       => "Interface configuration from template", 
	  edit_template => "$(sys.workdir)/masterfiles/interfaces.tmpl", 
	  edit_defaults => empty, 
	  classes       => if_repaired("require_reboot") ; 

  reports: 
    !running_in_vm.require_reboot:: 
      "Interface configuration changed" ; 

    running_in_vm.require_reboot:: 
      "Interface configuration changed, PLEASE REBOOT" ; 
} 


######################################################################## 
# Ensuring correct information into /etc/hostname 
bundle agent fix_hostname 
{ 
  files: 
      "/etc/hostname" 
	  edit_line     => set_hostname, 
	  edit_defaults => empty ; 
} 

bundle edit_line set_hostname { 
  insert_lines: 
      "$(fix_files.hostname)" ; 
} 


######################################################################## 
# Ensuring correct information into /etc/hosts 
bundle agent fix_hosts{ 
  files: 
      "/etc/hosts" 
	  edit_line => add_host_line, 
	  classes   => if_repaired("report_hosts_fixed") ; 

  reports: 
    report_hosts_fixed:: 
      "/etc/hosts was fixed" ; 
} 

bundle edit_line add_host_line { 
  vars: 
      "shortcut" 
	  slist   => { "myip","fullname","hostname" }, 
	  comment => "Variable to shortcut from fix_files" ; 

      "$(shortcut)" 
	  string  => "$(fix_files.$(shortcut))", 
	  comment => "Shortcut to fix_files.$(shortcut)" ; 

  delete_lines: 
      "127\.0\.1\.1\s+.*" ; 

  insert_lines: 
      "$(myip)$(const.t)$(fullname)$(const.t)$(hostname)" 
	  comment => "Ensuring /etc/hosts has a correct line for us" ; 
} 


######################################################################## 
# Ensuring correct information into /etc/mailname 
bundle agent fix_mailname 
{ 
  files: 
      "/etc/mailname" 
	  edit_line     => set_fullname, 
	  edit_defaults => empty, 
	  classes       => if_repaired("restart_postfix") ; 

} 

bundle edit_line set_fullname { 
  insert_lines: 
      "$(fix_files.fullname)" ; 
} 


######################################################################## 
# Ensuring correct information into /etc/postfix/main.cf 
bundle agent fix_postfix 
{ 
  vars: 
      "conf[myhostname]"    string => " $(fix_files.fullname)" ; 
      "conf[mydestination]" string => " $(fix_files.fullname), $(fix_files.hostname), localhost" ; 

  files: 
      "/etc/postfix/main.cf" 
	  edit_line => set_variable_values("fix_postfix.conf"), 
	  classes   => if_repaired("restart_postfix") ; 

} 

######################################################################## 
# Ensuring correct information into /etc/udev/rules.d/70-persistent-net.rules 
# HOPEFULLY!!! 
bundle agent fix_net_rules 
{ 
  files: 
      "/etc/udev/rules.d/70-persistent-net.rules" 
	  delete  => tidy, 
	  classes => if_repaired("reboot_net_rules"), 
	  comment => "Remove this file so that it is properly regenerated" ; 

  reports: 
    !running_in_vm.reboot_net_rules:: 
      "persistent-net rules removed" ; 

    running_in_vm.reboot_net_rules:: 
      "persistent-net rules removed, PLEASE REBOOT!" ; 
} 

bundle agent disable_cfe3 
{ 
  vars: 
      "conf[RUN_CF_SERVERD]"  string => "0" ; 
      "conf[RUN_CF_EXECD]"    string => "0" ; 
      "conf[RUN_CF_MONITORD]" string => "0" ; 
      "conf[RUN_CF_HUB]"      string => "0" ; 
 
  files: 
      "/etc/default/cfengine3" 
      	  edit_line => set_variable_values("disable_cfe3.conf"), 
	  classes   => if_repaired("stop_cfe3") ; 

  commands: 
    running_in_vm.stop_cfe3:: 
      "/etc/init.d/cfengine3 stop" ; 
} 

# Fix rc.local, so that a VM sends one ICMP packet to the gateway 
# at boot. This is useful when a machine is destroyed and re-created 
# immediately to update the ARP table on the switch 
bundle agent fix_rc_local 
{ 
  files: 
      "/etc/rc.local" 
        perms => mog("0755","root","root"), 
	copy_from => local_dcp("$(sys.workdir)/masterfiles/rc.local") ; 
}

upgrade_system.cf

bundle agent upgrade_system 
{ 
  vars: 
      "env" string => "DEBIAN_FRONTEND=noninteractive LC_ALL=C" ; 
      "opt" string => "-o Dpkg::Options::=--force-confold -o Dpkg::Options::=--force-confdef --yes" ; 

  commands: 
    !package_list_updated:: 
      "/usr/bin/env $(env) /usr/bin/apt-get $(opt) update" 
          classes => if_ok("package_list_updated") ; 

    package_list_updated:: 
      "/usr/bin/env $(env) /usr/bin/apt-get $(opt) upgrade" 
          classes => if_ok("system_upgraded") ; 

  reports: 
    system_upgraded:: 
      "System upgraded" ; 
}

Scripts

The policies make use of the following scripts: sysinfo is run as a module, while rc.local is copied directly as-is on the VM:

rc.local

#!/bin/sh -e 
# 
# rc.local 
# 
# This script is executed at the end of each multiuser runlevel. 
# Make sure that the script will "exit 0" on success or any other 
# value on error. 
# 
# In order to enable or disable this script just change the execution 
# bits. 
# 
# By default this script does nothing. 

/bin/ping -c 1 -w 1 192.168.1.1 > /dev/null 2>&1 
exit 0

sysinfo

#!/bin/bash 

# Use extended globbing, see man bash 
shopt -s extglob 

DMIDIR=/sys/devices/virtual/dmi/id 

echo "+sysinfo_available" 

for FILE in sys_vendor product_name product_serial product_uuid 
do 
    unset DMI_CONTENT 
    # Get the contents of FILE in DMI_CONTENT, if null, set to "N/A" 
    DMI_CONTENT="`cat $DMIDIR/$FILE 2> /dev/null`" 
    DMI_CONTENT=${DMI_CONTENT:="N/A"} 
    TRIM_DMI_CONTENT=${DMI_CONTENT%%*( )} 
    
    # throw out the result 
    echo "=dmi_${FILE}=${TRIM_DMI_CONTENT}" 
done 

# Print the following only if there is content 
DEBIAN_VERSION="`cat /etc/debian_version 2> /dev/null`" 
if [ -n "${DEBIAN_VERSION}" ] 
then 
    echo "=debian_version=$DEBIAN_VERSION" 
fi 

LOADAVG_INFO="`cat /proc/loadavg 2> /dev/null`" 

if [ -n "${LOADAVG_INFO}" ] 
then 
    LOADAVG=$( echo "${LOADAVG_INFO}" | awk '{ print $1" "$2" "$3 }' ) 
    echo "=loadavg=$LOADAVG" 

    PCOUNT=$( echo "${LOADAVG_INFO}" | awk '{ print $4 }' | awk -F/ '{ print $2 }' ) 
    echo "=process_count=$PCOUNT" 
fi 

echo "=user_count=`who | wc -l`"
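
A note on the output format, since the policy depends on it: sysinfo speaks CFEngine’s module protocol, where a line starting with + defines a class and a line of the form =name=value defines a variable (which is how promises.cf gets to see $(sysinfo.dmi_sys_vendor)). The sketch below re-implements, in plain shell and for illustration only, the check that promises.cf performs with regcmp("^Bochs.*$", ...); the helper names are invented and the two sample lines are hypothetical module output.

```shell
#!/bin/sh
# module_classes: print the classes defined by "+class" lines on stdin.
module_classes() {
    sed -n 's/^+//p'
}

# module_var NAME: print the value of the "=NAME=value" line on stdin.
module_var() {
    sed -n "s/^=$1=//p"
}

# running_in_vm: succeed if the DMI system vendor on stdin starts with
# "Bochs", the same test promises.cf applies to $(sysinfo.dmi_sys_vendor).
running_in_vm() {
    case "$(module_var dmi_sys_vendor)" in
        Bochs*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Hypothetical module output, just to exercise the helpers
if printf '%s\n' '+sysinfo_available' '=dmi_sys_vendor=Bochs' | running_in_vm
then
    echo "vendor is Bochs: treated as a VM"
fi
```

On a real machine you could pipe the script itself into the check, e.g. ./sysinfo | running_in_vm.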

How to build a golden image: a checklist

Your first golden image will be a VM that you install on a separate logical volume and whose configuration file you’ll set aside. Later on, when you want to make more golden images similar to the first one, you’ll start from that one as a model and edit it to fit the new system. In the following checklist we suppose you already have such a “first VM” that you installed with Debian Squeeze and whose configuration file is called golden-squeeze.xml. You’ll also need an ISO image of the installer disk for the new system. To ease the process, it is very handy to have virt-manager installed, possibly a recent one, and to have a connection to the KVM host configured.

  • log in to the KVM server as root
  • cd /etc/libvirt/qemu
  • copy one of the golden-*.xml aside to use it as a reference; e.g.:
    cp golden-squeeze.xml golden-test.xml
  • create a new LVM volume for the machine; the following line will create an 8G volume
    lvcreate --size 8G --name golden-$OSID-root vd

    where $OSID is a descriptive name for this OS. We’ll refer to $OSID in the rest of the document. The volume must be named golden-$OSID-root (e.g.: golden-wheezy-root)

  • edit the copied file (golden-test.xml in the example) and change at least the following settings:
    • /domain/name: change the name to golden-$OSID
    • /domain/uuid: remove it to have it set automatically by kvm
    • /domain/os: if not already present, add
      <boot dev='cdrom'/>

      right below

      <boot dev='hd'/>
    • /domain/devices/source[dev]: change it to the name of the LVM volume you’ll be using as golden image (see above);
    • /domain/devices/disk[device="cdrom"]:
      • if you want to point this machine to the ISO image of the installer disk, then you either adapt an existing element, or add one and then adapt it. A sample is included on this page; you should modify it as follows:
        • set disk[type] to file
        • in the same disk element, add a source element:
          <source file='/opt/iso/debian-7.6.0-amd64-netinst.iso'/>
      • if you just want to enable the device, see “Settings to access the CDROM” below;
    • /domain/devices/interface/mac: remove the element to have the MAC automatically set
  • save the file, copy it aside
  • try to define the new VM with:
    virsh define golden-test.xml
  • the machine is now defined and can be booted for the installation:
    virsh start golden-$OSID
    • when requested, set the IP address that was reserved for golden images
  • when the machine is created, installed and fully functional:
    • install your SSH key in root’s authorized_keys to ensure that you’ll be able to reach newly created machines even if something goes wrong in their set-up.
    • install CFEngine
    • install any other software that you may need (e.g.: we install postfix in this phase)
    • disable the network by commenting out
      auto eth0

      in /etc/network/interfaces

    • reboot and check that eth0 is not configured (virt-manager is so handy here!)
    • shut it down

Settings to access the CDROM

Have this in your /domain/devices element:

    <disk type='block' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='1' unit='0'/>
    </disk>

Network configuration for a golden image

You’ll set up eth0 more or less like this:

# The primary network interface
# allow-hotplug eth0
# auto eth0
iface eth0 inet static
        address 192.168.1.100
        netmask 255.255.255.0
        network 192.168.1.0
        broadcast 192.168.1.255
        gateway 192.168.1.1

Remember to comment out

auto eth0

before publishing the image
