Someone in the CFEngine community said that configuration management is a big hammer: you can manage a zillion systems with ease. Or wreck them, with the same ease. Deploying configuration changes is a risky business. It takes responsibility. And testing, lots of testing.
The biggest shops in town™ have wonderful continuous integration systems, where every commit spawns a number of virtual machines on which the changes are tested, and nothing makes it into the master branch unless it works correctly. In smaller shops most of the testing is done by hand, and it’s absolutely key to be able to destroy or create VMs in a snap, well configured and with all the necessary software already on board. We are that kind of small shop, we built that kind of system, and we did it on the cheap. If you want to build one for yourself, just read on.
The system at a glance
Our “testing factory” consists of the following pieces:
- a Dell PowerEdge M610 blade server that was being taken out of production, beefed up with 24GB RAM and two 250GB hard disks. The server has a RAID controller on board that we used to set up a RAID 0, so that the operating system sees a single (virtual) disk of about 500GB;
- a set of IP addresses and names reserved in the DNS;
- Debian GNU/Linux “wheezy” with KVM as the OS;
- LVM to manage the storage;
- virsh, libvirt, virt-manager, golden images, CFEngine policies and LVM snapshots to provision the VMs;
- GNU make and a makefile to glue the pieces together.
That’s all. No Vagrant, no Docker. Very nice tools, no doubt, but we didn’t need anything more than what was already available on our OS. Let’s get into the details.
The hardware
First and foremost, we took one old server that was no longer in production (a Dell PowerEdge M610) and beefed it up with unused RAM and two 250GB hard disks, so that it now has 24GB RAM. Then we built a RAID 0 on the two disks, giving us a single logical disk totalling 500GB of raw storage managed via LVM. 8.5GB of storage is reserved for the host system, 10GB for the ISO images of the distributions we want to install and, as of today, 40GB is used for the “golden images”. In total, the storage available to LVM is ~465GB, of which ~415GB is available for building virtual machines. The host system runs a plain standard Debian GNU/Linux 7.x “wheezy” with KVM.
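For reference, here is roughly how the LVM layer could be laid out on that disk. The volume group name vd matches the /dev/vd/... paths used throughout this post; the partition device and the exact split are illustrative, not our exact setup:

# Assuming the RAID 0 virtual disk shows up as /dev/sda, with /dev/sda1
# holding the host system and /dev/sda2 handed over to LVM:
pvcreate /dev/sda2
vgcreate vd /dev/sda2
# ~10GB for the installer ISO images; the golden-*-root volumes are added later
lvcreate --size 10G --name iso vd
mkfs.ext4 /dev/vd/iso && mkdir -p /opt/iso && mount /dev/vd/iso /opt/iso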
Our test machines are usually equipped with 512MB RAM and 8GB storage each. If we want to leave 4GB of RAM to the host system, that leaves us room for up to 40 VMs, which is more than enough for our size. Considering the available storage, and still allocating 8GB of disk per machine, we could fit up to 50 machines by reducing the RAM of each VM to 384MB.
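The arithmetic, spelled out as a quick back-of-the-envelope check (numbers as above):

echo $(( (24 - 4) * 1024 / 512 ))   # RAM-bound: 40 VMs with 512MB each
echo $(( 415 / 8 ))                 # disk-bound: ~51 VMs with 8GB of storage each
echo $(( (24 - 4) * 1024 / 384 ))   # with 384MB each, RAM allows ~53, so disk is the limit at ~50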
You’ll probably notice that the server is not redundant: if, for example, one of the disks breaks, the whole test environment will be lost. This is known and acceptable, since no production code will run there and all VMs can be destroyed at any point in time.
The IP address pool
We have reserved a pool of 41 IP addresses. The first one is used when preparing a golden image, while the others are each associated with a VM. Since the purpose of these VMs is to test CFEngine policies, it’s no surprise that the hostnames associated with the addresses are cf-test-vXX, where XX is the numeric ID of the VM.
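If you manage the zone (or an /etc/hosts file) by hand, a small loop generates the forward records in one go. The 192.168.1.x addresses below match the ones used in fix_files.cf later in this post; the record format will of course depend on your DNS setup:

for i in $(seq 1 40)
do
    printf 'cf-test-v%02d    IN A    192.168.1.%d\n' "$i" "$((100 + i))"
done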
Golden images and LVM snapshots
The golden images are nothing more than VMs installed with the operating systems we want to clone for testing. As of today, we have golden images for Debian Linux 5.x “lenny”, 6.x “squeeze” and 7.x “wheezy”, plus a “jessie” and an Ubuntu 12.04 LTS. We create them using the standard tools for KVM and libvirt, assign them the IP address reserved for golden images, and configure them. Finally, the network is un-configured, the machine is shut down and the image is ready to be cloned. You’ll find the details of the process to create a golden image at the end of this post.
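Our own procedure goes through a hand-edited libvirt XML definition (see the appendix); if you’d rather start from scratch, a virt-install invocation along these lines gets you to the same point. This is only a sketch: the bridge name and ISO path are illustrative, the LV naming follows the conventions used below:

lvcreate --size 8G --name golden-wheezy-root vd
virt-install --name golden-wheezy --ram 512 \
    --disk path=/dev/vd/golden-wheezy-root \
    --cdrom /opt/iso/debian-7.6.0-amd64-netinst.iso \
    --network bridge=br0 --graphics vnc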
VM provisioning
Provisioning a VM by hand
To provision a virtual machine we need to perform a number of operations:
- create a snapshot of a golden image;
- configure a new VM in KVM to use the snapshot as filesystem by using the virt-clone command;
- mount the VM filesystem, copy the policies in it;
- run the CFEngine agent chrooted in the VM to configure the VM;
- finalize the configuration by setting up SSH and CFEngine keys;
- unmount the filesystem, run the VM via virsh and return its VNC display ID in case we want to access the console.
The first step, snapshotting a golden image, requires the name of the golden image we are going to clone. Configuring the VM requires the ID of the VM itself, so that its name and IP address can be set. These two parameters are the absolute minimum, and that’s what we used. You may want to make more things parametric and gain flexibility at the expense of simplicity; your mileage may vary.
Let’s get our hands dirty: assuming that we are provisioning cf-test-v01 as a Debian wheezy machine, we’ll start with cloning the golden image:
lvcreate --snapshot /dev/vd/golden-wheezy-root --size 8GB --name v01-root --permission rw
This snapshot statically allocates 8GB of storage. Internally, LVM saves in the snapshot only the differences between the golden image and the new volume, so even if the snapshot is as big as the cloned volume it won’t really fill up unless you actually throw all that data at it. If you plan to have very short-lived machines, you could use smaller snapshots and allocate more machines. Again, YMMV.
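If you do go for smaller snapshots, it’s worth keeping an eye on how full they get; lvs reports the copy-on-write usage, for example (the snap_percent field is called data_percent on newer LVM releases):

lvs -o lv_name,origin,lv_size,snap_percent vd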
Now that you have a new filesystem for the VM, you need to ask KVM to build a VM on it:
virt-clone --original=golden-wheezy --file=/dev/vd/v01-root --preserve-data --name=v01
Here we go. That’s done, too. You now have a new machine, called v01, that is an exact copy of the machine golden-wheezy. But hold on, you’re not done yet.
If you booted that machine now, it would be… well, exactly what we just said: the same as golden-wheezy. Same name, same IP address… You could change its configuration by hand, but why? We have CFEngine! Let’s peek into the clone. The following command reloads the loop module to ensure that, once we attach the volume, a loop device is created for each partition therein:
modprobe -r loop && modprobe loop max_part=63
Now it’s reasonably safe to link /dev/vd/v01-root to /dev/loop0:
losetup /dev/loop0 /dev/vd/v01-root
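A quick sanity check (not strictly part of the procedure) is to make sure the loop device and its partition nodes actually showed up before going on:

losetup -a              # /dev/loop0 should be attached to /dev/vd/v01-root
ls /dev/loop0p*         # one device per partition inside the image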
If this operation was successful, we now have a device /dev/loop0pN (where N is an integer) for each partition inside /dev/vd/v01-root. The one we are interested in is the first. We create a mount point for it and mount it:
mkdir /mnt/v01
mount /dev/loop0p1 /mnt/v01
All going well, we now have the root partition of the VM mounted under /mnt/v01. We now have all the pieces in place to start configuring the VM. First thing, we’ll copy the policies in /var/cfengine/masterfiles on the VM’s filesystem:
rsync -avi --delete policy/ /mnt/v01/var/cfengine/masterfiles
The target directory could be anything. I chose /var/cfengine/masterfiles over /var/cfengine/inputs so that the policy doesn’t run again by accident when we boot the VM, while still living under CFEngine’s work directory, which is supposed to contain policies.
Since we make the golden images ourselves, I assume CFEngine is already properly installed there, so it’s safe to chroot and run the agent. To tell the agent which machine we are configuring we use a class with the same name, v01:
chroot /mnt/v01 /var/cfengine/bin/cf-agent -If /var/cfengine/masterfiles/promises.cf -Dv01
(The content of the policies will be shown later on in this post, read on).
The process runs entirely in the VM’s root partition: the machine is configured and, when the agent exits, we are out of the chroot.
You may think we are done here, but actually we aren’t. The VM still has the SSH host keys of golden-wheezy on board, which will make SSH suspicious when you start connecting to your VMs. You could re-generate the keys each time you provision the VM, and that would work, but then SSH would complain once again: when you re-create a VM and connect to it, SSH will find that the identification of the machine has changed. Our solution was to create the keys for the VM and cache them, so that next time we re-provision the same VM the keys will stay the same.
mkdir -p keycache/v01
ssh-keygen -q -f keycache/v01/ssh_host_dsa_key -N '' -t dsa
ssh-keygen -q -f keycache/v01/ssh_host_rsa_key -N '' -t rsa
ssh-keygen -q -f keycache/v01/ssh_host_ecdsa_key -N '' -t ecdsa
cp -v keycache/v01/* /mnt/v01/etc/ssh
Next time we provision v01 we’ll just need to copy the files from keycache/v01.
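In other words, key generation becomes idempotent. Spelled out in shell, the logic looks like this (the makefile in the appendix achieves the same thing through file targets):

for type in dsa rsa ecdsa
do
    key=keycache/v01/ssh_host_${type}_key
    # generate the key only if it's not already in the cache
    [ -f "$key" ] || ssh-keygen -q -f "$key" -N '' -t "$type"
done
cp -v keycache/v01/* /mnt/v01/etc/ssh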
A similar issue involves CFEngine’s own keys: all machines cloned from a golden image will have the same keys as the golden image, which will either be a source of confusion for the hubs if you trust keys, or will prevent proper communication between CFEngine clients and the policy hub. As before, we keep a key cache for CFEngine keys, too:
mkdir -p ppkeyscache/v01
cf-key -f ppkeyscache/v01/localhost
cp -v ppkeyscache/v01/* /mnt/v01/var/cfengine/ppkeys
chmod 400 /mnt/v01/var/cfengine/ppkeys/localhost.*
It’s now safe to unmount the VM’s filesystem
umount /mnt/v01
rmdir /mnt/v01
and shut down the loop association:
losetup -d /dev/loop0
The machine is now configured and ready to boot, let’s go:
virsh start v01
If we wanted to supervise the boot process we could use virsh again to obtain the VNC display ID of it and connect to it:
virsh vncdisplay v01
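The command prints a display specification such as :3, which can be passed straight to a VNC client. For example (assuming the KVM host is reachable as cf-test and you use vncviewer; adapt to your own client):

display=$(virsh vncdisplay v01)     # e.g. ":3"
vncviewer cf-test$display           # connects to cf-test:3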
Decommissioning a VM by hand
Destroying a VM is even simpler and happens in only three steps:
- kill the VM by using virsh destroy and virsh undefine
- remove the associated devices by using the dmsetup command
- remove the snapshot by using lvremove
To destroy the VM we use virsh destroy and virsh undefine:
virsh destroy v01
virsh undefine v01
Then we need to “unplug” the devices associated with the VM. They are those listed by the command dmsetup info -c whose name begins with vd-v01. For each of them, we need to issue a dmsetup remove command and wait a little while for the command to quiesce. A good old UNIX pipe can do the work for us:
dmsetup info -c | awk '/^vd-v01/ { print $1 }' | sort | while read DEVICE
do
    dmsetup remove $DEVICE
    sleep 1
done
When all the devices are removed successfully, you can remove the snapshot
lvremove /dev/vd/v01-root
and that’s it.
Automating the provisioning procedure
If you review the process above, you’ll notice that it’s a chain of commands, some of which are so critical that the process must stop if they fail, and some of which should run or not depending on the presence of certain files. That sounds like a perfect task for the make tool, and in fact we built a makefile to glue the pieces together. To provision the machine cf-test-v01 with wheezy, the whole bunch of commands shown above is streamlined into one single command line:
make create DIST=wheezy NAME=v01
and when we want to get rid of it it’s, again, one command:
make destroy NAME=v01
How long do the above operations take? Depending on the load of the host machine, it takes about 10-15 seconds to provision and boot a new VM, and about 5 seconds to destroy one. We currently have 29 test machines running. If we wanted to refresh the environment by destroying and re-creating them one by one, that would take between 7 and 10 minutes. Not bad at all for an environment built on the cheap!
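A full refresh is then just a loop around the two make targets. A sketch, assuming for simplicity that every VM is re-created from the wheezy golden image:

for i in $(seq -w 1 29)
do
    make destroy NAME=v$i
    make create DIST=wheezy NAME=v$i
done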
Conclusion
If you are a small shop and you have some scrap hardware that no one really uses any more, you may take advantage of our recipe to build a test environment for yourself. Just grab the details from the appendix below and enjoy!
Appendix: the code
The makefile
KEY_CACHE_DIR=keycache
KEY_FILES=$(KEY_CACHE_DIR)/$(NAME) \
	$(KEY_CACHE_DIR)/$(NAME)/ssh_host_dsa_key \
	$(KEY_CACHE_DIR)/$(NAME)/ssh_host_rsa_key \
	$(KEY_CACHE_DIR)/$(NAME)/ssh_host_ecdsa_key \
	$(KEY_CACHE_DIR)/$(NAME)/ssh_host_ed25519_key

PPKEYS_CACHE_DIR=ppkeyscache
PPKEYS_FILES=$(PPKEYS_CACHE_DIR)/$(NAME) \
	$(PPKEYS_CACHE_DIR)/$(NAME)/localhost.pub \
	$(PPKEYS_CACHE_DIR)/$(NAME)/localhost.priv

.SILENT: usage
.PHONY: _namecheck _distcheck

########################################################################
# Main targets

usage:
	echo "Usage:"
	echo "  make create DIST=dist_name NAME=vm_name"
	echo "  make destroy NAME=vm_name"
	echo
	echo -n "Available distros: "
	lvs | perl -alne 'print $$1 if $$F[0] =~ m{^golden-(.+)-root$$}' | xargs echo

create: _namecheck _distcheck lvcreate clone initvm start vncdisplay

destroy: _namecheck destroyvm removedevice lvremove

key: _namecheck $(KEY_FILES)

ppkeys: _namecheck $(PPKEYS_FILES)

send:
	rsync -avi --delete --no-owner --no-group --no-perms --exclude .git . root@cf-test:vm/

########################################################################
# Subtargets

lvcreate: _namecheck _distcheck
	lvcreate --snapshot /dev/vd/golden-$(DIST)-root --size 8GB --name $(NAME)-root --permission rw

clone: _namecheck _distcheck
	virt-clone --original=golden-$(DIST) --file=/dev/vd/$(NAME)-root --preserve-data --name=$(NAME)

start: _namecheck
	virsh start $(NAME)

initvm: _namecheck loop_setup mount_loop update_policy configure_vm deploy_keys umount_loop loop_shutdown

vncdisplay: _namecheck
	virsh vncdisplay $(NAME)

destroyvm: _namecheck
	-virsh destroy $(NAME)
	virsh undefine $(NAME)

removedevice: _namecheck
	sleep 1
	dmsetup info -c | awk '/^vd-$(NAME)/ { print $$1 }' | sort | \
	while read DEVICE ; do dmsetup remove $$DEVICE ; sleep 1 ; done

lvremove: _namecheck
	sleep 1
	lvremove /dev/vd/$(NAME)-root

loop_setup: _namecheck /dev/loop0 /dev/vd/$(NAME)-root
	modprobe -r loop && modprobe loop max_part=63
	losetup /dev/loop0 /dev/vd/$(NAME)-root

mount_loop: _namecheck /dev/loop0p1
	-mkdir /mnt/$(NAME)
	mount /dev/loop0p1 /mnt/$(NAME)

update_policy: _namecheck /mnt/$(NAME)/var/cfengine/masterfiles
	rsync -avi --delete policy/ /mnt/$(NAME)/var/cfengine/masterfiles

configure_vm: _namecheck /mnt/$(NAME)
	chroot /mnt/$(NAME) /var/cfengine/bin/cf-agent -If /var/cfengine/masterfiles/promises.cf -D$(NAME)

deploy_keys: key ppkeys
	cp -v $(KEY_CACHE_DIR)/$(NAME)/* /mnt/$(NAME)/etc/ssh
	cp -v $(PPKEYS_CACHE_DIR)/$(NAME)/* /mnt/$(NAME)/var/cfengine/ppkeys
	chmod 400 /mnt/$(NAME)/var/cfengine/ppkeys/localhost.*
	mkdir -m 700 -p /mnt/$(NAME)/root/.ssh
	cat rootkeys/* > /mnt/$(NAME)/root/.ssh/authorized_keys
	chmod 400 /mnt/$(NAME)/root/.ssh/authorized_keys

umount_loop: _namecheck /mnt/$(NAME)
	umount /mnt/$(NAME)
	-rmdir /mnt/$(NAME)

loop_shutdown: /dev/loop0
	losetup -d /dev/loop0

# Not putting a prerequisite here, to avoid make running this target
# over and over. Not having any prerequisites to check, make will
# be content if the file already exists, and that will be all.
# _namecheck is delegated to the key target, so not a big deal,
# and if you call any of these targets directly without specifying
# the NAME, you really deserve the result!
$(KEY_CACHE_DIR)/$(NAME):
	mkdir -p $(KEY_CACHE_DIR)/$(NAME)

$(KEY_CACHE_DIR)/$(NAME)/ssh_host_dsa_key:
	-ssh-keygen -q -f $(KEY_CACHE_DIR)/$(NAME)/ssh_host_dsa_key -N '' -t dsa

$(KEY_CACHE_DIR)/$(NAME)/ssh_host_rsa_key:
	-ssh-keygen -q -f $(KEY_CACHE_DIR)/$(NAME)/ssh_host_rsa_key -N '' -t rsa

$(KEY_CACHE_DIR)/$(NAME)/ssh_host_ecdsa_key:
	-ssh-keygen -q -f $(KEY_CACHE_DIR)/$(NAME)/ssh_host_ecdsa_key -N '' -t ecdsa

$(KEY_CACHE_DIR)/$(NAME)/ssh_host_ed25519_key:
	-ssh-keygen -q -f $(KEY_CACHE_DIR)/$(NAME)/ssh_host_ed25519_key -N '' -t ed25519

$(PPKEYS_CACHE_DIR)/$(NAME):
	mkdir -p $(PPKEYS_CACHE_DIR)/$(NAME)

$(PPKEYS_CACHE_DIR)/$(NAME)/localhost.pub $(PPKEYS_CACHE_DIR)/$(NAME)/localhost.priv:
	cf-key -f $(PPKEYS_CACHE_DIR)/$(NAME)/localhost

_namecheck:
	@if [ -z "$(NAME)" ] ; then \
		echo "Please specify a name/description for this device" ; \
		exit 2 ; \
	fi

_distcheck:
	@if [ -z "$(DIST)" ] ; then \
		echo "Please specify a distribution codename for this VM" ; \
		exit 2 ; \
	fi
The policies
The policies below use bundles and bodies from the CFEngine standard library.
promises.cf
###############################################################################
#
# promises.cf - Basic Policy for Community
#
###############################################################################

body common control
{
  bundlesequence => { "get_sysinfo", "global", "main" };

  inputs => {
              "cfengine_stdlib.cf",
              "fix_files.cf",
              "upgrade_system.cf",
  };

  version => "Automatic setup for Hare";
}

###############################################################################

bundle common global
{
  classes:
    debian::
      "debian_pre_wheezy"
        expression => classmatch("debian_[123456]") ;

      "debian_wheezy_or_newer"
        not => "debian_pre_wheezy" ;

    sysinfo_available::
      "running_in_vm"
        expression => regcmp("^Bochs.*$", "$(sysinfo.dmi_sys_vendor)") ;
}

bundle agent get_sysinfo
{
  vars:
      "module" string => "$(sys.workdir)/masterfiles/sysinfo" ;

  files:
      "$(module)"
        comment => "Ensure the module is executable",
        perms   => m("0755"),
        classes => if_ok("run_sysinfo") ;

  commands:
    run_sysinfo::
      "$(module)"
        module => "yes" ;

  reports:
    !run_sysinfo::
      "Could not run the sysinfo module" ;
}

bundle agent main
{
  methods:
    sysinfo_available::
      "fix_files" usebundle => fix_files ;

    sysinfo_available.running_in_vm::
      "upgrade" usebundle => upgrade_system ;

  reports:
    !sysinfo_available::
      "System information not available, won't do anything" ;

    sysinfo_available.!running_in_vm::
      "Not running in VM, not upgrading the system" ;
}

###############################################################################
# This part is for cf-agent
#
# Settings describing the details of the fixed behavioural promises made by
# cf-agent.
###############################################################################

body agent control
{
  # Global default for time that must elapse before promise will be
  # rechecked.
  # Don't keep any promises.
  any::
    # This should normally be set to an interval like 1-5 mins
    # We set it to one initially to avoid confusion.
    ifelapsed => "1";

    # Do not send IP/name during server connection if address
    # resolution is broken.
    # Comment it out if you do NOT have a problem with DNS
    # skipidentify => "true";

    # The following works together with cfengine runcontrol
    # (services/cfengine.cf)
    abortclasses => { "skip_run" } ;

  # Environment variables based on Distro
  debian::
    environment => {
      "DEBIAN_FRONTEND=noninteractive",
      # "APT_LISTBUGS_FRONTEND=none",
      # "APT_LISTCHANGES_FRONTEND=none",
    };
}
fix_files.cf
bundle agent fix_files
{
  vars:
      # association vm name => ip
      "ip[v01]" string => "192.168.1.101" ;
      "ip[v02]" string => "192.168.1.102" ;
      "ip[v03]" string => "192.168.1.103" ;
      "ip[v04]" string => "192.168.1.104" ;
      "ip[v05]" string => "192.168.1.105" ;
      "ip[v06]" string => "192.168.1.106" ;
      "ip[v07]" string => "192.168.1.107" ;
      "ip[v08]" string => "192.168.1.108" ;
      "ip[v09]" string => "192.168.1.109" ;
      "ip[v10]" string => "192.168.1.110" ;
      "ip[v11]" string => "192.168.1.111" ;
      "ip[v12]" string => "192.168.1.112" ;
      "ip[v13]" string => "192.168.1.113" ;
      "ip[v14]" string => "192.168.1.114" ;
      "ip[v15]" string => "192.168.1.115" ;
      "ip[v16]" string => "192.168.1.116" ;
      "ip[v17]" string => "192.168.1.117" ;
      "ip[v18]" string => "192.168.1.118" ;
      "ip[v19]" string => "192.168.1.119" ;
      "ip[v20]" string => "192.168.1.120" ;
      "ip[v21]" string => "192.168.1.121" ;
      "ip[v22]" string => "192.168.1.122" ;
      "ip[v23]" string => "192.168.1.123" ;
      "ip[v24]" string => "192.168.1.124" ;
      "ip[v25]" string => "192.168.1.125" ;
      "ip[v26]" string => "192.168.1.126" ;
      "ip[v27]" string => "192.168.1.127" ;
      "ip[v28]" string => "192.168.1.128" ;
      "ip[v29]" string => "192.168.1.129" ;
      "ip[v30]" string => "192.168.1.130" ;
      "ip[v31]" string => "192.168.1.131" ;
      "ip[v32]" string => "192.168.1.132" ;
      "ip[v33]" string => "192.168.1.133" ;
      "ip[v34]" string => "192.168.1.134" ;
      "ip[v35]" string => "192.168.1.135" ;
      "ip[v36]" string => "192.168.1.136" ;
      "ip[v37]" string => "192.168.1.137" ;
      "ip[v38]" string => "192.168.1.138" ;
      "ip[v39]" string => "192.168.1.139" ;
      "ip[v40]" string => "192.168.1.140" ;

      "names" slist => getindices("ip"),
        comment => "All VMs' names" ;

      "ipclass[$(names)]" string => canonify("$(ip[$(names)])"),
        comment => "class names for the IP of each VM" ;

      "shortname"
        comment    => "set a short name if $(names) is defined",
        string     => "$(names)",
        ifvarclass => "$(names)" ;

    has_shortname::
      "hostname" string => "cf-test-$(shortname)" ;
      "fullname" string => "$(hostname).example.com" ;
      "myip"     string => "$(ip[$(shortname)])" ;

  classes:
    any::
      "after_first_step"
        expression => "has_shortname",
        comment => "this class will be defined only the pass after has_shortname is defined" ;

      "$(names)" expression => "$(ipclass[$(names)])" ;

      "has_shortname" expression => isvariable("shortname") ;

  methods:
    after_first_step::
      "interfaces"
        usebundle => fix_interfaces("$(fix_files.myip)"),
        comment   => "This will set up /etc/network/interfaces" ;

      "hostname"
        usebundle => fix_hostname,
        comment   => "This will fix /etc/hostname" ;

      "mailname"
        usebundle => fix_mailname,
        comment   => "This will fix /etc/mailname" ;

      "hosts"
        usebundle => fix_hosts,
        comment   => "This will fix /etc/hosts" ;

      "postfix"
        usebundle => fix_postfix,
        comment   => "This will fix /etc/postfix/main.cf" ;

      "net_rules"
        usebundle => fix_net_rules,
        comment   => "This will fix /etc/udev/rules.d/70-persistent-net.rules" ;

      "fix_rc_local"
        usebundle => fix_rc_local,
        comment   => "Update rc.local to ping the gateway" ;

      "cfengine"
        usebundle => disable_cfe3,
        comment   => "This will disable cfengine3 at boot" ;

  commands:
    running_in_vm.restart_postfix::
      "/etc/init.d/postfix restart" ;

  reports:
    after_first_step::
      "Identified this machine as $(shortname)" ;
      "Hostname set to $(hostname)" ;
      "FQDN set to $(fullname)" ;

    running_in_vm.restart_postfix::
      "postfix configuration fixed, service restarted" ;

    !running_in_vm.restart_postfix::
      "postfix configuration fixed, but not running in a VM: service NOT restarted" ;
}

########################################################################
# Ensuring correct information into /etc/network/interfaces

bundle agent fix_interfaces(ip)
{
  vars:
      "file" string => "/etc/network/interfaces" ;

  files:
      "$(file)"
        comment       => "Interface configuration from template",
        edit_template => "$(sys.workdir)/masterfiles/interfaces.tmpl",
        edit_defaults => empty,
        classes       => if_repaired("require_reboot") ;

  reports:
    !running_in_vm.require_reboot::
      "Interface configuration changed" ;

    running_in_vm.require_reboot::
      "Interface configuration changed, PLEASE REBOOT" ;
}

########################################################################
# Ensuring correct information into /etc/hostname

bundle agent fix_hostname
{
  files:
      "/etc/hostname"
        edit_line     => set_hostname,
        edit_defaults => empty ;
}

bundle edit_line set_hostname
{
  insert_lines:
      "$(fix_files.hostname)" ;
}

########################################################################
# Ensuring correct information into /etc/hosts

bundle agent fix_hosts
{
  files:
      "/etc/hosts"
        edit_line => add_host_line,
        classes   => if_repaired("report_hosts_fixed") ;

  reports:
    report_hosts_fixed::
      "/etc/hosts was fixed" ;
}

bundle edit_line add_host_line
{
  vars:
      "shortcut" slist => { "myip", "fullname", "hostname" },
        comment => "Variable to shortcut from fix_files" ;

      "$(shortcut)" string => "$(fix_files.$(shortcut))",
        comment => "Shortcut to fix_files.$(shortcut)" ;

  delete_lines:
      "127\.0\.1\.1\s+.*" ;

  insert_lines:
      "$(myip)$(const.t)$(fullname)$(const.t)$(hostname)"
        comment => "Ensuring /etc/hosts has a correct line for us" ;
}

########################################################################
# Ensuring correct information into /etc/mailname

bundle agent fix_mailname
{
  files:
      "/etc/mailname"
        edit_line     => set_fullname,
        edit_defaults => empty,
        classes       => if_repaired("restart_postfix") ;
}

bundle edit_line set_fullname
{
  insert_lines:
      "$(fix_files.fullname)" ;
}

# Ensuring correct information into /etc/postfix/main.cf

bundle agent fix_postfix
{
  vars:
      "conf[myhostname]"    string => " $(fix_files.fullname)" ;
      "conf[mydestination]" string => " $(fix_files.fullname), $(fix_files.hostname), localhost" ;

  files:
      "/etc/postfix/main.cf"
        edit_line => set_variable_values("fix_postfix.conf"),
        classes   => if_repaired("restart_postfix") ;
}

########################################################################
# Ensuring correct information into /etc/udev/rules.d/70-persistent-net.rules
# HOPEFULLY!!!

bundle agent fix_net_rules
{
  files:
      "/etc/udev/rules.d/70-persistent-net.rules"
        delete  => tidy,
        classes => if_repaired("reboot_net_rules"),
        comment => "Remove this file so that it is properly regenerated" ;

  reports:
    !running_in_vm.reboot_net_rules::
      "persistent-net rules removed" ;

    running_in_vm.reboot_net_rules::
      "persistent-net rules removed, PLEASE REBOOT!" ;
}

bundle agent disable_cfe3
{
  vars:
      "conf[RUN_CF_SERVERD]"  string => "0" ;
      "conf[RUN_CF_EXECD]"    string => "0" ;
      "conf[RUN_CF_MONITORD]" string => "0" ;
      "conf[RUN_CF_HUB]"      string => "0" ;

  files:
      "/etc/default/cfengine3"
        edit_line => set_variable_values("disable_cfe3.conf"),
        classes   => if_repaired("stop_cfe3") ;

  commands:
    running_in_vm.stop_cfe3::
      "/etc/init.d/cfengine3 stop" ;
}

# Fix rc.local, so that a VM sends one ICMP packet to the gateway
# at boot. This is useful when a machine is destroyed and re-created
# immediately to update the ARP table on the switch

bundle agent fix_rc_local
{
  files:
      "/etc/rc.local"
        perms     => mog("0755","root","root"),
        copy_from => local_dcp("$(sys.workdir)/masterfiles/rc.local") ;
}
upgrade_system.cf
bundle agent upgrade_system
{
  vars:
      "env" string => "DEBIAN_FRONTEND=noninteractive LC_ALL=C" ;
      "opt" string => "-o Dpkg::Options::=--force-confold -o Dpkg::Options::=--force-confdef --yes" ;

  commands:
    !package_list_updated::
      "/usr/bin/env $(env) /usr/bin/apt-get $(opt) update"
        classes => if_ok("package_list_updated") ;

    package_list_updated::
      "/usr/bin/env $(env) /usr/bin/apt-get $(opt) upgrade"
        classes => if_ok("system_upgraded") ;

  reports:
    system_upgraded::
      "System upgraded" ;
}
Scripts
The policies make use of the following scripts: sysinfo is run as a module, while rc.local is copied as-is onto the VM:
rc.local
#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.

/bin/ping -c 1 -w 1 192.168.1.1 > /dev/null 2>&1

exit 0
sysinfo
#!/bin/bash

# Use extended globbing, see man bash
shopt -s extglob

DMIDIR=/sys/devices/virtual/dmi/id

echo "+sysinfo_available"

for FILE in sys_vendor product_name product_serial product_uuid
do
    unset DMI_CONTENT

    # Get the contents of FILE in DMI_CONTENT, if null, set to "N/A"
    DMI_CONTENT="`cat $DMIDIR/$FILE 2> /dev/null`"
    DMI_CONTENT=${DMI_CONTENT:="N/A"}
    TRIM_DMI_CONTENT=${DMI_CONTENT%%*( )}

    # throw out the result
    echo "=dmi_${FILE}=${TRIM_DMI_CONTENT}"
done

# Print the following only if there is content
DEBIAN_VERSION="`cat /etc/debian_version 2> /dev/null`"
if [ -n "${DEBIAN_VERSION}" ]
then
    echo "=debian_version=$DEBIAN_VERSION"
fi

LOADAVG_INFO="`cat /proc/loadavg 2> /dev/null`"
if [ -n "${LOADAVG_INFO}" ]
then
    LOADAVG=$( echo "${LOADAVG_INFO}" | awk '{ print $1" "$2" "$3 }' )
    echo "=loadavg=$LOADAVG"

    PCOUNT=$( echo "${LOADAVG_INFO}" | awk '{ print $4 }' | awk -F/ '{ print $2 }' )
    echo "=process_count=$PCOUNT"
fi

echo "=user_count=`who | wc -l`"
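You can run the script by hand to see what it feeds to the agent: the leading + line defines a class, while each = line becomes a variable in the sysinfo context. On one of our VMs the output would look roughly like this (values are obviously machine-dependent; the Bochs vendor string is what the running_in_vm class checks for):

+sysinfo_available
=dmi_sys_vendor=Bochs
=dmi_product_name=Bochs
=dmi_product_serial=N/A
=dmi_product_uuid=N/A
=debian_version=7.6
=loadavg=0.05 0.03 0.05
=process_count=84
=user_count=1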
How to build a golden image: a checklist
Your first golden image will be a VM that you install on a separate logical volume and whose configuration file you’ll save aside. Later on, when you want to make more golden images similar to the first one, you’ll start from that one as a model and edit it to fit the new system. In the following checklist we assume you already have such a “first VM”, installed with Debian Squeeze, and that its configuration file is called golden-squeeze.xml. You’ll also need an ISO image of the installer disk for the new system. To ease the process, it’s very handy to have virt-manager installed (possibly a recent one) and a connection to the KVM host configured.
- log in to the KVM server as root
cd /etc/libvirt/qemu
- copy one of the golden-*.xml aside to use it as a reference; e.g.:
cp golden-squeeze.xml golden-test.xml
- create a new LVM volume for the machine; the following line will create an 8G volume
lvcreate --size 8G --name golden-$OSID-root vd
where $OSID is a descriptive name for this OS. We’ll refer to
$OSID
in the rest of the document. The volume must be named golden-$OSID-root (e.g.: golden-wheezy-root)
- edit the copied XML file and change at least the following settings:
- /domain/name: change the name; call it
golden-$OSID
- /domain/uuid: remove it to have it set automatically by KVM
- /domain/os: if not already present, add
<boot dev='cdrom'/>
right below
<boot dev='hd'/>
- /domain/devices/source[dev]: change it to the name of the LVM volume you’ll be using as golden image (see above);
- /domain/devices/disk[device='cdrom']:
- if you want to point this machine to the ISO image of the installer disk, then you either adapt an existing cdrom disk element, or add one and then adapt it. A sample is included at the end of this post; you should modify it as follows:
- set disk[type] to file
- in the same disk element, add a source element:
<source file='/opt/iso/debian-7.6.0-amd64-netinst.iso'/>
- if you just want to enable the device, see the “Settings to access the CDROM” section below
- /domain/devices/interface/mac: remove the element to have the MAC automatically set
- save the file, copy it aside
- try to define the new VM with:
virsh define golden-test.xml
- the machine is now defined and can be booted for the installation:
virsh start golden-$OSID
- when requested, set the IP address that was reserved for golden images
- when the machine is created, installed and fully functional:
- install your SSH key in root’s authorized_keys to ensure that you’ll be able to reach newly created machines even if something goes wrong in their set-up.
- install CFEngine
- install any other software that you may need (e.g.: we install postfix in this phase)
- disable the network by commenting out
auto eth0
in /etc/network/interfaces
- reboot and check that eth0 is not configured (virt-manager is so handy here!)
- shut it down
Settings to access the CDROM
Have this in your /domain/devices element:
<disk type='block' device='cdrom'>
  <driver name='qemu' type='raw'/>
  <target dev='hdc' bus='ide'/>
  <readonly/>
  <address type='drive' controller='0' bus='1' unit='0'/>
</disk>
Network configuration for a golden image
You’ll set up eth0 along these lines:
# The primary network interface
# allow-hotplug eth0
# auto eth0
iface eth0 inet static
    address 192.168.1.100
    netmask 255.255.255.0
    network 192.168.1.0
    broadcast 192.168.1.255
    gateway 192.168.1.1
Remember to comment out
auto eth0
before publishing the image.