Git repository and deployment procedures for CFEngine policies

This is the first of three installments in which I’ll talk about how we structured the git repository for our CFEngine policies, together with our deployment procedures. This first episode covers how the repository was (badly) structured before and how we restructured it (better), and it introduces our deployment procedures based on GNU Make. The second installment will talk about how we built upon the deployment procedure and made it easier. The third will be about how we greatly simplified running the agent by hand on our nodes, so that even people who are not CFEngine-savvy can do the right thing with little to no knowledge of CFEngine.

Today I am pretty happy with our code repository structure and our deployment procedures, but getting there was not pain-free.

“What’s the right way to structure a code repository?” This question is asked over and over, every day and all over the world. And the most common answer is, in all probability: “it depends”. There is no general, one-size-fits-all answer; there is no right answer, sorry. Luckily, there are good best practices for some particular domains, but when you are working in a domain that is not mainstream, you’re mostly left on your own.

And I happen to code a lot in a domain that is not mainstream (yet), or not as mainstream as other types of code (yet): configuration management. It is a domain that is still in that phase where a new tool is released every week, claiming to address the shortcomings of all the other tools, and where people are still experimenting a lot with both the tools and the ancillary outfit that goes along with them: programming languages, best practices, version control systems, testing and deployment procedures and whatnot. When it comes to deciding which repository structure fits best, all you have is your knowledge of the problem, your knowledge of the version control system, and some common sense.

Stage 1: the garage times

In the beginning, it was Puppet, and it was used in just one project. I had one repository for the Puppet code and a separate one for the deployment procedures, while the documentation lived in yet another place and was versioned separately. I abandoned Puppet before the shortcomings of this approach exploded, so it was natural to replicate the same structure when we switched to CFEngine.

Then a second project was added, and some code separation was needed. Some code was useful in both projects, and each project could take advantage of the improvements made in the other one.

That’s where another bad choice was made. Under the principle of “the master must always be right”, I created two separate branches for each project: one was the “project master” and the other was the development branch for that project. Every project branch included all and only those files that were supposed to be deployed in the masterfiles directories on the policy hubs. When improvements were made in one project, they were supposed to be manually merged into master and, from there, into the other project. (If you are thinking that it looks like we had many separate repositories inside a physical one, you’re probably quite right.)

Of course, it didn’t take long to notice that this approach simply didn’t work. Merging into master was fully manual work, long and problematic; and in the end the master branch made no sense per se: it could not be deployed anywhere “as is”. It was supposed to be a sort of library for the common stuff, but it kept falling behind because of the long intervals between merges. And that became more and more evident when yet another project came in.

Stage 2: the re-engineering

In late spring last year, I became fully convinced that we needed a different organization to scale. There were just too many repositories, and both the common parts and the per-project parts needed to live in the same place; having a solid common set of policies and libraries was as important as being able to override those common parts on a per-project basis. Having the repository and its branches “mimic” the masterfiles directory was no longer possible; rather, we needed to store “parts” of those trees (a common part and a project part) in subdirectories inside the same branch or branches, and merge those subdirectories together at deployment time. Repository branches would still exist, but only for their normal use: development and integration.

During the summer, the full design was finalized and I started working on the move from the old structure to the new, and from the old repository to a new one. It was not easy, as the structure had to be redone completely: the common parts needed to be isolated and stored in their own subtree, and so did the parts related to each specific project. It took some iterations and refinements, but the final result was sound. The final structure contains one directory for each project (proj1/, proj2/, … projN/) plus a common/ directory. Each of these directories is structured like a masterfiles directory (with separate directories for libraries, controls, services and so forth) and contains its own promises.cf and its own “site library” (a library containing variables, settings and bundles that are specific to a certain project). The common directory also contains a tools/ subdirectory with all the ancillary tools and scripts that are in use across all the projects.
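
To make the layout concrete, here is a minimal shell sketch that builds an empty skeleton of such a repository. It is only an illustration: the function name make_policy_skeleton, the subdirectory names (controls, libraries, services) and the project names in the usage line are assumptions, since the real directory names are not spelled out here.

```shell
#!/bin/sh
# Hypothetical helper: create an empty skeleton of the repository layout
# described above. Directory names are illustrative assumptions.
make_policy_skeleton() {
    root=$1
    shift
    # common/ plus one directory per project, each shaped like masterfiles
    for tree in common "$@"; do
        for sub in controls libraries services; do
            mkdir -p "$root/$tree/$sub"
        done
        # every tree carries its own promises.cf
        touch "$root/$tree/promises.cf"
    done
    # ancillary tools shared by all projects live under common/ only
    mkdir -p "$root/common/tools"
}
```

For example, make_policy_skeleton ./policy-repo proj1 proj2 would create common/, proj1/ and proj2/ side by side under ./policy-repo, all in the same branch.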

To merge the common and per-project directories and deploy the result, I built a Makefile. One would run make deploy, specifying a project name, a git branch name, and a hub address or hostname, and make would:

  • create a temporary directory
  • copy the contents of the common directory into it
  • copy the contents of the project directory on top, so that project-specific files overwrite the common ones wherever both exist
  • use rsync to deploy the contents of the temporary directory to the policy hub specified
  • remove the temporary directory
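
The merge steps in the list above can be sketched in plain shell. This is a simplified illustration rather than the Makefile's actual recipe: the function name merge_policy_trees is hypothetical, and it uses cp -R where the Makefile at the end of this post uses rsync -a.

```shell
#!/bin/sh
# Sketch of the "common first, project on top" merge.
# merge_policy_trees is a hypothetical name, not part of the Makefile.
merge_policy_trees() {
    common_dir=$1
    project_dir=$2
    # 1. create a temporary staging directory
    staging=$(mktemp -d) || return 1
    # 2. copy the contents of the common tree into it
    cp -R "$common_dir/." "$staging/" || return 1
    # 3. copy the project tree on top; files with the same relative
    #    path replace the common ones
    cp -R "$project_dir/." "$staging/" || return 1
    # print the staging path so the caller can deploy it and remove it
    printf '%s\n' "$staging"
}
```

The caller would then rsync the staging directory to the hub and delete it afterwards, which covers the last two steps of the list. The order of the two copies is the whole point: whatever is copied last wins.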

The process of merging the common and project parts prior to deployment is illustrated in the image below:

[Image: GitRepoStructure]

The Makefile also provided a target to deploy to the local filesystem instead of a remote server (deploy_local), targets to preview a change (running rsync -n) before actually deploying the files, and a display target to explain in natural language what would happen if one ran the deploy target instead of display. On top of this foundation we later built the diff target, which prepares the files as if we were about to deploy them, downloads the files from the remote hub’s masterfiles directory to a local directory, and finally runs a recursive diff on the two sets to show the differences on a per-file basis.
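
The comparison step of the diff target can be sketched as a small shell function. The name policy_diff is hypothetical; the diff options are the ones set in DIFF_OPTS in the Makefile below. One choice here is my assumption and differs from the Makefile, which ignores diff's exit status altogether: the sketch treats status 1 (differences found) as success but still reports status 2 (trouble).

```shell
#!/bin/sh
# Sketch: recursive, whitespace-insensitive diff of deployed vs. staged
# policies. policy_diff is a hypothetical name.
policy_diff() {
    deployed_dir=$1
    staged_dir=$2
    status=0
    # -r recurse, -w ignore whitespace, -N treat missing files as empty
    diff -r -w -N "$deployed_dir" "$staged_dir" || status=$?
    # diff exits 1 when the trees differ: that is a result, not a failure
    if [ "$status" -le 1 ]; then
        return 0
    fi
    return "$status"
}
```

The first argument would be the local copy downloaded from the hub and the second the freshly prepared staging directory, so that added lines in the output are what a deploy would introduce.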

All this worked pretty well and was functional, but it still had some annoyances, which we’ll talk about in the next post. We conclude this first episode with the code of the Makefile. Check back on this blog to find out how we made the deployment procedure even easier and how we simplified the way we run the agent on a node.

Update, February 16th, 2015: the latest code for the Makefile is now on GitHub.

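# Defaults; override them on the command line as shown by the usage target
BRANCH=master
MASTERDIR=/var/cfengine/masterfiles
LOCALDIR=/var/cfengine/git
SERVER=_UNDEFINED_
# Placeholder default, like SERVER: with PROJECT empty, prepare would copy
# the whole repository into the staging directory
PROJECT=_UNDEFINED_
# HUB_LISTS has no default: it names the file listing the hubs used by
# the *_multi targets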

RSYNC_USER=root
RSYNC_PREPARE_OPTS=-a
RSYNC_COMMON_OPTS=--delete --exclude .git --delete-excluded --no-owner --no-group --no-perms --no-times --checksum
RSYNC_OPTS=$(RSYNC_PREPARE_OPTS) -vi $(RSYNC_COMMON_OPTS)

DIFF_OPTS=-r -w -N

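TMP_BASE=/var/tmp
TMP_TEMPLATE=cf-deploy-tmp-XXXXXXXX
# Both directories are created when make parses this file, so every
# invocation (even "make usage") creates them; targets that do not end
# with cleanup leave them behind, hence the distclean target.
TMP_DIR:=$(shell /bin/mktemp -d --tmpdir=$(TMP_BASE) $(TMP_TEMPLATE))
DIFF_DIR:=$(shell /bin/mktemp -d --tmpdir=$(TMP_BASE) $(TMP_TEMPLATE))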

usage:
	@echo "Usage:"
	@echo "  make deploy PROJECT=projectname SERVER=servername"
	@echo "  make deploy_local PROJECT=projectname"
	@echo ""
	@echo "  use BRANCH, MASTERDIR, LOCALDIR, PROJECT, RSYNC_USER to customise"
	@echo ""
	@echo "  To preview a change, replace deploy with preview above"
	@echo "  (make preview ... / make preview_local ...)"
	@echo ""
	@echo "  To get a diff of repository code against deployed code"
	@echo "  make diff/make diff_local (same options as above)"
	@echo ""
	@echo "  To cleanup our leftovers in $(TMP_BASE)"
	@echo "  make distclean"
	@echo ""
	@echo "  For a verbose explanation, use make display (same parms as deploy)"
	@echo ""


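# MAIN TARGETS #########################################################

# None of these targets produces a file with its own name
.PHONY: usage deploy deploy_local deploy_multi preview preview_local \
        preview_multi diff diff_local display prepare prepare_diff run_diff \
        run_diff_local cleanup distclean git_update sync_remote sync_multi \
        sync_local syncview_remote syncview_multi syncview_local not_supported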

deploy:        prepare sync_remote     cleanup

deploy_local:  prepare sync_local      cleanup

deploy_multi:  prepare sync_multi      cleanup

preview:       prepare syncview_remote cleanup

preview_local: prepare syncview_local  cleanup

preview_multi: prepare syncview_multi  cleanup

diff:          prepare prepare_diff    run_diff cleanup

diff_local:    prepare run_diff_local  cleanup

display:
	@echo "Would checkout branch:         $(BRANCH)"
	@echo "Would deploy project:          $(PROJECT)"
	@echo "Would deploy source directory: $(LOCALDIR)"
	@echo "Local destination is:          $(MASTERDIR)"
	@echo "Remote destination is:         $(SERVER)'s $(MASTERDIR) directory"
	@echo "Remote user is:                $(RSYNC_USER)"
	@echo ""
	@echo "Use make deploy/make deploy_local to proceed"


# HELPER TARGETS #######################################################

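# Stage the files: common first, then the project tree on top, so that
# project files replace common files with the same path
prepare: /bin/mktemp $(TMP_DIR) git_update
	rsync $(RSYNC_PREPARE_OPTS) $(LOCALDIR)/common/     $(TMP_DIR)/
	rsync $(RSYNC_PREPARE_OPTS) $(LOCALDIR)/$(PROJECT)/ $(TMP_DIR)/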

prepare_diff: $(DIFF_DIR)
	rsync -z $(RSYNC_PREPARE_OPTS) $(RSYNC_COMMON_OPTS) $(RSYNC_USER)@$(SERVER):$(MASTERDIR)/ $(DIFF_DIR)/

run_diff:
	-diff $(DIFF_OPTS) $(DIFF_DIR)/ $(TMP_DIR)/

run_diff_local:
	-diff $(DIFF_OPTS) $(MASTERDIR) $(TMP_DIR)

cleanup: $(TMP_DIR)
	rm -rf $(TMP_DIR)
	rm -rf $(DIFF_DIR)

distclean:
	-rm -rf $(TMP_BASE)/cf-deploy-tmp-*


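# Check out the requested branch, then try to update it; a failed pull
# is ignored (leading "-"), so the run goes on with the local copy
git_update: $(LOCALDIR)
	cd $(LOCALDIR) && git checkout $(BRANCH)
	-cd $(LOCALDIR) && git pull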

sync_remote:
	rsync -z $(RSYNC_OPTS) $(TMP_DIR)/ $(RSYNC_USER)@$(SERVER):$(MASTERDIR)/

sync_multi:
	for SERVER in `cat $(HUB_LISTS)` ; \
	do \
	rsync -z $(RSYNC_OPTS) $(TMP_DIR)/ $(RSYNC_USER)@$$SERVER:$(MASTERDIR)/ ; \
	done

sync_local:
	rsync $(RSYNC_OPTS) $(TMP_DIR)/ $(MASTERDIR)/

syncview_remote:
	rsync -n $(RSYNC_OPTS) $(TMP_DIR)/ $(RSYNC_USER)@$(SERVER):$(MASTERDIR)/

syncview_multi:
	for SERVER in `cat $(HUB_LISTS)` ; \
	do \
	echo "on $$SERVER" ; \
	rsync -n $(RSYNC_OPTS) $(TMP_DIR)/ $(RSYNC_USER)@$$SERVER:$(MASTERDIR)/ ; \
	echo "" ; \
	done

syncview_local:
	rsync -n $(RSYNC_OPTS) $(TMP_DIR)/ $(MASTERDIR)/

not_supported:
	@echo "The action you requested is not supported"
	exit 1

/bin/mktemp:
	@echo "Your system doesn't have mktemp"
	@echo "Using this procedure without mktemp could seriously damage"
	@echo "your system, bailing out"
	exit 16