This is the first installment of three, in which I’ll talk about how we structured the git repository for our CFEngine policies together with the deployment procedures. This first episode covers how the repository was (badly) structured before and how we redid it (better), and it introduces our deployment procedures based on GNU Make. The second installment will talk about how we built upon the deployment procedure and made it easier to use. The third installment will be about how we greatly simplified the way we manage agent runs by hand on our nodes, so that even the non-CFEngine-savvy can do the right thing with little to no knowledge of CFEngine.
Today I am pretty happy with our code repository structure and our deployment procedures, but getting there was not pain-free.
“What’s the right way to structure a code repository?” This question is asked over and over, every day and all over the world. And the most common answer is, most probably: “it depends”. There is no general, one-size-fits-all answer; there is no right answer, sorry. Luckily, there are good best practices for some particular domains, but when you are working in a domain that is not mainstream, you’re mostly left on your own.
And I happen to code a lot in a domain that is not mainstream (yet), or not as mainstream as other types of code (yet): configuration management. It is a domain that is still in that phase where a new tool is released every week, claiming to address the shortcomings of all the other tools, and where people are still experimenting a lot with both the tools and the ancillary equipment that goes along with them: programming languages, best practices, version control systems, testing and deployment procedures and whatnot. When it comes to deciding which repository structure fits best, all you have is your knowledge of the problem, your knowledge of the version control system, and some common sense.
Stage 1: the garage times
In the beginning, it was puppet, and it was used in just one project. I had one repository for the puppet code, a separate one for the deployment procedures, while the documentation was in yet another place and versioned separately. I abandoned puppet before the shortcomings of this approach exploded, so it was natural to replicate the same structure when we switched to CFEngine.
Then a second project was added, and some code separation was needed. There was code that was useful in both projects and where one of the projects could take advantage of the improvements done in the other one.
That’s where another bad choice was made. Under the principle of “the master must always be right”, I created two separate branches for each project, where one was the “project master” and the other was the development branch for that project. Every project branch included all and only those files that were supposed to be deployed in the masterfiles directories on the policy hubs. When improvements were made in one project, those were supposed to be manually merged into master and, from there, into the other project. (If you are thinking that it looks like we had many separate repositories inside a physical one, you’re probably quite right.)
Of course, it didn’t take long to notice that this approach simply didn’t work. Merging into master was fully manual work, long and problematic; and in the end the master branch made no sense per se: it could not be deployed anywhere “as is”. It was supposed to be a sort of library for the common stuff, but it kept falling behind because of the long intervals between merges. And that became more and more evident when yet another project came in.
Stage 2: the re-engineering
In late spring last year, I became fully convinced that we needed a different organization to scale. There were just too many repositories, and both the common parts and the per-project parts needed to live in the same place; having a solid common set of policies and libraries was as important as being able to override those common parts on a per-project basis. Having the repository and its branches “mimic” the masterfiles directory was no longer possible; rather, we needed to store “parts” of those trees (a common part and a project part) in subdirectories inside the same branch or branches, and merge those subdirectories together at deployment time. Repository branches would still exist, but only for their normal use: development and integration.
During the summer, the full design was finalized and I started to work to move from the old structure to the new, and from the old repository to a new one. It was not easy, as the structure had to be redone completely: common parts needed to be isolated and stored into their subtree, and so were the parts related to a specific project. It took some iterations and refinements, but the final result was sound. The final structure contains one directory for each project (projN/) plus a common/ directory. Each one of these directories is structured like a masterfiles directory (with a separate directory for libraries, controls, services and so forth) and contains its own promises.cf and its own “site library” (a library containing variables, settings and bundles that are special for a certain project). The common directory also contains a tools/ subdirectory with all the ancillary tools and scripts that are in use for all the projects.
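To make the layout easier to picture, here is a small shell sketch that builds an empty skeleton of it in a throwaway directory. Only common/, the per-project directories, tools/ and promises.cf come from the description above; the subdirectory names (libraries, controls, services), the project names proj1 and proj2, and the file name site.cf are assumptions made up for the illustration.

```shell
#!/bin/sh
# Skeleton of the repository layout: one tree per project plus a common tree.
# Names other than common/, tools/ and promises.cf are illustrative assumptions.
set -eu

repo=$(mktemp -d)               # throwaway location for the demo
trap 'rm -rf "$repo"' EXIT

for tree in common proj1 proj2; do
    # Each tree is shaped like a masterfiles directory.
    mkdir -p "$repo/$tree/libraries" "$repo/$tree/controls" "$repo/$tree/services"
    : > "$repo/$tree/promises.cf"          # each tree has its own entry point
    : > "$repo/$tree/libraries/site.cf"    # hypothetical name for the "site library"
done

# Ancillary tools and scripts shared by all projects live only under common/.
mkdir -p "$repo/common/tools"

layout=$(cd "$repo" && find common proj1 proj2 | sort)
printf '%s\n' "$layout"
```

The point of the shape is that any project tree can be laid over the common tree file by file, because both follow the same masterfiles-like structure.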
When it came to merging the common and the per-project directories and deploying the result, I built a Makefile. One would run make deploy and specify a project name, a git branch name, and a hub address or hostname, and the Makefile would:
- create a temporary directory
- copy the contents of the common directory into it
- copy the contents of the project directory on top, overwriting the common files with project-specific ones where both exist
- run rsync to deploy the contents of the temporary directory to the specified policy hub
- remove the temporary directory
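The steps above can be sketched in a few lines of shell. This is a simplified, local-only illustration and not the Makefile itself: cp stands in for rsync so that the example runs without a remote hub, the hub's masterfiles directory is just a local directory, and the file names and contents are made up. Unlike rsync with --delete, cp does not remove stale files at the destination.

```shell
#!/bin/sh
# Merge-then-deploy sketch, run entirely on the local filesystem.
# Assumption: cp replaces rsync, and "$work/hub" plays the policy hub.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Demo repository: one file only in common/, one overridden by the project.
mkdir -p "$work/repo/common/libraries" "$work/repo/proj1" "$work/hub"
echo "common promises" > "$work/repo/common/promises.cf"
echo "common library"  > "$work/repo/common/libraries/lib.cf"
echo "proj1 promises"  > "$work/repo/proj1/promises.cf"

deploy() {
    project=$1
    dest=$2
    staging=$(mktemp -d)                         # create a temporary directory
    cp -R "$work/repo/common/." "$staging/"      # common files go in first
    cp -R "$work/repo/$project/." "$staging/"    # project files overwrite them
    cp -R "$staging/." "$dest/"                  # deploy the merged tree
    rm -rf "$staging"                            # remove the temporary directory
}

deploy proj1 "$work/hub"
cat "$work/hub/promises.cf"
```

After the run, the destination holds the project's promises.cf together with the library that exists only in common/, which is the override behaviour the copy order is meant to give.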
The process of merging the common and project parts prior to deployment is illustrated in the image below:
The Makefile also provided a target to deploy on the local filesystem instead of a remote server (deploy_local), targets to preview a change (running rsync -n) before actually deploying the files, and a display target to explain in natural language what would happen if one ran the deploy target instead of display. On top of this foundation we later built the diff target, where we would prepare the files as if we were about to deploy them, then download the files from the remote hub’s masterfiles directory to a local directory, and finally run a recursive diff on these two sets to see the differences on a per-file basis.
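The idea behind the diff target can be tried out with two local directories. In this sketch, which is an illustration under assumptions rather than the real target, "deployed" plays the copy downloaded from the hub and "staged" plays the merged tree that is ready to be deployed; the file names and contents are invented. The diff options are the same ones the Makefile uses (-r -w -N).

```shell
#!/bin/sh
# Compare what is deployed with what would be deployed.
# Assumption: both trees are local; the real target fetches the hub's copy first.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/deployed" "$work/staged"
echo "old policy" > "$work/deployed/promises.cf"
echo "new policy" > "$work/staged/promises.cf"       # changed file
echo "a b"        > "$work/deployed/whitespace.cf"
echo "a    b"     > "$work/staged/whitespace.cf"     # differs only in whitespace
echo "added"      > "$work/staged/new.cf"            # exists only in the staged tree

# -r: recurse, -w: ignore whitespace, -N: treat missing files as empty.
# diff exits non-zero when it finds differences, so the status is ignored,
# much like the leading "-" on the diff recipe lines in the Makefile.
report=$(diff -r -w -N "$work/deployed" "$work/staged" || true)
printf '%s\n' "$report"
```

The report mentions promises.cf and new.cf, while whitespace.cf stays out of it because -w ignores whitespace-only changes.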
All this worked pretty well and was functional, but it still had some annoyances, and we’ll talk about them in the next post. We conclude this first episode with the code of the Makefile. Check back on this blog to learn how we made the deployment procedure even easier and how we simplified the way we run the agent on a node.
Update, February 16th, 2015: the latest code for the Makefile is now on github.
BRANCH=master
MASTERDIR=/var/cfengine/masterfiles
LOCALDIR=/var/cfengine/git
SERVER=_UNDEFINED_
RSYNC_USER=root
RSYNC_PREPARE_OPTS=-a
RSYNC_COMMON_OPTS=--delete --exclude .git --delete-excluded --no-owner --no-group --no-perms --no-times --checksum
RSYNC_OPTS=$(RSYNC_PREPARE_OPTS) -vi $(RSYNC_COMMON_OPTS)
DIFF_OPTS=-r -w -N
TMP_BASE=/var/tmp
TMP_TEMPLATE=cf-deploy-tmp-XXXXXXXX
TMP_DIR:=$(shell /bin/mktemp -d --tmpdir=$(TMP_BASE) $(TMP_TEMPLATE))
DIFF_DIR:=$(shell /bin/mktemp -d --tmpdir=$(TMP_BASE) $(TMP_TEMPLATE))

usage:
	@echo "Usage:"
	@echo " make deploy PROJECT=projectname SERVER=servername"
	@echo " make deploy_local PROJECT=projectname"
	@echo ""
	@echo " use BRANCH, MASTERDIR, LOCALDIR, PROJECT, RSYNC_USER to customise"
	@echo ""
	@echo " To preview a change, replace deploy with preview above"
	@echo " (make preview ... / make preview_local ...)"
	@echo ""
	@echo " To get a diff of repository code against deployed code"
	@echo " make diff/make diff_local (same options as above)"
	@echo ""
	@echo " To cleanup our leftovers in $(TMP_BASE)"
	@echo " make distclean"
	@echo ""
	@echo " For a verbose explanation, use make display (same parms as deploy)"
	@echo ""

# MAIN TARGETS #########################################################
deploy: prepare sync_remote cleanup

deploy_local: prepare sync_local cleanup

deploy_multi: prepare sync_multi cleanup

preview: prepare syncview_remote cleanup

preview_local: prepare syncview_local cleanup

preview_multi: prepare syncview_multi cleanup

diff: prepare prepare_diff run_diff cleanup

diff_local: prepare run_diff_local cleanup

display:
	@echo "Would checkout branch: $(BRANCH)"
	@echo "Would deploy project: $(PROJECT)"
	@echo "Would deploy source directory: $(LOCALDIR)"
	@echo "Local destination is: $(MASTERDIR)"
	@echo "Remote destination is: $(SERVER)'s $(MASTERDIR) directory"
	@echo "Remote user is: $(RSYNC_USER)"
	@echo ""
	@echo "Use make deploy/make deploy_local to proceed"

# HELPER TARGETS #######################################################
prepare: /bin/mktemp $(TMP_DIR) git_update
	rsync $(RSYNC_PREPARE_OPTS) $(LOCALDIR)/common/ $(TMP_DIR)/
	rsync $(RSYNC_PREPARE_OPTS) $(LOCALDIR)/$(PROJECT)/ $(TMP_DIR)/

prepare_diff: $(DIFF_DIR)
	rsync -z $(RSYNC_PREPARE_OPTS) $(RSYNC_COMMON_OPTS) $(RSYNC_USER)@$(SERVER):$(MASTERDIR)/ $(DIFF_DIR)/

run_diff:
	-diff $(DIFF_OPTS) $(DIFF_DIR)/ $(TMP_DIR)/

run_diff_local:
	-diff $(DIFF_OPTS) $(MASTERDIR) $(TMP_DIR)

cleanup: $(TMP_DIR)
	rm -rf $(TMP_DIR)
	rm -rf $(DIFF_DIR)

distclean:
	-rm -rf $(TMP_BASE)/cf-deploy-tmp-*

git_update: $(LOCALDIR)
	cd $(LOCALDIR) && git checkout $(BRANCH)
	-cd $(LOCALDIR) && git pull

sync_remote:
	rsync -z $(RSYNC_OPTS) $(TMP_DIR)/ $(RSYNC_USER)@$(SERVER):$(MASTERDIR)/

sync_multi:
	for SERVER in `cat $(HUB_LISTS)` ; \
	do \
		rsync -z $(RSYNC_OPTS) $(TMP_DIR)/ $(RSYNC_USER)@$$SERVER:$(MASTERDIR)/ ; \
	done

sync_local:
	rsync $(RSYNC_OPTS) $(TMP_DIR)/ $(MASTERDIR)/

syncview_remote:
	rsync -n $(RSYNC_OPTS) $(TMP_DIR)/ $(RSYNC_USER)@$(SERVER):$(MASTERDIR)/

syncview_multi:
	for SERVER in `cat $(HUB_LISTS)` ; \
	do \
		echo "on $$SERVER" ; \
		rsync -n $(RSYNC_OPTS) $(TMP_DIR)/ $(RSYNC_USER)@$$SERVER:$(MASTERDIR)/ ; \
		echo "" ; \
	done

syncview_local:
	rsync -n $(RSYNC_OPTS) $(TMP_DIR)/ $(MASTERDIR)/

not_supported:
	@echo "The action you requested is not supported"
	exit 1

/bin/mktemp:
	@echo "Your system doesn't have mktemp"
	@echo "Using this procedure without mktemp could seriously damage"
	@echo "your system, bailing out"
	exit 16