git: some tricks

I used RCS, CVS and SVN in the past, so I am not new to version control. GIT is a bit different from them, by the way, and is new to me, so I need to write down things before I learn them by earth. I am publishing my notes in the hope it will be useful to you, as well. Also, feel free to comment if you’d like to suggest better ways to do the same things.

I am writing things as I use them, so beware this post will still change several times in the future. I hope this doesn’t drive your RSS reader nuts…

(Updated: September 15th, 2014: revert last commit, get a branch/tag’s commit ID)
(Updated: September 7th, 2012: make a local branch remote, check out a file from another branch)
Updated: April 3rd, 2012: new repo with gitolite in the middle)
(Updated: March 14th, 2011: git remote prune)
(Updated: January 27th, 2011)
(Updated: many other times…) … Continue reading

Test dummies on sale!

DummySaleSomeone in the CFEngine community said that configuration management is a big hammer: you can manage a zillion of systems with ease. Or wreck them, with the same ease. Deploying configuration changes is a risky business. It takes responsibility. And testing, lots of testing.

The biggest shops in town have wonderful continuous integration systems, where every commit spawns a number of virtual machines where the changes are tested and won’t make it into the master branch unless they work correctly. In smaller shops most of the testing phase is done by hand, and it’s absolutely key to have the ability to destroy or create VMs in a snap, well configured and with all the necessary software already on board. We are that kind of small shop, we made that kind of system and we did it on the cheap. If you want to buy yourself one, just read on.

Continue reading

cfe: agent runs made easier

This is the third and final installment of a three-post series. In the first post, “git repository and deployment procedures for CFEngine policies”, I explained how we have structured our repository to efficiently manage many projects in one branch and share common policies across projects. In the second post, “cf-deploy: easier deployment of CFEngine policies“, I illustrated the script that we use to deploy the policies on our policy hubs and to preview the changes that would be applied to the policies themselves. In this final post I’ll illustrate yet another tool that helped us simplify the way we run the agent by hand on a node. Although one doesn’t need very often to run the agent by hand on production nodes, that is much more needed on test nodes and when debugging policies, and we do that often enough that it made sense to optimize the process. The principle is the same as the other script, cf-deploy: a makefile does the hard work, and we provide a convenient front-end to run make.

Continue reading

cf-deploy: easier deployment of CFEngine policies

Update: this article refers to the very first version of cf-deploy. For the latest release, check the github repository.


GitRepoStructureIn my latest post “git repository and deployment procedures for CFEngine policies” I explained how we structured our git repository for CFEngine policies, and how we built a deployment procedure, based on GNU make, to easily deploy different projects and branches from the same repository to the policy hubs. Please read that post if you haven’t yet, as this one is not going to make much sense without it.

The make-based deployment procedure worked pretty well and was functional, but still had annoyances. Let’s name a few:

  • the make command line was a bit long and ugly; usually it was something like:
    make -C /var/cfengine/git/common/tools/deploy deploy PROJECT=projX BRANCH=dev-projX-foo SERVER=projX-testhub
  • the Makefile was not optimized to deploy on more than one server at a time. To deploy the same files on several hubs, the only solution was to run make in a cycle several times, as in
    for SERVER in projX-hub{1..10} ; do make -C /var/cfengine/git/common/tools/deploy deploy PROJECT=projX BRANCH=dev-projX-foo SERVER=$SERVER ; done
  • deploying a project on all the policy hubs related to that project required one to remember all of the addresses/hostnames; forget one or more of them, and they would simply, hopelessly left behind.

At the same time, there were a few more people that were interested in making tiny changes to the configurations via ENC and deploy, and that long command line was a bit discouraging. All this taken together meant: I needed to add a multi-hub deployment target to the Makefile, and I needed a wrapper for the deployment process to hide that ugly command line.

For first, I added to the Makefile the functionality needed to deploy on more than one hub without having to re-create the temporary directory at every run: it would prepare the files once, deploy them as many times as needed, and then wipe the temporary directory. That was nice and, indeed, needed. But the wrapper couldn’t wait any longer, and I started working on it immediately after. That’s where cf-deploy was born.

Continue reading

Git repository and deployment procedures for CFEngine policies

This is the first installment of three, where I’ll talk about how we structured the git repository for our CFEngine policies together with the deployment policies. This first episode will be about how the repository was (badly) structured before and how we redid it (better), and it will introduce our deployment procedures based on GNU Make. The second installment will talk about how we built upon the deployment procedure and we made it easier. The third installment will be about how we greatly simplified how we manage agent runs by hand on our nodes, so that even the non-CFEngine-savvy can do the right thing with little to no knowledge of CFEngine.

Continue reading

Compact, plain-text output for tree

I needed to remove all fancy ANSI graphics and colors from the output of tree, and just print something that fit a plain text file. This did the trick:

tree -L 1 --noreport -S -n --charset=US-ASCII

E.g.:

bronto@brabham:/var/cfengine/git/common$ tree -L 1 --noreport -S -n --charset=US-ASCII
.
|-- controls
|-- libraries
|-- modules
|-- services
|-- sources
|-- templates
|-- tools
|-- unit
`-- update.cf

Tracking down memory hogs on Linux

I am working hard on my talk for FOSDEM, and don’t have much time to write detailed posts, so you’ll have to put up with me and be contempt with this “one liners” series 😉

In a memory shortage situation, it’s handy to understand which processes are memory hogs. Not only the ps command can tell you that: it can even return the process list sorted by memory occupation. At that point, getting the “top N” is a piece of cake. The following example shows the top 3 on my machine this morning, after shutting down and restarting a few of them:

bronto@brabham:~$ ps -e -o pid,user,rss,vsz,args --sort -vsz | head -n 4
  PID USER       RSS    VSZ COMMAND
27971 bronto   213896 3099640 /usr/bin/python /usr/bin/hotot
27884 bronto   44572 2306604 liferea
24470 bronto   202724 2164244 /usr/bin/gnome-shell

Hope it helps!

External node classification, the CFEngine way

CFEngineAgentExternal node classification is a Puppet functionality, where it is left to a program external to Puppet (the external node classifier, ENC) to decide which contexts apply to the node being configured. This approach is opposed to the “standard”, basic one, where the configuration applied to a node is completely defined in Puppet files (manifests), and site.pp in particular. ENC programs really show their power and usefulness where the same ruleset is used to manage a large number of servers, with many different combinations of configurations are applied. In this case, pulling the configuration information from a structured data base instead of plain files scales much better. One can write his own ENC, or use one of the several available, hiera being a well-known one.

Note: files and ENC are just two possible ways to classify nodes in Puppet; besides, classification happens on the puppetmaster, while in CFEngine all configuration decisions are taken on the node running the agent. You may read more information on node classification in puppet here, but then we’ll leave you alone and keep reading this post 🙂

Now transpose to CFEngine. Being it generally more “low-level” than Puppet, it provides no ENC mechanism out of the box, but plenty of possibilities to implement one yourself. I first checked what were the available options. I got very nice suggestions, notably one by LinkedIn’s Mike Svoboda, where they use a Yahoo! open source product called Range to store the data about the nodes, then they dump the data in JSON format, and finally they use a bash script run as a CFEngine module to raise the relevant classes. As scalable and sophisticated as it is, it was way too much than what I needed.

ENC, in my case, had two purposes: the first: allow us to scale better (and that’s a common trait in all ENC mechanisms) the second: take as much configuration information as possible out of the policies and in plain text files. This way, the access barrier for the non-CFEngine savvy people in the company would be lowered significantly. This approach is actually closer to Neil Watson’s and EvolveThinking’s CFEngine library (based on CSV files) than to the otherwise wonderful LinkedIn approach.

I sowed this information and ideas in my brain and let them sprout in the background for a couple of weeks, letting the most complex solutions drop. The last one standing was, in my opinion, the best combination of power and simplicity. We’d use plain text files, the CFEngine’s own module protocol, and extra-simple scripts: a bash script for simple external node classification, and a Perl script for hierarchical node classification. I’ll summarize the module protocol first, and then show how we leverage it to achieve ENC.

Continue reading