Preface
A number of contrasting definitions exist today for "configuration management"; it is enough to search "what is configuration management?" in Google and skim through the results. Since different definitions lead to different interpretations of what a configuration management tool is, it's better to stop and introduce the subject before we even start talking about the tools. Besides, since "Configuration Management System" would lead to the acronym "CMS", which is now widely used for things that have nothing to do with configuration management, we'll give up on that term and talk about "configuration management tools", or CMTs.
What is a CMT?
Configuration management tools are, well, tools that aim to empower system administrators to efficiently manage large installations of systems. They generally try to achieve this goal by classifying servers and applying specific configurations to each class, so that hosts in a chosen class are configured in the same way and work in the same way[1].
A strong recommendation (read: a requirement) for these tools is that their configuration files be centralized and versioned in a version control system of some kind. The VCS is not part of the CMT, and the CMT usually doesn't impose any specific one. This is good news, since you can keep using your favourite one. By the way, in large installations you often have different people managing different parts. This may require different access rights to the CMT configuration, which may lead to the choice of a specific VCS that implements some sort of access control.
[1] More or less. We all know from experience that sharing the same configuration is not enough for two servers to work the same way. Besides, other “noise” can affect the systems (e.g.: a badly hand-configured service, outside the control of the configuration management tool, may disrupt an otherwise well-working service that is under its control).
But the most important thing comes when the human brain is involved. System administrators often tend to think of each system as a single entity that they configure by tweaking configuration files and tuning the operating system here and there. Sysadmins do have the concept of a “class of servers” (a set of servers that perform the same function, e.g.: machines x, y and z are web servers). But the class is often created by repeating the same configuration job by hand on each machine.
This is not the case with CMTs. You now manage classes, not (just) servers: you don't configure servers (be they one or one hundred), you configure a class. Going down to the level of, say, a particular service's configuration file is left to the CMT.
When you are using a CMT you should limit low-level activities that directly access operating system entities (files, devices, processes…), and leave such things to the CMT, because a) your changes may be overridden by the CMT, and you'll get the (wrong) impression that it gets in the way instead of helping you, and b) you'd go back to patching a single instance of the server instead of configuring the whole class.
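To make the shift from servers to classes concrete, here is a minimal sketch in Python. It is not the syntax of cfengine or puppet; the names CLASSES, HOSTS and desired_config, the host "w" and the package names are all assumptions made up for illustration. The point is only that the configuration is written once per class, and hosts are merely assigned to classes.

```python
# Hypothetical sketch: these names are illustrative only and do not
# reflect how cfengine or puppet express classes.

# Each class describes its desired configuration exactly once.
CLASSES = {
    "webserver": {"packages": ["httpd"], "services": ["httpd"]},
    "database": {"packages": ["postgresql"], "services": ["postgresql"]},
}

# Hosts carry no configuration of their own; they only belong to classes.
# x, y and z are the web servers from the text; "w" is an invented host
# that belongs to two classes.
HOSTS = {
    "x": ["webserver"],
    "y": ["webserver"],
    "z": ["webserver"],
    "w": ["webserver", "database"],
}


def desired_config(host):
    """Merge the configuration of every class the host belongs to."""
    merged = {"packages": [], "services": []}
    for class_name in HOSTS[host]:
        for key, values in CLASSES[class_name].items():
            for value in values:
                if value not in merged[key]:
                    merged[key].append(value)
    return merged


if __name__ == "__main__":
    for name in sorted(HOSTS):
        print(name, desired_config(name))
```

Changing the "webserver" entry in CLASSES changes x, y, z and w at once, which is exactly what repeating the job by hand on each machine cannot guarantee.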
In other words: if you are going to use a CMT, any CMT, you'll also need to change the way you think about server management. Without that change, a CMT may never fully work for you, since you'll feel it getting in your way rather than helping you.
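The "your changes may be overridden" effect can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any specific CMT is implemented: a dict stands in for a host's file system, and DESIRED and converge are hypothetical names.

```python
# Hypothetical sketch of a convergence run. A dict stands in for the
# host's file system; real tools operate on real files and services.

# Files under the tool's control, with their desired content.
DESIRED = {"/etc/motd": "Managed by the CMT\n"}


def converge(filesystem, desired):
    """Restore managed files to their desired content.

    Returns the list of paths that had to be repaired.
    """
    repaired = []
    for path, content in desired.items():
        if filesystem.get(path) != content:
            filesystem[path] = content
            repaired.append(path)
    return repaired


host = {
    "/etc/motd": "Managed by the CMT\n",
    "/etc/hosts": "127.0.0.1 localhost\n",  # not managed, never touched
}

host["/etc/motd"] = "hand-edited\n"   # a manual change on a single server
first_run = converge(host, DESIRED)   # the manual change is overridden
second_run = converge(host, DESIRED)  # already converged, nothing to repair

print(first_run, second_run)
```

The hand edit survives only until the next run, which is why a lasting change has to be made in the class definition rather than on the machine.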
The tools we'll compare, and how
The tools we'll compare are cfengine and puppet.
With its birth dating back to 1993, Cfengine is probably the oldest, best known, and most powerful tool around. It lays its foundation on the scientific theories of its creator, Mark Burgess. The current version of cfengine is 3.0.5.
Puppet is a younger project, started by Luke Kanies. It is being actively developed and has a community of enthusiasts around it. The current stable version is 0.25.5, while the next stable release will see a numbering change and become 2.6 (instead of 0.26.x).
Each tool will be analysed from three points of view: architecture (how the functionalities of the tool are organized), philosophy (the principles behind the tool) and peculiarities (things that make the tool special). This comparison will be rather high-level, and will focus on features. Rather than, say, showing examples of how to perform a specific task, we'll try to answer the following questions:
1. Can this tool handle multiple service clusters (say 10 clusters of a hundred machines each), allowing different policies and access restrictions to be applied for configuration files (per cluster, per team, per user)?
2. Can this tool handle multiple datacenters, where a single service is typically deployed in more than one datacenter?
3. Does this tool natively integrate with monitoring tools (such as Nagios, Munin…) so that changes in the CMT configuration are reflected in the monitoring tools?
4. Is it easy to maintain configuration repositories so that templates can be stored for easy modification and duplication?
5. Is configuration change history supported?
6. Is change request/approval/notification supported?
Some of these questions can be answered right away.
Questions 4 and 5 both depend on the chosen VCS. The answer to question 5 is “yes” for any VCS, and the answer to question 4 is “yes, provided that you know the chosen VCS”.
The answer to question 6 is: “No, or only partially; it depends very much on people's roles and the procedures they follow”. More on this later, along with the answers to questions 1, 2 and 3.
Once done with the analysis of both tools, I'll try to answer the questions above, and I'll talk about the things for which I didn't find an answer or a solution. This will be a separate post.
And that's all for now. I'll see you next time with cfengine.