Flying in the night with puppet

Following the inheritance mess I talked about earlier, I had to work hard to make the structure sane. Of course, it is quite difficult to implement such a change in steps; rather, you have to change everything in a single step. Nonetheless, it is possible to go the cautious way. …
I already had a branch in my puppet repository, and it was logical to keep working on that branch. There, I prototyped the class changes semi-visually using graphviz's dot.

Once I was happy enough with the result, I started changing the class manifests, revising the structure each time I stumbled on an unexpected hurdle. At every step, I updated the dot file of the new structure.

At the end of this process, a simple diff between the dot file of the old structure and the new dot file revealed the final version of the changes. It was time to modify the node manifests to reflect the change in the class structure. This, in turn, revealed that a few more classes were not needed any more, and I purged them, getting a neater class graph 🙂
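A minimal sketch of that comparison, assuming one edge per line in each dot file; the helper name and the file names old.dot and new.dot are hypothetical:

```shell
# Print only the lines that were added (+) or removed (-) between two dot files.
# The header lines of the unified diff (---, +++) are filtered out.
dot_changes() {
    diff -u "$1" "$2" | grep -E '^[+-][^+-]' || true
}

# Example (hypothetical file names):
#   dot_changes old.dot new.dot
```

A removed edge shows up prefixed with "-", a new one with "+", which is usually enough to spot classes that lost all their edges.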

With this finished, it was time to test the new stuff in the test environment. A few more small adjustments, and we were ready for release. Uhm… were we?

OK, let's be overcautious. In a first pass over all the affected nodes, I disabled puppet on all of them (puppetd --disable, as you probably know). Then I propagated the changes to the first "distribution server", and everything went well. OK.
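That first pass can be sketched as a small helper. The function name on_all_nodes and the RUN override are assumptions made for this example; the node names are expected on standard input, one per line:

```shell
# Run a command as root on every node listed on stdin (one name per line).
# RUN defaults to "ssh -n"; set RUN=echo for a dry run that only prints
# what would be executed.
on_all_nodes() {
    while read -r NODE; do
        [ -n "$NODE" ] || continue      # skip blank lines
        ${RUN:-ssh -n} "root@$NODE" "$@"
    done
}

# Example (node list files in the current directory):
#   cat * | RUN=echo on_all_nodes puppetd --disable   # dry run
#   cat * | on_all_nodes puppetd --disable
```

The -n flag keeps ssh from swallowing the rest of the node list on stdin, and the dry run is a cheap way to double-check the list before touching anything.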

I kept changing a few nodes by hand, each time running

puppetd --enable && puppetd --test ; puppetd --disable

After testing the changes on a few nodes per class (and adding a few more minor changes), I was happy enough, and I re-enabled puppet on all nodes so that they could sync. That was yesterday, late afternoon. How could I check this morning whether all of them were actually happy with the change? Let's see!

When something goes wrong with the catalog (which is the problem I was monitoring), you'll get a message like the following one in the log:

Feb 10 14:20:59 mynode puppetd[21799]: Could not retrieve catalog; skipping run

So, did we have any yesterday? Let's see:

$ cat * | while read NODE ; do echo -n "$NODE: " ; ssh -n root@$NODE "awk '\$5~/^puppetd/ && /skipping run/' /var/log/syslog.1 | wc -l" ; done 2> /dev/null

(I am not going to cover why that 2> was needed, sorry)

As you can easily guess, I had the node names in some files in the current directory, and I am counting the occurrences of the "skipping run" string logged by puppetd. Luckily, only a few nodes showed a count other than 0. I visited those nodes individually to confirm that they were the ones that had shown some minor problems yesterday during the manual tests. A few similar checks for critical conditions showed that no problems had occurred after applying the changes.
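The filter itself can be tried on a local copy of a log before looping over ssh. In this sketch the function name is hypothetical, and it assumes the classic syslog layout in which the fifth field is the program tag (as in the sample line above):

```shell
# Count the catalog failures logged by puppetd in a syslog-style file.
# Field 5 is the "program[pid]:" tag; the END block prints 0 when
# nothing matched.
count_skipped_runs() {
    awk '$5 ~ /^puppetd/ && /skipping run/ { n++ } END { print n + 0 }' "$1"
}

# Example (path as used in the loop above):
#   count_skipped_runs /var/log/syslog.1
```

Counting inside awk avoids the padding that some wc implementations add, so the result is easy to compare against 0 in a script.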

Done! The new class hierarchy is now in production!

Oh, by the way: are any of the nodes still in a disabled state?

$ cat * | while read NODE ; do echo -n "$NODE: " ; ssh -n root@$NODE ls /var/lib/puppet/state/puppetdlock || echo ; done 2> /dev/null

No, none of them are 😉

A final word: while this iterate-over-ssh approach may work for a few nodes, it doesn't really scale. That's why I'll start looking into mcollective, capistrano and fabric soon.

:bye:

2 thoughts on “Flying in the night with puppet”

  1. Fabric doesn't work well with 10+ hosts, unless the multiprocess feature is finally released. Mcollective is really interesting for me, if you find out something more please share. It seems to me it's a PITA to setup, and it's not packaged for Lenny. Maybe Squeeze?

  2. Originally posted by cstrep:

    Fabric doesn't work well with 10+ hosts, unless the multiprocess feature is finally released.

    That's good to know, thanks!

    Originally posted by cstrep:

    Mcollective is really interesting for me, if you find out something more please share. It seems to me it's a PITA to setup, and it's not packaged for Lenny. Maybe Squeeze?

    I didn't have any time to even start my tests of these products, unfortunately. I finally finished a major time-consuming task, so I'll be on puppet (and ancillary stuff) again soon. And it doesn't seem to be in squeeze either. That will be more fun 🙂
