Creating and terminating test instances in AWS quickly

This is mostly a note to self. When I need an EC2 instance to run a quick test, it may be overly annoying to provision one through the web console, or it may feel a bit overkill to do that using large frameworks like terraform. Using the AWS command line is just fine, if you know what command to run with which parameters, and it pays off quickly if, to run your tests, you use the settings often (AMI, subnet, security groups…) or if during the same test session  you need to scrap and rebuild test instances a few times. Here is an example on how to do so with the AWS command line client.

Continue reading

Rudimentary compliance report for CFEngine

In CFEngine community you don’t have a web GUI with compliance report. You can get them via EvolveThinking’s Delta Reporting, but if you can’t for any reason, you need to find another way.

A poor man’s compliance report at the bundle level can be extracted via the verbose output. This is how I’ve used it to ensure that a clean-up change in the policies didn’t alter the overall behavior:

cf-agent -Kv 2>&1 | perl -lne 'm{verbose: (/.+): Aggregate compliance .+ = (\d+\.\d%)} && print "$1 ($2)"'

These are the first ten lines of output on my workstation:

bronto@brabham:~$ sudo cf-agent -Kv 2>&1 | perl -lne 'm{verbose: (/.+): Aggregate compliance .+ = (\d+\.\d%)} && print "$1 ($2)"' | head -n 10
/default/banner (100.0%)
/default/inventory_control (100.0%)
/default/inventory_autorun/methods/'proc'/default/cfe_autorun_inventory_proc (100.0%)
/default/inventory_autorun/methods/'fstab'/default/cfe_autorun_inventory_fstab (100.0%)
/default/inventory_autorun/methods/'mtab'/default/cfe_autorun_inventory_mtab (100.0%)
/default/inventory_autorun/methods/'dmidecode'/default/cfe_autorun_inventory_dmidecode (100.0%)
/default/inventory_autorun (100.0%)
/default/inventory_linux (100.0%)
/default/inventory_lsb (100.0%)
/default/services_autorun (100.0%)

Not much, but better than nothing and a starting point anyway. There is much more information in the verbose log that you can extract with something slightly more elaborated than this one-liner. Happy data mining, enjoy!

Compact, plain-text output for tree

I needed to remove all fancy ANSI graphics and colors from the output of tree, and just print something that fit a plain text file. This did the trick:

tree -L 1 --noreport -S -n --charset=US-ASCII

E.g.:

bronto@brabham:/var/cfengine/git/common$ tree -L 1 --noreport -S -n --charset=US-ASCII
.
|-- controls
|-- libraries
|-- modules
|-- services
|-- sources
|-- templates
|-- tools
|-- unit
`-- update.cf

Tracking down memory hogs on Linux

I am working hard on my talk for FOSDEM, and don’t have much time to write detailed posts, so you’ll have to put up with me and be contempt with this “one liners” series 😉

In a memory shortage situation, it’s handy to understand which processes are memory hogs. Not only the ps command can tell you that: it can even return the process list sorted by memory occupation. At that point, getting the “top N” is a piece of cake. The following example shows the top 3 on my machine this morning, after shutting down and restarting a few of them:

bronto@brabham:~$ ps -e -o pid,user,rss,vsz,args --sort -vsz | head -n 4
  PID USER       RSS    VSZ COMMAND
27971 bronto   213896 3099640 /usr/bin/python /usr/bin/hotot
27884 bronto   44572 2306604 liferea
24470 bronto   202724 2164244 /usr/bin/gnome-shell

Hope it helps!

Parsing promise_summary.log

In CFEngine's work directory there is a log file called promise_summary.log. Unsurprisingly, it contains a summary of how agent runs went in the past: how many promises were kept, how many repaired, and how many failed to be repaired. Some weeks ago a blog post on nanard.org showed how this file can be parsed to graph cfengine's activity, and I thought it could be a nice thing to do the same thing in Perl.

For those who never had a peek, the file looks like this:

bronto@murray:/var/cfengine$ tail promise_summary.log 
1367869877,1367869877: Outcome of version Community Failsafe.cf 1.0.0 (agent-0): Promises observed to be kept 95%, Promises repaired 0%, Promises not repaired 5%
1367869877,1367869905: Outcome of version MyOwnPC 1.0.16-1 (agent-0): Promises observed to be kept 97%, Promises repaired 3%, Promises not repaired 0%
1367870164,1367870164: Outcome of version Community Failsafe.cf 1.0.0 (agent-0): Promises observed to be kept 95%, Promises repaired 0%, Promises not repaired 5%
1367870164,1367870191: Outcome of version MyOwnPC 1.0.16-1 (agent-0): Promises observed to be kept 97%, Promises repaired 3%, Promises not repaired 0%
1367870450,1367870450: Outcome of version Community Failsafe.cf 1.0.0 (agent-0): Promises observed to be kept 95%, Promises repaired 0%, Promises not repaired 5%
1367870450,1367870475: Outcome of version MyOwnPC 1.0.16-1 (agent-0): Promises observed to be kept 97%, Promises repaired 3%, Promises not repaired 0%
1367870797,1367870797: Outcome of version Community Failsafe.cf 1.0.0 (agent-0): Promises observed to be kept 95%, Promises repaired 0%, Promises not repaired 5%
1367870797,1367870825: Outcome of version MyOwnPC 1.0.16-1 (agent-0): Promises observed to be kept 97%, Promises repaired 3%, Promises not repaired 0%
1367871083,1367871084: Outcome of version Community Failsafe.cf 1.0.0 (agent-0): Promises observed to be kept 95%, Promises repaired 0%, Promises not repaired 5%
1367871084,1367871111: Outcome of version MyOwnPC 1.0.16-1 (agent-0): Promises observed to be kept 97%, Promises repaired 3%, Promises not repaired 0%

The first thing, and I did it in some ten minutes, is to parse the file to extract the relevant information. Check this one-liner:

perl -F: -alne 'next if m{failsafe.cf}i ; my ($start,$finish) = split(",",$F[0],2) ; my $duration = $finish-$start ; my ($version) = m{version (.+) (} ; my ($kept,$rep,$notrep) = m{Promises observed to be kept (d+)%, Promises repaired (d+)%, Promises not repaired (d+)%} ; print join("t",$start,$duration,$version,$kept,$rep,$notrep)' promise_summary.log

Or, reformatted:

#!/usr/bin/perl -F: -aln

next if m{failsafe.cf}i ;

my ($start,$finish) = split(",",$F[0],2) ;
my $duration = $finish-$start ;
my ($version) = m{version (.+) (} ;
my ($kept,$rep,$notrep) = m{Promises observed to be kept (d+)%, Promises repaired (d+)%, Promises not repaired (d+)%} ;

print join("t",$start,$duration,$version,$kept,$rep,$notrep)

This parses the file and outputs the relevant information tab-separated.

The command line explained

  • -a turns on the "autosplit" feature, where the input is split (normally at spaces) and the chunks are put in the @F array; in this specific case however, the "-F:" switch will make Perl split at colons ":";
  • -l will make it so that newlines are added after each print;
  • -n will wrap the code around a while loop, reading from the file(s) given as argument, and put each line in the $_ "default" variable;
  • -e will execute the code given on the command line;

The code explained

  • we skip the lines generated by the failsafe policy
  • we split the first field (timestamps), and calculate the difference (that is: the duration of the run)
  • we match the version string
  • we match the percentage for kept, repaired, and not repaired promises;
  • we print the bunch in tab-separated format

This is just the start, of course. E.g.: rather than tab-separated values one could print the values in a format suitable to rrdtool, and then use rrdgraph to create the graphs… But that's for later, take care for now. Ciao!

An eye on the clock

Recently I was trying to make sane a system clock that, for some reason, suddenly slowed down to a crawl. I started to fiddle with adjtimex, and I needed a way to verify the reaction of the clock itself, and ntpd’s. On one window, I had a watch ntpq -c pe -c as running. On another, there was an ntpdate -q in a loop. In a third one, I wanted to monitor the changes in frequency (set by ntpd) and ticks (set by me). I found this one-liner pretty useful:

# while true ; do adjtimex -p | perl -alne '/frequency/ and $f=sprintf("%3.2f%%",100*$F[1]/32768000) ; /tick/ and $t=$F[1]-10000 ; END { print scalar(localtime),"\tf=$f\tt=$t" }' ; sleep 30 ; done

This snippet assumes that the frequency tolerance of the clock is 32768000, and that the normal value for ticks is 10000. Check the output of adjtimex -p and the man page to verify this fits your system, too.

Detecting runaway processes

A simple one liner I used to take a stroll on a few systems, and see where I had a runaway backend process. This uses ps's command extended options (-o); I almost never use those, so I thought it was better to make a note to myself 🙂
That sort command could be crafted better, but it did its job in this case.

$ for SERVER in x{1..6} ; do echo $SERVER ; ssh -n $SERVER 'ps -C backend -o bsdtime,pid,comm | grep -v TIME | sort -rn | head -n 3' ; done 

I'd like to investigate how to manage such a situation automatically using CFEngine. Will try that sooner or later 😉

Faking a process name


Sometimes, for testing purposes, you may need to pretend you have more processes named, e.g., ntpd than you actually do. In my case, I was testing a cfengine policy that should kill all processes named ntpd if they are more than one, and then start a new ntpd from a clean state.

Doing that in perl is very easy. In fact, not only the $0 variable holds the name of the process, but it works also the other way round: if you assign $0, it will change the process name in the system's process table.

So, to fake an ntpd process, you can create a perl script called ntpd like this one:

#!/usr/bin/perl

$0 = q{ntpd} ;
sleep 600 ;

and you run it.

This trick allowed me to test if my policy worked as expected (it did, by the way 🙂

Can you do it with a one-liner? Of course you can:

perl -e '$0="ntpd" ; sleep 600'

Enjoy!