Reading one-line lists with the Bash shell

Commands like the AWS CLI may return a list of values all on one line, with the items separated by spaces. A plain read command doesn’t really work here: read will slurp all the values into the variable in one go. You need to change the delimiter that read uses to decide where each chunk of input ends. There is no need to pipe the output through Perl or other tools: read has got you covered with the -d option.

In this example I get the list of the ARNs of all target groups in an AWS account, and then iterate over those ARNs to list all the instances in each target group. The output is also saved into a file through the tee command:

aws elbv2 describe-target-groups \
  --query 'TargetGroups[].TargetGroupArn' \
  --output text | \
  while read -d ' ' ARN ; do \
    echo -n "$ARN: " ; \
    aws elbv2 describe-target-health \
      --target-group-arn "$ARN" \
      --query 'TargetHealthDescriptions[].Target.Id' \
      --output text ; sleep 1 ; \
  done | \
  tee tg-instances.txt

The output of this one-liner will be in the format:

ARN: instance_ID [instance_ID...]

Things to notice:

  • the AWS CLI’s describe-target-groups command lists the ARNs of all target groups thanks to the --query option, and prints as many as possible on a single line; the output is piped into a while loop;
  • the while loop uses read -d ' ' to split each line at spaces and save each item in the $ARN variable, one per cycle;
  • the echo command prints the value of $ARN followed by a colon and a space, but no newline, thanks to the -n option;
  • the AWS CLI’s describe-target-health command will list all target IDs thanks to the --query option and print them out in a single line; it will also provide a newline sequence, so that the next loop will start on a new line;
  • the sleep 1 command slows down the loop, so that we don’t hammer the API to the point that it rate limits us;
  • finally, the tee command will duplicate the output of the while loop to both the standard output and the file tg-instances.txt.
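
One caveat worth keeping in mind: read returns a non-zero status when it reaches the end of the input before finding the delimiter, so a plain while read loop can skip the last item of a list that doesn’t end with a space. The sketch below is a minimal, self-contained illustration of one way to handle that; it uses a made-up list instead of the AWS CLI, and the function name split_on_spaces and the sample values are my own, not part of any tool.

```shell
# Print each space-separated item of standard input on its own line.
split_on_spaces() {
  local item
  # The "|| [ -n ... ]" part keeps the loop going for a final item that
  # is followed by the end of the input rather than by a space.
  while IFS= read -r -d ' ' item || [ -n "$item" ]; do
    # The final item may carry the trailing newline of the input: strip it.
    item=${item%$'\n'}
    # Skip the empty items produced by consecutive spaces.
    [ -n "$item" ] && printf '%s\n' "$item"
  done
  return 0
}

# Hypothetical sample list, standing in for the output of a real command.
printf 'arn-one arn-two arn-three\n' | split_on_spaces
```

The same "|| [ -n ... ]" guard could be added to the while loop of the one-liner above if the last ARN turns out to be missing from the output.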

Creating and terminating test instances in AWS quickly

This is mostly a note to self. When I need an EC2 instance to run a quick test, it may be overly annoying to provision one through the web console, and it may feel like overkill to do that with a large framework like Terraform. The AWS command line is just fine, if you know which command to run with which parameters, and it pays off quickly if you often reuse the same settings for your tests (AMI, subnet, security groups…) or if during the same test session you need to scrap and rebuild test instances a few times. Here is an example of how to do so with the AWS command line client.


Rudimentary compliance report for CFEngine

CFEngine Community doesn’t come with a web GUI for compliance reports. You can get them via EvolveThinking’s Delta Reporting, but if you can’t for any reason, you need to find another way.

A poor man’s compliance report at the bundle level can be extracted via the verbose output. This is how I’ve used it to ensure that a clean-up change in the policies didn’t alter the overall behavior:

cf-agent -Kv 2>&1 | perl -lne 'm{verbose: (/.+): Aggregate compliance .+ = (\d+\.\d%)} && print "$1 ($2)"'

These are the first ten lines of output on my workstation:

bronto@brabham:~$ sudo cf-agent -Kv 2>&1 | perl -lne 'm{verbose: (/.+): Aggregate compliance .+ = (\d+\.\d%)} && print "$1 ($2)"' | head -n 10
/default/banner (100.0%)
/default/inventory_control (100.0%)
/default/inventory_autorun/methods/'proc'/default/cfe_autorun_inventory_proc (100.0%)
/default/inventory_autorun/methods/'fstab'/default/cfe_autorun_inventory_fstab (100.0%)
/default/inventory_autorun/methods/'mtab'/default/cfe_autorun_inventory_mtab (100.0%)
/default/inventory_autorun/methods/'dmidecode'/default/cfe_autorun_inventory_dmidecode (100.0%)
/default/inventory_autorun (100.0%)
/default/inventory_linux (100.0%)
/default/inventory_lsb (100.0%)
/default/services_autorun (100.0%)

Not much, but better than nothing and a starting point anyway. There is much more information in the verbose log that you can extract with something slightly more elaborate than this one-liner. Happy data mining, enjoy!
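
As one small step in that direction, the report can be narrowed down to the bundles that are not fully compliant. The following is only a sketch that works on the "path (percentage)" lines printed by the one-liner above; the function name below_full_compliance and the second sample line are invented for illustration.

```shell
# Read "bundle_path (NN.N%)" lines from standard input and keep only
# those whose compliance is not 100.0%.
below_full_compliance() {
  grep -v -F '(100.0%)'
  # grep exits with status 1 when no line is left: not an error here.
  return 0
}

# Hypothetical sample input; the second line is made up.
printf '%s\n' \
  '/default/banner (100.0%)' \
  '/default/example_bundle (87.5%)' |
  below_full_compliance
```

In real use the function would sit at the end of the pipeline, right after the perl command.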

Compact, plain-text output for tree

I needed to remove all fancy ANSI graphics and colors from the output of tree, and just print something that fits in a plain text file. This did the trick:

tree -L 1 --noreport -S -n --charset=US-ASCII

E.g.:

bronto@brabham:/var/cfengine/git/common$ tree -L 1 --noreport -S -n --charset=US-ASCII
.
|-- controls
|-- libraries
|-- modules
|-- services
|-- sources
|-- templates
|-- tools
|-- unit
`-- update.cf

Tracking down memory hogs on Linux

I am working hard on my talk for FOSDEM, and don’t have much time to write detailed posts, so you’ll have to put up with me and be content with this “one-liners” series 😉

In a memory shortage situation, it’s handy to know which processes are memory hogs. Not only can the ps command tell you that: it can even return the process list sorted by memory usage. At that point, getting the “top N” is a piece of cake. The following example shows the top 3 by virtual memory size (VSZ) on my machine this morning, after shutting down and restarting a few of them:

bronto@brabham:~$ ps -e -o pid,user,rss,vsz,args --sort -vsz | head -n 4
  PID USER       RSS    VSZ COMMAND
27971 bronto   213896 3099640 /usr/bin/python /usr/bin/hotot
27884 bronto   44572 2306604 liferea
24470 bronto   202724 2164244 /usr/bin/gnome-shell

Hope it helps!
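
If your ps doesn’t support --sort, or if you want to re-sort a listing you saved earlier by another column (RSS, for instance), sort and head can do the job. This is a minimal sketch under that assumption; top_by_column is a name I made up, and the sample rows are the ones from the listing above.

```shell
# Read a ps-style listing from standard input: print the header line
# unchanged, then the COUNT rows with the highest value in column COLUMN.
top_by_column() {
  local column="$1" count="$2" header
  {
    # The first line is the header: pass it through as it is.
    IFS= read -r header && printf '%s\n' "$header"
    # Numeric, descending sort on the requested column only.
    sort -k "${column},${column}" -rn | head -n "$count"
  }
}

# Sample rows from the listing above, this time ranked by RSS (column 3).
printf '%s\n' \
  '  PID USER       RSS    VSZ COMMAND' \
  '27971 bronto   213896 3099640 /usr/bin/python /usr/bin/hotot' \
  '27884 bronto   44572 2306604 liferea' \
  '24470 bronto   202724 2164244 /usr/bin/gnome-shell' |
  top_by_column 3 2
```

On a live system the input would come from something like ps -e -o pid,user,rss,vsz,args piped into top_by_column 3 3.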

Parsing promise_summary.log

In CFEngine's work directory there is a log file called promise_summary.log. Unsurprisingly, it contains a summary of how agent runs went in the past: how many promises were kept, how many repaired, and how many failed to be repaired. Some weeks ago a blog post on nanard.org showed how this file can be parsed to graph CFEngine's activity, and I thought it would be nice to do the same in Perl.

For those who never had a peek, the file looks like this:

bronto@murray:/var/cfengine$ tail promise_summary.log 
1367869877,1367869877: Outcome of version Community Failsafe.cf 1.0.0 (agent-0): Promises observed to be kept 95%, Promises repaired 0%, Promises not repaired 5%
1367869877,1367869905: Outcome of version MyOwnPC 1.0.16-1 (agent-0): Promises observed to be kept 97%, Promises repaired 3%, Promises not repaired 0%
1367870164,1367870164: Outcome of version Community Failsafe.cf 1.0.0 (agent-0): Promises observed to be kept 95%, Promises repaired 0%, Promises not repaired 5%
1367870164,1367870191: Outcome of version MyOwnPC 1.0.16-1 (agent-0): Promises observed to be kept 97%, Promises repaired 3%, Promises not repaired 0%
1367870450,1367870450: Outcome of version Community Failsafe.cf 1.0.0 (agent-0): Promises observed to be kept 95%, Promises repaired 0%, Promises not repaired 5%
1367870450,1367870475: Outcome of version MyOwnPC 1.0.16-1 (agent-0): Promises observed to be kept 97%, Promises repaired 3%, Promises not repaired 0%
1367870797,1367870797: Outcome of version Community Failsafe.cf 1.0.0 (agent-0): Promises observed to be kept 95%, Promises repaired 0%, Promises not repaired 5%
1367870797,1367870825: Outcome of version MyOwnPC 1.0.16-1 (agent-0): Promises observed to be kept 97%, Promises repaired 3%, Promises not repaired 0%
1367871083,1367871084: Outcome of version Community Failsafe.cf 1.0.0 (agent-0): Promises observed to be kept 95%, Promises repaired 0%, Promises not repaired 5%
1367871084,1367871111: Outcome of version MyOwnPC 1.0.16-1 (agent-0): Promises observed to be kept 97%, Promises repaired 3%, Promises not repaired 0%

The first step, which took me some ten minutes, is to parse the file and extract the relevant information. Check this one-liner:

perl -F: -alne 'next if m{failsafe.cf}i ; my ($start,$finish) = split(",",$F[0],2) ; my $duration = $finish-$start ; my ($version) = m{version (.+) \(} ; my ($kept,$rep,$notrep) = m{Promises observed to be kept (\d+)%, Promises repaired (\d+)%, Promises not repaired (\d+)%} ; print join("\t",$start,$duration,$version,$kept,$rep,$notrep)' promise_summary.log

Or, reformatted:

#!/usr/bin/perl -F: -aln

next if m{failsafe.cf}i ;

my ($start,$finish) = split(",",$F[0],2) ;
my $duration = $finish-$start ;
my ($version) = m{version (.+) \(} ;
my ($kept,$rep,$notrep) = m{Promises observed to be kept (\d+)%, Promises repaired (\d+)%, Promises not repaired (\d+)%} ;

print join("\t",$start,$duration,$version,$kept,$rep,$notrep)

This parses the file and outputs the relevant information tab-separated.

The command line explained

  • -a turns on the "autosplit" feature, where the input is split (normally at spaces) and the chunks are put in the @F array; in this specific case however, the "-F:" switch will make Perl split at colons ":";
  • -l will strip the newline from each input line and add one after each print;
  • -n will wrap the code in a while loop, reading from the file(s) given as arguments, and put each line in the $_ "default" variable;
  • -e will execute the code given on the command line.

The code explained

  • we skip the lines generated by the failsafe policy
  • we split the first field (timestamps), and calculate the difference (that is: the duration of the run)
  • we match the version string
  • we match the percentage for kept, repaired, and not repaired promises;
  • we print the bunch in tab-separated format
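
For comparison, the same steps can be sketched in plain Bash, using its built-in regular expression matching instead of Perl. This is an illustration rather than a drop-in replacement: the function name parse_promise_summary is my own, and the failsafe check only covers the "Failsafe.cf" and "failsafe.cf" spellings instead of being fully case-insensitive.

```shell
# Read promise_summary.log lines from standard input and print start time,
# duration, version, kept, repaired and not repaired, separated by tabs.
parse_promise_summary() {
  local line
  # One capture group per piece of information we want to extract.
  local re='^([0-9]+),([0-9]+): Outcome of version (.+) \(.*Promises observed to be kept ([0-9]+)%, Promises repaired ([0-9]+)%, Promises not repaired ([0-9]+)%'
  while IFS= read -r line; do
    # Skip the lines generated by the failsafe policy.
    case "$line" in *[Ff]ailsafe.cf*) continue ;; esac
    # Skip anything that does not look like a summary line.
    [[ $line =~ $re ]] || continue
    # The duration is the difference between the two timestamps.
    printf '%s\t%s\t%s\t%s\t%s\t%s\n' \
      "${BASH_REMATCH[1]}" \
      "$(( BASH_REMATCH[2] - BASH_REMATCH[1] ))" \
      "${BASH_REMATCH[3]}" "${BASH_REMATCH[4]}" \
      "${BASH_REMATCH[5]}" "${BASH_REMATCH[6]}"
  done
}

# Two lines taken from the sample log shown earlier.
parse_promise_summary <<'EOF'
1367869877,1367869877: Outcome of version Community Failsafe.cf 1.0.0 (agent-0): Promises observed to be kept 95%, Promises repaired 0%, Promises not repaired 5%
1367869877,1367869905: Outcome of version MyOwnPC 1.0.16-1 (agent-0): Promises observed to be kept 97%, Promises repaired 3%, Promises not repaired 0%
EOF
```

To process the real file, redirect it into the function: parse_promise_summary < promise_summary.log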

This is just the start, of course. E.g.: rather than tab-separated values one could print the values in a format suitable for rrdtool, and then use rrdgraph to create the graphs… But that's for later, take care for now. Ciao!

An eye on the clock

Recently I was trying to bring back to sanity a system clock that, for some reason, had suddenly slowed down to a crawl. I started to fiddle with adjtimex, and I needed a way to check how both the clock itself and ntpd reacted. In one window, I had watch ntpq -c pe -c as running. In another, there was ntpdate -q in a loop. In a third one, I wanted to monitor the changes in frequency (set by ntpd) and ticks (set by me). I found this one-liner pretty useful:

# while true ; do adjtimex -p | perl -alne '/frequency/ and $f=sprintf("%3.2f%%",100*$F[1]/32768000) ; /tick/ and $t=$F[1]-10000 ; END { print scalar(localtime),"\tf=$f\tt=$t" }' ; sleep 30 ; done

This snippet assumes that the frequency tolerance of the clock is 32768000, and that the normal value for ticks is 10000. Check the output of adjtimex -p and the man page to verify this fits your system, too.
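
If those two values are different on your system, it may be handier to pass them as parameters instead of hard-coding them. Here is a sketch of the same arithmetic done with awk inside a shell function; the name clock_summary and the sample values are made up, and unlike the one-liner above it doesn’t print a timestamp.

```shell
# Summarise "adjtimex -p"-style output read from standard input.
# Optional arguments: frequency tolerance and nominal tick value.
clock_summary() {
  local tolerance="${1:-32768000}" nominal_tick="${2:-10000}"
  awk -v tolerance="$tolerance" -v nominal="$nominal_tick" '
    # Frequency as a percentage of the tolerance.
    /frequency/ { f = sprintf("%3.2f%%", 100 * $2 / tolerance) }
    # Tick as an offset from the nominal value.
    /tick/      { t = $2 - nominal }
    END         { printf "f=%s\tt=%d\n", f, t }
  '
}

# Hypothetical sample values, not taken from a real system.
printf '%s\n' '    frequency: 3276800' '         tick: 10005' | clock_summary
```

On a real system the input would come from adjtimex -p, with the two arguments set to whatever your system reports if the defaults don’t fit.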

Detecting runaway processes

A simple one-liner I used to take a stroll through a few systems and see where I had a runaway backend process. It uses the extended output options of ps (-o); I almost never use those, so I thought it was better to make a note to myself 🙂
That sort command could be crafted better, but it did its job in this case.

$ for SERVER in x{1..6} ; do echo $SERVER ; ssh -n $SERVER 'ps -C backend -o bsdtime,pid,comm | grep -v TIME | sort -rn | head -n 3' ; done 
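
One possible refinement of that sort: with -rn alone, only the minutes at the start of each line are compared numerically, so two processes with the same number of minutes may come out in the wrong order. Assuming bsdtime prints in its usual MMM:SS format, splitting at the colon lets sort compare the minutes first and the seconds next. The function name sort_by_bsdtime and the sample lines are invented for illustration.

```shell
# Sort "MMM:SS pid command" lines by CPU time, highest first:
# minutes are compared first, seconds break the ties.
sort_by_bsdtime() {
  sort -t: -k1,1nr -k2,2nr
}

# Hypothetical sample lines in the format of "ps -o bsdtime,pid,comm".
printf '%s\n' \
  '  1:05   101 backend' \
  ' 12:30   102 backend' \
  '  1:50   103 backend' \
  ' 12:04   104 backend' |
  sort_by_bsdtime
```

In the loop above, this would mean replacing sort -rn with sort -t: -k1,1nr -k2,2nr in the remote command.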

I'd like to investigate how to manage such a situation automatically using CFEngine. Will try that sooner or later 😉