Location detection in cfengine

My employer provided me with a laptop that I usually use when working remotely (whether on a VPN or not); more rarely, that laptop could also be connected to the company network in Oslo. Usually, when working remotely I am either at home in Oslo, or in one of my relatives' houses in Sardinia.

Depending on where I am, the laptop's configuration needs to be changed or augmented, in particular for these services:

  • ntpd
  • DNS resolver
  • software updates

Being an NTP junkie, I want my ntpd configured in the best possible way, and no: I am not going to point to some random servers in the pool that could be on the other side of the world. Whether I am in Italy or in Norway, I want to use servers from the pool that are either in the same country, or in a country nearby.

When I am on the VPN, the client decides the search list for DNS domains for me: I want that list to be enriched with all the domains I actually need.

When I am on a low bandwidth network, I want the system to refrain from automatically checking for system updates: it's much better if I do it manually when, for example, I am not going to ruin my mother's Skype calls.

For all these reasons, I decided to implement a simple location detection process in cfengine that keeps my laptop correctly configured wherever I am. Besides, since the agent runs every 5 minutes, I can be sure that the configuration changes (e.g. when I enter the VPN) will take place quickly.

…and there could be more if you think about it! E.g.: what about having specific firewall rules depending on where you are? If you are in the office, you may not care if someone tries to access your laptop via SSH as that someone could be… you 😉 But suppose you were not able to determine your current location: would you leave free access to your laptop?

What is "location detection"?
"Location detection" is a process that aims to calculate the position of an object. In our case, the object to be located is the computer running the process.

Why is that relevant? Well, in the case of a desktop computer (or of a computer that is always used in the same place), it is not. In fact, the process is relevant only in those cases where a computer can be moved and used in different places.

Don't get me wrong: I am not talking of anything sophisticated. What I am going to talk about has nothing to do with GeoIP, GPS, or anything that can help you locate something with a one-meter resolution. The problem I am trying to solve is much simpler.

A word of caution
My policies are not designed to be "general" or immediately reusable: they are designed for my laptop. For example, they assume that the network interfaces are called eth0 (wired) and wlan0 (wireless), and they are never used together. I assume the laptop is always on a private network and NAT'ed to a public IP. And they don't cover all the possible cases (it's not GeoIP, as said), only my usual locations.

My location detection process
It is a simple process indeed:

  • if we are on wifi, use the ESSID of the network to understand where we are;
  • if we are on a wired interface, get information about our public IP (if possible); retain that information for one hour, or until our private IP changes;
  • if on VPN, don't modify the location (that is: don't change the configuration as if we were at the VPN remote endpoint), but change the few bits that are needed

The last step is not actually part of the location detection process: it just ensures that the process is not fooled by the activation of a VPN tunnel.

Each of the first two steps needs a cfengine module to work. Let's see them.

#!/bin/bash

PATH="/sbin:/usr/sbin:/usr/bin:/bin"

for ESSID in $( iwgetid --raw )
do
  CESSID=$( echo "$ESSID" | tr -c "a-zA-Z0-9_" "_" | sed -e 's/_*$//' )
  CLASS="essid_${CESSID}"

  echo "+${CLASS}"
done

exit 0

This module uses the iwgetid command to get a list of the ESSIDs of the wireless networks this machine is connected to (usually one). Each network name is canonified, prefixed with the string "essid_", and finally printed with a "+" in front. Upon reading the script output, cfengine will define a class for each detected ESSID.
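
To see what the canonification does to a network name, here is a minimal sketch that isolates that step in a function. The function name essid_class and the second network name are made up for illustration; the pipeline is the same tr/sed combination used in the module.

```shell
#!/bin/bash

# Turn a raw ESSID into a class name the way the module does:
# replace every character outside [a-zA-Z0-9_] with "_", strip
# trailing underscores, and prepend "essid_".
essid_class() {
  local canonified
  canonified=$( printf '%s' "$1" | tr -c 'a-zA-Z0-9_' '_' | sed -e 's/_*$//' )
  printf 'essid_%s\n' "$canonified"
}

# The second name is hypothetical, for illustration only
essid_class "CasaMarongiu"   # prints: essid_CasaMarongiu
essid_class "wifi-sanna"     # prints: essid_wifi_sanna
```

Any character that cfengine would not accept in a class name ends up as an underscore, so two different ESSIDs can in principle map to the same class.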

With those classes defined, it is an easy task for cfengine to define other classes:

      "location_sg_casa" expression => "essid_CasaMarongiu",
        comment => "Network at my parents' house, wifi" ;

      "location_sg_sanna" expression => "essid_SannaNet",
        comment => "Network at my parents-in-law's house, wifi" ;

      "location_sg_rosybia" expression => "essid_wifi_sanna",
        comment => "Another network at my parents-in-law's house" ;

or to define variables…

    location_sg_casa::
      "location" string => "Italy/San Gavino/Casa Marongiu",
        policy => "overridable" ;

…and classes using those variables:

      "location_italy" expression => regcmp("Italy/.*","$(location)") ;
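
Since regcmp matches the regular expression against the whole string, the check above can be approximated in bash with explicit anchors. This is only a sketch to show the matching; the function name is_in_italy is made up.

```shell
#!/bin/bash

# Rough shell equivalent of regcmp("Italy/.*","$(location)"): the
# regular expression must match the whole string, hence the anchors.
is_in_italy() {
  [[ "$1" =~ ^Italy/.*$ ]]
}

is_in_italy "Italy/San Gavino/Casa Marongiu" && echo "+location_italy"
is_in_italy "Norway/Oslo/Home" || echo "not in Italy"
```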

In the end, it is easy for cfengine to tell if it knows about this location or not:

      "defined_location" expression => classmatch("location_.+") ;

…and if we are on a broadband network or not:

    defined_location::
      "is_broadband"
        or => {
                "location_oslo",
                "location_opera_oslo",
                "location_opera_generic",
              } ;

But what if we are on a wired network and NAT'ed? Our private address will tell us nothing about our location — most likely, we'll be in one of the ubiquitous 192.168.x.x private networks. We'll need to detect our external, public address to know more, and do that in a polite way. We'll use one polite module, and we'll ask cfengine to use it even more politely.

Here's the module:

#!/usr/bin/perl

use strict ;
use warnings ;

use constant DEBUG => 0 ;

use Socket ;
use LWP::UserAgent ;
use List::Util qw{shuffle} ;

my @exip_servers = map { "api-${_}01.exip.org" } qw{sth ams nyc} ;
my $ua = LWP::UserAgent->new ;

$ua->timeout(1) ;
my $ip ;

foreach my $server (shuffle @exip_servers) {
    print STDERR "Trying server $server\n" if DEBUG ;
    my $url = "http://$server/?call=ip" ;
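    # content is defined even for a failed request, so check for success first
    my $response = $ua->get($url) ;
    $ip = $response->content if $response->is_success ;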

    last if defined $ip ;
}

exit 1 if not defined $ip ;

my $name = gethostbyaddr(inet_aton($ip),AF_INET) ;
my ($second_level_domain) = ( $name =~ m{([^.]+\.[^.]+)$} ) ;

my $ip_class     = qq{external_ip_$ip} ;
my $domain_class = qq{domain_$second_level_domain} ;

$ip_class     =~ tr{.}{_} ;
$domain_class =~ tr{.}{_} ;

print qq{=external_ip=$ip\n} ;
print qq{=external_fqdn=$name\n} ;
print qq{+$ip_class\n} ;
print qq{+$domain_class\n} ;

We first ensure some strictness in the perl compiler, and we define a constant named DEBUG (how do we use it? The answer is an exercise left to the reader).

We import three modules: Socket to, e.g., provide the function inet_aton or the constant AF_INET; LWP::UserAgent, to provide an HTTP user agent; and the function shuffle from List::Util.

We then define the list @exip_servers (I could write the full names manually, but I am lazy). And, finally, we create a user agent ($ua) that will time out if it doesn't get an answer within one second. After that, we prepare an undefined variable ($ip) to hold our public address.

Then we have a foreach cycle that will iterate through the names of the three EXIP servers, but the list is shuffled first so that, on average, we distribute our calls evenly across all the three.

Inside the cycle, we try to get our IP address by calling the method get over the $ua object, and then the content method over the object returned by get. If the call is successful, $ip will hold the IP value, and the cycle will end right there and then. If the call is unsuccessful, we'll try another server.

If, when the cycle is over, $ip is still undefined, we exit with a return code of 1. If we get past that point, we look up the FQDN associated with the IP, if any, and we use a regular expression to extract the second level domain.

We are almost ready to return our output: we use our IP and second level domain to compose two class names; then, we canonify these class names by changing all the dots into underscores. Finally, we return the whole lot of information as variables (external_ip, external_fqdn) and classes.
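
To make the output format concrete, here is a bash sketch of the last part of the module only (no network calls). The function name module_output, the IP address (taken from the documentation range) and the host name are made up; the "=name=value" and "+class" lines are the ones the perl module prints.

```shell
#!/bin/bash

# Print the lines the module emits for a given public IP and FQDN:
# "=name=value" defines a variable, "+name" defines a class.
module_output() {
  local ip="$1" fqdn="$2"
  local domain ip_class domain_class

  # Keep only the last two labels of the FQDN (e.g. "lyse.net")
  domain=$( printf '%s\n' "$fqdn" | awk -F. '{ print $(NF-1) "." $NF }' )

  # Canonify the class names by turning dots into underscores
  ip_class="external_ip_${ip//./_}"
  domain_class="domain_${domain//./_}"

  printf '=external_ip=%s\n'   "$ip"
  printf '=external_fqdn=%s\n' "$fqdn"
  printf '+%s\n' "$ip_class" "$domain_class"
}

# Hypothetical values, for illustration only
module_output "192.0.2.10" "host-10.cust.lyse.net"
```

With these inputs the two classes would be external_ip_192_0_2_10 and domain_lyse_net.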

Now cfengine has some information it can use, e.g. in a classes promise:

      "location_oslo"   or => { "essid_WiFiOslo","domain_lyse_net" },
        comment => "My network at my place in Oslo" ;

However, calling this module every time the agent runs would be very impolite to exip, so we want to cache this information for at least one hour.

There are several ways to do that. For example, we could persist the final location_* class for 60 minutes, but if I change location during those 60 minutes (e.g. I move from my parents' house to my parents-in-law's) I would need to cancel the persistent class and define a new one, otherwise I would end up with two overlapping location classes. Unfortunately, cancelling classes in cfengine is a bit impractical, in that it must happen in a classes clause, or via the command line.

To avoid such headaches, I took a… well, imaginative approach. Let's see the code (purged of non-disclosable parts, of course):

bundle agent location
{
  vars:
    any::
      "internal_ip" string => canonify("$(sys.ipv4)") ;

      "location" string => "?/?/Undefined",
	policy => "overridable" ;

      "cacheable_locations"
	comment => "Locations we can cache in a module",
	slist   => {
		   "location_opera_oslo",
		   "location_italy",
		   "location_oslo",
		 } ;

      "globalizable"
	comment => "Classes to be globalized if defined",
	slist => {
		   "defined_location",
		   "is_broadband",
		   @(cacheable_locations),
		 } ;

    location_oslo::
      "location" string => "Norway/Oslo/Home",
	policy => "overridable" ;

    location_sardinia::
      "location" string => "Italy/?/?",
	policy => "overridable" ;

    location_sg_casa::
      "location" string => "Italy/San Gavino/Casa Marongiu",
	policy => "overridable" ;

    location_sg_sanna::
      "location" string => "Italy/San Gavino/Casa Sanna",
      policy => "overridable" ;

    location_sg_rosybia::
      "location" string => "Italy/San Gavino/Rosy e Bia",
      policy => "overridable" ;

    location_decimo::
      "location" string => "Italy/Decimomannu/Home",
	policy => "overridable" ;

    location_opera_oslo::
      "location" string => "Norway/Oslo/Office",
	policy => "overridable" ;

    location_opera_generic::
      "location" string => "?/?/Office",
	policy => "overridable" ;



  classes:
    net_iface_eth0::
      "cached_ip"        expression => "current_ip_$(internal_ip)" ;

    any::
      # Location-related classes
      "cached_location"
	expression => fileexists("$(site.lmodules)/cached_location.sh") ;

      "defined_location" expression => classmatch("location_.+") ;

      "location_oslo"   or => { "essid_WiFiOslo","domain_lyse_net" },
        comment => "My network at my place in Oslo" ;
      
      "location_sg_casa" expression => "essid_CasaMarongiu",
	comment => "Network at my parents' house, wifi" ;

      "location_sg_sanna" expression => "essid_SannaNet",
	comment => "Network at my parents-in-law's house, wifi" ;

      "location_sg_rosybia" expression => "essid_wifi_sanna",
	comment => "Another network at my parents-in-law's house" ;

      "location_sardinia" expression => "domain_tiscali_it",
        comment => "Probably in Sardinia, somewhere" ;

      "location_decimo" expression => "essid_WiFiBronto",
        comment => "My network at my place in Decimomannu (phasing out)" ;

      "location_opera_oslo"
	or => {
		"essid_opera_guest",
		"sysadmin_net",
		"guest_net_oslo",
		"guest_net_other",
	      },
        comment => "My network in the office" ;

      "location_opera_generic" expression => "domain_opera_com",
	comment => "In Opera, somewhere..." ;

      "location_italy" expression => regcmp("Italy/.*","$(location)") ;

      # Should we bother exip to find out about our location?
      "detect_external_ip"
        expression => "net_iface_eth0.!cached_ip" ;

      # When should we use a cached location: when:
      # * we are on a wired interface, and
      # * we don't have to detect an external IP, and
      # * a location is not yet defined
      #
      # Note also that the following expression:
      #
      # net_iface_eth0.!detect_external_ip.!defined_location
      #
      # actually expands to:
      # net_iface_eth0.!(net_iface_eth0.!cached_ip).!defined_location =>
      # net_iface_eth0.(!net_iface_eth0|cached_ip).!defined_location =>
      # net_iface_eth0.cached_ip.!defined_location
      #
      # However, we keep the first expression as it is more... expressive ;)
      
      "use_cached_ip"
	expression => "net_iface_eth0.!detect_external_ip.!defined_location" ;

      # If we are on wifi, we can use the ESSID to understand where we
      # are, we don't need to cache our internal IP and do external IP
      # detection. We cache the internal IP only when we are on the
      # wired interface
    net_iface_eth0::
      "current_ip_$(internal_ip)"
	ifvarclass  => "!cached_ip",
	expression  => "any",
	persistence => "60" ;

    defined_location::
      "is_broadband"
	or => {
		"location_oslo",
		"location_opera_oslo",
		"location_opera_generic",
	      } ;

  files:
    any::
      "$(site.lmodules)/essid.sh"
      perms => mog("0755","root","root") ;

      "$(site.lmodules)/location_via_exip.pl"
      perms => mog("0755","root","root") ;

    detect_external_ip.defined_location::
      "$(site.lmodules)/cached_location.sh"
	perms         => mog("0755","root","root"),
	edit_defaults => empty,
	edit_line     => cache_location,
	create        => "true" ;

    detect_external_ip.!defined_location::
      "$(site.lmodules)/cached_location.sh"
	comment => "Remove stale location information",
	delete  => tidy ;
      

  packages:
      "libwww-perl"
	comment        => "Install libwww-perl to be used in a module",
	package_policy => "add",
	classes        => if_ok("libwww_installed") ;

  methods:
      "globalize_$(globalizable)"
	usebundle  => globalize_class("$(globalizable)"),
	ifvarclass => "$(globalizable)" ;


  commands:
    net_iface_wlan0::
      "$(site.lmodules)/essid.sh"
	module => "true" ;

    libwww_installed.detect_external_ip::
      "$(site.lmodules)/location_via_exip.pl"
	module => "true" ;

    use_cached_ip.cached_location::
      "$(site.lmodules)/cached_location.sh"
	module => "true" ;

  reports:
    detect_external_ip.report_always::
      "Detected public IP: $(location_via_exip_pl.external_ip)" ;
      "Detected public FQDN: $(location_via_exip_pl.external_fqdn)" ;

    report_verbose::
      "Detected location: $(location)" ;

}


bundle edit_line cache_location
{
  insert_lines:
      "#!/bin/bash" ;
      "" ;
      "echo \"+$(location.cacheable_locations)\""
	ifvarclass => "$(location.cacheable_locations)" ;
}

Let's go through the three passes the agent does on this code when it is run for the very first time. The case when we are on a wifi connection is trivial (we run the module, detect the ESSID, and set the classes accordingly), so we'll concentrate on the case when we are on a wired connection, and our IP is 192.168.10.101.

First pass
We start with vars promises, defining a number of variables. internal_ip will hold the canonified version of our current IP (192_168_10_101); location will hold a default value; cacheable_locations holds the class names we need to cache when they are defined; globalizable is a list of classes that we can make global using a little trick we'll see at the end (just trust me for now). Then comes a series of promises to set the location variable, depending on where we are: at the first pass they are all skipped.

Then come the classes promises. The cached_ip class is defined only if the current_ip_192_168_10_101 class is defined: since, at the moment, it is not set, cached_ip is also not set. The cached_location class won't be defined either, as the cached_location.sh script does not exist yet. defined_location is not set either: no class starting with location_ is set, nor will it be at this pass. detect_external_ip is set, as we are on eth0 and cached_ip is not set. use_cached_ip is false, since it's a chain of logical ANDs, and !detect_external_ip is false. The next promise will set the current_ip_192_168_10_101 class, and will persist it for 60 minutes, and it will be the last classes promise to be evaluated at this pass.
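
The two class expressions are plain boolean logic, so they can be checked outside cfengine. The sketch below is not part of the policy: it re-implements the expressions as bash functions (all names are made up) and compares the full use_cached_ip expression with the simplified one from the comment in the bundle, over all eight combinations of inputs.

```shell
#!/bin/bash

# Arguments are 1 (class set) or 0 (class not set).

# detect_external_ip: net_iface_eth0.!cached_ip
detect_external_ip() {        # args: eth0 cached_ip
  (( $1 && ! $2 )) && echo 1 || echo 0
}

# use_cached_ip: net_iface_eth0.!detect_external_ip.!defined_location
use_cached_ip() {             # args: eth0 cached_ip defined_location
  local detect
  detect=$( detect_external_ip "$1" "$2" )
  (( $1 && ! detect && ! $3 )) && echo 1 || echo 0
}

# Simplified form: net_iface_eth0.cached_ip.!defined_location
use_cached_ip_simplified() {  # args: eth0 cached_ip defined_location
  (( $1 && $2 && ! $3 )) && echo 1 || echo 0
}

# Print both forms side by side for every combination of inputs
for eth0 in 0 1; do
  for cached in 0 1; do
    for located in 0 1; do
      full=$( use_cached_ip "$eth0" "$cached" "$located" )
      short=$( use_cached_ip_simplified "$eth0" "$cached" "$located" )
      echo "eth0=$eth0 cached_ip=$cached defined_location=$located -> $full $short"
    done
  done
done
```

The two columns agree on every line. The state of this first pass (eth0 set, cached_ip and defined_location not set) gives detect_external_ip set and use_cached_ip not set, as described above.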

Then come the files promises. The first two ensure that the two modules presented above are executable; the first promise on cached_location.sh is not evaluated because defined_location is not set. The following one is, however, and ensures that the cached_location.sh module is not present.

The only packages promise will ensure the libwww-perl package is installed — we need it because our perl module will use it.

None of the methods promises is evaluated, as none of the classes in @(globalizable) is set.

Of all the commands promises, only the second is evaluated. The module location_via_exip.pl is executed, to set two variables and two classes; in particular, let's suppose the class domain_lyse_net is set by this module.

The reports promises will write the information we retrieved via location_via_exip.pl, and this will end the first pass.

Second pass
Let's go through the second pass, and see what changed. For the vars promises, everything stays the same. For the classes promises we have a number of changes instead. First: cached_ip is now defined; then, a few promises below, the class location_oslo is also defined; the class detect_external_ip is now undefined, because cached_ip is defined; use_cached_ip stays undefined, as defined_location is not defined yet. The rest doesn't change.

Nothing changes for files, packages, methods, and reports promises, and the second pass ends here.

Third pass
Things change, finally, at the third pass. First and foremost, notice that one of the classes in cacheable_locations, and hence in globalizable, is defined (location_oslo): keep that in mind. Since location_oslo is defined, the variable location changes its value to "Norway/Oslo/Home". Nothing more happens here, and we go through classes promises.

Something changes here, too: now defined_location is defined (because location_oslo is defined); use_cached_ip is also set, and so is is_broadband. No other classes promises change, and we step into files promises.

In fact, we have a change in files promises, too: now the module cached_location.sh is created, using the cache_location bundle. Let's examine it.

The cache_location bundle is very simple indeed. It creates a bash script, in which it inserts a line

echo +class_name

for each defined class in the list cacheable_locations. In this case, the resulting script will be:

#!/bin/bash

echo "+location_oslo"
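
For comparison, this is roughly what the edit_line bundle does, written as a bash function. It is only a sketch: the function name render_cached_location is made up, and the caller is assumed to pass only the cacheable classes that are currently defined.

```shell
#!/bin/bash

# Print a module script that re-defines each class passed as an argument.
render_cached_location() {
  local class
  printf '#!/bin/bash\n\n'
  for class in "$@"; do
    printf 'echo "+%s"\n' "$class"
  done
}

render_cached_location location_oslo
```

Called with location_oslo alone, it prints the same three lines shown above.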

Back to the files promises: no others will be executed. packages promises are already verified and won't be executed again. But something changes in methods. In fact, since defined_location, is_broadband, and location_oslo are defined, the bundle globalize_class will be executed for each of these values, and that will make the three classes global. How does it happen? A bit more patience, please, let's finish this.

commands promises won't be executed again either. In fact, the only place where we had a change is the class use_cached_ip, but cached_location is still false. The same goes for reports promises, and there we are.

Next run
In this first run, we were able to detect our location and to use it. However, what happens in the following run (about 5 minutes later)? Let's take a quick stroll:

First pass: cached_ip, cached_location, and use_cached_ip are defined. Because of that, the cached_location.sh module (created in the previous run) is executed, and defines the location_oslo class! From there, everything goes pretty much like in the previous agent run, but notice that we have detected our location without calling EXIP services this time! How long will this last? Well, as long as the detect_external_ip class is unset, that is: as long as we are on a wired interface and the class cached_ip is true. In turn, cached_ip will become undefined either when we change our internal address, or when the persistence of current_ip_192_168_10_101 expires, that is: after 60 minutes. As promised, we won't try to call EXIP services for at least one hour!
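
The rule that keeps the cache alive can be restated without persistent classes. The sketch below is not how the policy implements it (cfengine does the timekeeping through the class persistence); it just spells out the two conditions with plain arguments. The function name cache_is_valid and its parameters are made up.

```shell
#!/bin/bash

# Decide whether a cached location may still be used:
#   $1 internal IP when the cache was written
#   $2 current internal IP
#   $3 age of the cache, in minutes
#   $4 maximum age in minutes (defaults to 60, as in the policy)
cache_is_valid() {
  local cached_ip="$1" current_ip="$2" age="$3" max_age="${4:-60}"
  [[ "$cached_ip" == "$current_ip" ]] && (( age < max_age ))
}

if cache_is_valid "192.168.10.101" "192.168.10.101" 5; then
  echo "use cached location"
else
  echo "ask for the external IP again"
fi
```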

The globalize_class bundle
And now, as promised, the globalize_class bundle:

bundle common globalize_class(class)
{
  classes:
      "$(class)" expression => "any" ;
}

Why do we use it? Because we need to make the location information available to all the other bundles, and the only way (I could see) is to make the classes global. This bundle has a side effect though: since it is a common bundle, it is parsed right at the beginning of the run, which in turn will define a class named… __class_. But I can live with that.
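
Where does the name __class_ come from? My reading, and it is an assumption, is that the unexpanded string $(class) gets canonified like any other class name, with every character that is not a letter, a digit or an underscore replaced by "_". A bash sketch of that substitution (the function name canonify is borrowed from the cfengine function, but this is just tr):

```shell
#!/bin/bash

# Replace every character that is not a letter, a digit, or an
# underscore with "_".
canonify() {
  printf '%s' "$1" | tr -c 'a-zA-Z0-9_' '_'
  echo
}

canonify '$(class)'          # prints: __class_
canonify '192.168.10.101'    # prints: 192_168_10_101
```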

Have fun!
