I had a problem to solve today. I have a bunch of files in a remote rsyncd repository, which I'll call "PuppetConf", and a number of them that I want to synchronize more often than the others, which I'll call "volatile files" or simply "volatile". The problem is that volatile files, which used to reside in /volatile, now need to live in other paths as well, and I needed a clever way to synchronize them all together without resorting to long, complex command lines.
After some experimenting with export, import, and filters, I finally found a way to do that. This is my filter file:
# This should be run with:
# rsync -zav --delete --filter="merge /path/to/this/file" rsyncdserver::PuppetConf /destination/path
#
# To understand what the patterns in this file mean, see the rsync man
# page, in particular the section FILTER RULES.
# If you don't want to go through all that, please read at least the
# following excerpt.
#
# o if the pattern starts with a / then it is anchored to a
# particular spot in the hierarchy of files, otherwise it
# is matched against the end of the pathname. This is
# similar to a leading ^ in regular expressions.
# [...]
# o a '*' matches any non-empty path component (it stops at
# slashes).
#
# o use '**' to match anything, including slashes.
#
# [...]
# o if the pattern contains a / (not counting a trailing /)
# or a "**", then it is matched against the full pathname,
# including any leading directories.
# [...]
# o a trailing "dir_name/***" will match both the directory
# (as if "dir_name/" had been specified) and everything in
# the directory (as if "dir_name/**" had been specified).
# [...]
# Note that, when using the --recursive (-r) option (which is
# implied by -a), every subcomponent of every path is visited
# from the top down, so include/exclude patterns get applied
# recursively to each subcomponent's full name (e.g. to include
# "/foo/bar/baz" the subcomponents "/foo" and "/foo/bar" must not
# be excluded). [...] One solution is to ask for
# all directories in the hierarchy to be included by using a
# single rule: "+ */" (put it somewhere before the "- *" rule)
# Scan every directory, top down
+ */
# Transfer the top-level volatile directory, and everything under it
+ /volatile/***
# Transfer the top-level release file
+ /release
# Match every directory named nodes, at any depth, and transfer everything below it
+ nodes/***
# Don't copy anything else
- *
So, if I run this command I update just the volatile files. Good enough, but I still have a long command line here.
The next step was to slightly change the rsyncd configuration that Claudia had set up on the server. Having this stanza in rsyncd.conf:
[Volatile]
path = /puppet
comment = Volatile files
read only = true
transfer logging = true
filter = merge /etc/rsyncd-filter.conf
log format = %a %h %o %f %l %b
log file = /var/log/rsyncd.log
where /etc/rsyncd-filter.conf is the same filter file you see above, shortens the rsync line to this:
rsync -zav rsyncdserver::Volatile /destination/path
and I like it a lot better 🙂
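Note that I can't use the --delete option anymore, or I'd wipe out everything but the volatile files: the daemon hides the filtered-out files from the client, so rsync would treat them as missing from the source. This is not a big deal anyway, since I occasionally have to do a full sync, like: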
rsync -zav --delete rsyncdserver::PuppetConf /destination/path
It's so nice to learn new things on the go 🙂