I had a problem to solve today. I have a bunch of files in a remote rsyncd repository, which I'll call "PuppetConf", and a number of them that I want to synchronize more often than the others, which I'll call "volatile files" or simply "volatile". The problem is that volatile files, which used to reside only in /volatile, now need to live in other paths as well, and I needed a clever way to synchronize them all together without resorting to long, complex command lines.
After some experimenting with include, exclude, and filter rules, I finally found a way to do that. This is my filter file:
# This should be run with:
#   rsync -zav --delete --filter="merge /path/to/this/file" rsyncdserver::PuppetConf /destination/path
#
# To understand what the patterns in this file mean, see the rsync man
# page, in particular the section FILTER RULES.
# If you don't want to go through all that, please read at least the
# following excerpt.
#
#   o  if the pattern starts with a / then it is anchored to a
#      particular spot in the hierarchy of files, otherwise it
#      is matched against the end of the pathname. This is
#      similar to a leading ^ in regular expressions.
# [...]
#   o  a '*' matches any non-empty path component (it stops at
#      slashes).
#
#   o  use '**' to match anything, including slashes.
#
# [...]
#   o  if the pattern contains a / (not counting a trailing /)
#      or a "**", then it is matched against the full pathname,
#      including any leading directories.
# [...]
#   o  a trailing "dir_name/***" will match both the directory
#      (as if "dir_name/" had been specified) and everything in
#      the directory (as if "dir_name/**" had been specified).
# [...]
# Note that, when using the --recursive (-r) option (which is
# implied by -a), every subcomponent of every path is visited
# from the top down, so include/exclude patterns get applied
# recursively to each subcomponent's full name (e.g. to include
# "/foo/bar/baz" the subcomponents "/foo" and "/foo/bar" must not
# be excluded). [...] One solution is to ask for
# all directories in the hierarchy to be included by using a single
# rule: "+ */" (put it somewhere before the "- *" rule)

# Scan every directory, top down
+ */

# Transfer the top-level volatile directory, and everything under it
+ /volatile/***

# Transfer the top-level release file
+ /release

# Find all those paths which end with nodes, and transfer everything below it
+ nodes/***

# Don't copy anything else
- *
So, if I run the command shown at the top of the filter file, I update just the volatile files. Good enough, but I still have a long command line here.
The next step was to slightly change the rsyncd configuration that Claudia made server side. Having this stanza in rsyncd.conf:
[Volatile]
    path = /puppet
    comment = Volatile files
    read only = true
    transfer logging = true
    filter = merge /etc/rsyncd-filter.conf
    log format = %a %h %o %f %l %b
    log file = /var/log/rsyncd.log
where /etc/rsyncd-filter.conf is the same filter file you see above, shortens the rsync line to this:
rsync -zav rsyncdserver::Volatile /destination/path
and I like it a lot better 🙂
Note that I can't use the --delete option anymore, or I'd wipe everything but the volatile files off the destination. This is not a big deal anyway, since I occasionally have to do a full sync, like:
rsync -zav --delete rsyncdserver::PuppetConf /destination/path
It's so nice to learn new things on the go 🙂