Too late to fix it

We installed a new datacenter right before I went on vacation, and of course we set up an NTP synchronization subnet there. As always, we configured four NTP multicast servers and made the rest clients (we are talking about several hundred servers). The NTP servers were running Debian Linux "Squeeze", while the vast majority of the clients were running Debian Linux "Lenny". For reasons I am not going to discuss here, using Squeeze instead of Lenny is not an option.

Right after the configuration was done, I noticed a really odd thing: all clients displayed a poll interval of 1024 seconds for one of the servers. This is just nonsense, as each server sends an NTP packet every 64 seconds. Anyway, the clients were in good sync so I decided I would investigate this after my vacation. And so I did. …When I came back, the situation hadn't changed.
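For readers not used to NTP poll values: ntpd handles poll intervals as powers of two, and the two numbers above are exactly its default minimum (2^6 = 64 seconds) and default maximum (2^10 = 1024 seconds). The snippet below is only a small sketch of that arithmetic; the function and constant names are made up for illustration and are not part of ntpd.

```python
# Sketch of NTP poll-interval arithmetic. All names here are hypothetical
# helpers for illustration; they are not part of ntpd or any NTP library.

NTPD_DEFAULT_MINPOLL = 6   # ntpd default minpoll exponent -> 64 s
NTPD_DEFAULT_MAXPOLL = 10  # ntpd default maxpoll exponent -> 1024 s


def poll_seconds(exponent: int) -> int:
    """Convert a non-negative NTP poll exponent (log2 seconds) to seconds."""
    if exponent < 0:
        raise ValueError("this sketch only handles non-negative exponents")
    return 1 << exponent


def poll_exponent(seconds: int) -> int:
    """Convert an interval in seconds back to its poll exponent.

    Raises ValueError if the interval is not a positive power of two.
    """
    if seconds <= 0 or seconds & (seconds - 1):
        raise ValueError(f"{seconds} is not a positive power of two")
    return seconds.bit_length() - 1


if __name__ == "__main__":
    print(poll_seconds(NTPD_DEFAULT_MINPOLL), poll_seconds(NTPD_DEFAULT_MAXPOLL))
```

So a client showing 1024 for a server that transmits every 64 seconds is sitting at the opposite end of the default range from where it should be.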

The first thing I did was to run tcpdump on both server and clients to verify that the packets were correctly sent and correctly received. And they were. We also tried enabling the built-in statistics in ntpd, specifically peerstats. Unfortunately, the problem simply disappeared as soon as we restarted ntpd. That could have been enough but… why the hell was this happening in the first place?

We decided to investigate further, and we used strace to compare what was going on on a well-behaved machine and on a badly behaved one. Comparing the two traces led to a surprising result. Despite being configured as multicast clients, the bad ones were behaving as unicast clients to one of the servers!
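For what it's worth, the two behaviours can also be told apart on the wire, because every NTP packet carries a mode field in its first byte: multicast servers send mode 5 (broadcast) packets, while a unicast exchange consists of mode 3 (client) requests answered by mode 4 (server) replies. The snippet below is a minimal sketch of decoding that header. It is not what we ran during the investigation, the function name is hypothetical, and the sample packets are hand-built for illustration, not captured data.

```python
import struct

# NTP association modes as carried in the 3-bit mode field of the header.
MODES = {
    1: "symmetric active",
    2: "symmetric passive",
    3: "client",      # unicast request
    4: "server",      # unicast reply
    5: "broadcast",   # also used for multicast
    6: "control",
    7: "private",
}


def parse_ntp_header(packet: bytes) -> dict:
    """Decode the first four bytes of an NTP packet (hypothetical helper)."""
    if len(packet) < 48:
        raise ValueError("an NTP packet is at least 48 bytes long")
    # Byte 0: leap indicator (2 bits), version (3 bits), mode (3 bits).
    # Byte 1: stratum. Byte 2: poll exponent (signed). Byte 3: precision (signed).
    first, stratum, poll, precision = struct.unpack("!BBbb", packet[:4])
    mode = first & 0x7
    return {
        "leap": first >> 6,
        "version": (first >> 3) & 0x7,
        "mode": mode,
        "mode_name": MODES.get(mode, "reserved"),
        "stratum": stratum,
        "poll_exponent": poll,
        "precision": precision,
    }


if __name__ == "__main__":
    # Hand-built examples: version 4, mode 5 (broadcast) and mode 3 (client).
    broadcast = bytes([0x25, 2, 6, 0xEC]) + bytes(44)
    client = bytes([0x23, 3, 10, 0xEC]) + bytes(44)
    print(parse_ntp_header(broadcast)["mode_name"])
    print(parse_ntp_header(client)["mode_name"])
```

Fed with the UDP payload of packets on port 123, something like this would show whether a given client is only listening to mode 5 packets or is also sending mode 3 requests of its own.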

That was definitely too much. So I wrote an email to the questions@ntp.org mailing list. I had some back-and-forth with the developers, until it abruptly came to an end: Lenny's ntpd version (4.2.4) is no longer supported, and they decided they didn't want to help us with that.

Fair enough. I am still asking myself how this guy could claim that a bug which was not identified in 4.2.4 was fixed in the stable 4.2.6 version, but I'll have to live with that. It's too late to have 4.2.4 fixed upstream.

Good luck, Lenny users.

CREDITS: Photo downloaded from here, and edited to get better colors and a fuzzy border around it.
