After one month spent on other high priority tasks it was about time to get back to the leap second lab. The fated day is coming and we need to have a strategy in place.
I spent this week running tests, tuning the scripts that support them, and improving the CFEngine policies that manage the lab today and will implement our strategy tomorrow. Besides, I structured my tests a bit better to ensure that the “false start” I had one month ago doesn’t happen again.
On Friday I finally got to run some crucial tests and the results of one of them were scary to say the least.
How does the kernel deal with the leap second?
In its normal mode of operations, ntpd gets a warning from its references that a leap second is coming up to 28 days in advance. When 23 hours or less are left to the leap second insertion, ntpd arms the leap second in the kernel, which means that it will be the kernel, and not ntpd, to take action when the time comes. That considered, the question “how does the Linux kernel handle the leap second?” is a good one to ask first. Stepping the clock back one second when it’s reaching 23:59:60 is what the Linux kernel has done historically, but there are other ways: one is to stop the clock at 23:59:59 and incrementing the time as little as possible whenever the kernel is requested the current time and until the actual time reaches 00:00:00 (see Mills’ The NTP Timescale and Leap Seconds); Google use their “leap smearing” technique; and, of course, one could support it properly by just admitting that a time like 23:59:60 can exist but that’s another story…
I ran this first experiment on one Debian Squeeze machine, one Debian Jessie and two Debian Wheezy:
- stop ntpd
- disable CFEngine to ensure that the system won’t change during the test
- set the system date to June 30th, 23:45 UTC
- set the hardware clock to the system date
- arm the leap second in the kernel using adjtimex -S 17
- check that the leap second is actually being armed
- start logging the system time and see if it steps back
This experiment confirmed across all the three distributions that Linux implements the leap second insertion by stepping the clock back of one second when it’s close to 23:59:60. So far, so good.
How does ntpd deal with the leap second (if one forces it not to step)?
The previous experiment showed how the Linux kernel handles the leap second. It was now time to look at the other side of the coin. I equipped ntpd on the same four machines with a leap second file, so ntpd knew it was due to insert a leap second, and configured it to refuse to step (using the directive tinker step 0
). That alone would have been enough to disable the kernel discipline, anyway I also added a disable kernel
directive for good measure. Besides, since I couldn’t use any external reference clock (because today is not June 30th, right?) I configured ntpd to follow the machine’s hardware clock.
Note that understanding the behaviour of ntpd with the kernel discipline disabled is important by itself: in 2012 one of the bugs that caused meltdowns on various sites was a kernel bug: putting the kernel out of the game would rule out any new kernel bugs that may be hanging in the darkest corners of the code.
This is how the test was set up:
- stop ntpd
- disable CFEngine to ensure that the system won’t change during the test
- set the system date to June 30th, 23:45 UTC
- set the hardware clock to the system date
- deploy the test configurations on the policy hub
- force a CFEngine run on the test machines while leaving the agent disabled: that will update the ntpd configuration with test parameters and start ntpd
- start logging the system time and see if it steps back
The result was again as expected: on all the test machines the clock did not step back: ntpd didn’t take the initiative (despite the fact that it knew about the leap second from the leap second files), nor did the kernel.
How does ntpd deal with the leap second (if one allows it to step)?
There was one more question to be answered. What happens if one allows ntpd to step (no tinker step 0
directive) but still leaves the kernel discipline disabled? Well, here the results were indeed surprising. And scary: the clock misbehaved on all the three distributions. Badly. The clock was stepped more than once and eventually the output of the date
command was corrupted, which doesn’t really make for good hopes. But let’s take first things first.
The structure of the experiment was exactly the same as the previous one, the only difference in configuration was that the clock was free to step and the kernel discipline was still disabled. When I looked at the logs of the system time I noticed there was something quite odd: at one point they all were cluttered with non printable characters (0x00).
When I cleaned up the output with strings
I had the impression that more than one step had happened, but with 9000-15000 lines per log it was better to leave it to a program to scan them. I wrote a very small perl script to parse the log and report about steps backwards. The results were appalling.
On squeeze:
STEP 20150630-235959.857825602 => 20150630-235958.858608798 at line 1151 STEP 20150701-000002.000473222 => 20150630-235959.000005710 at line 5365 STEP 20150630-235959.857826516 => 20150630-235958.858561638 at line 6515
The clock steps back a first time right before the leap second insertion, 1 second. Then it reaches midnight+2s and it steps back again, 3 seconds; then one more second back when approaching midnight again. Scary, and it’s not the worst case. Look at what happened in one of the Wheezy nodes:
STEP 20150630-235959.209322358 => 20150630-235958.210291287 at line 250 STEP 20150701-000001.948352999 => 20150701-000001.909922103 at line 4645 STEP 20150701-000002.000660485 => 20150630-235959.000375084 at line 4766 STEP 20150630-235959.209326034 => 20150630-235958.210291542 at line 5015
Four steps. One second before 23:59:60, then about one tenth right before 00:00:02, then three more seconds back, then one more second approaching midnight again.
Jessie was no exception:
STEP 20150630-235959.715676017 => 20150630-235958.716289939 at line 1416 STEP 20150701-000002.000233661 => 20150630-235959.000148722 at line 7731 STEP 20150630-235959.715474306 => 20150630-235958.716981612 at line 9146
One second step back right before 23:59:60, then three seconds at 00:00:02, then one more.
Take-aways
What’s to take away from this last experiment? Well, quite a lot actually:
- the configuration with the kernel discipline disabled and ntpd free to step back the clock seems not recommendable. However, more tests are needed to understand if the clock behaves when the reference clock is another NTP server instead of a local clock;
- there are conditions where the
date
command returns garbage: where does that garbage come from? It can be a bug in the command itself of course, or it can be the system that returns crap when asked the time. In the latter case, that could likely trigger bugs in all the software that doesn’t expect something like that to happen (nearly all???); - if these conditions are unavoidable, it may be desirable to keep the kernel discipline active so to be sure that the clock will step once and no more, or otherwise disable ntpd and disarm the leap second in the kernel and then force ntpd to slowly recover the offset afterwards rather than risking three/four steps back (or maybe more? who knows?) that will completely clutter the system time and confuse the system and the applications repeatedly.
I’ll run more experiments next week, be sure I’ll keep you posted.
Pingback: A humble attempt to work around the leap second, 2015 edition | A sysadmin's logbook