No step back!

Summary

We'll have a leap second insertion at the end of June this year. My tests and other evidence show that the Linux kernel in Debian Squeeze (and possibly newer kernel versions too; I didn't check the sources) doesn't handle the leap second in the best way: it steps the clock back one second around midnight between June 30th and July 1st. This is a big pain in the… grass 🙂 for time-sensitive applications (e.g. clusters) and for those that rely on a monotonic clock (usually RDBMSs; e.g. older versions of Oracle crashed badly, dunno about newer ones).

I am trying to find feasible workarounds for this, and I'll keep you posted.

Long version

We'll have a leap second insertion at the end of June this year. This means that the last minute of June 30th will have 61 seconds instead of the usual 60. This is done to compensate for the slowing of the Earth's rotation. So, on June 30th, we'll have:

23:59:58
23:59:59
23:59:60
00:00:00 (July 1st)

Good operating systems should either implement the leap second correctly by supporting second number 60, or freeze/slow down the clock between 23:59:60 and 00:00:00.

POSIX does not have provisions for the 61st second, so in this respect UNIX/POSIX systems aren't usually in the "good operating systems" category. It seems they are in good company, by the way 😦
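To make "no provisions for the 61st second" a bit more concrete, here is a small Python sketch. The year 2012 is only an assumption for illustration; the point is that a POSIX-style seconds count has no value of its own for 23:59:60, so that second ends up sharing a timestamp with the following midnight.

```python
import calendar
import datetime

# Assumption for illustration only: a leap second at the end of June 2012.
YEAR = 2012

# calendar.timegm() does plain arithmetic on the fields, so second 60 is accepted...
leap = calendar.timegm((YEAR, 6, 30, 23, 59, 60, 0, 0, 0))
midnight = calendar.timegm((YEAR, 7, 1, 0, 0, 0, 0, 0, 0))

# ...but it lands on the same count as 00:00:00 of the next day.
same_timestamp = (leap == midnight)
print(leap, midnight, same_timestamp)

# datetime, on the other hand, refuses second 60 outright.
try:
    datetime.datetime(YEAR, 6, 30, 23, 59, 60)
    datetime_accepts_60 = True
except ValueError:
    datetime_accepts_60 = False
print(datetime_accepts_60)
```

Two different instants, one timestamp: that is the ambiguity every workaround has to deal with somehow.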

Linux seems to have supported the freeze/slowdown approach at some point, but it was eventually removed. So what happens today is that the clock steps back one second around midnight.
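A toy model of what a step back looks like from the point of view of a program that keeps reading the wall clock. This is not kernel code: `stepped_clock` and its `leap_at` parameter are hypothetical names, and the step is placed at an arbitrary point just to show the effect.

```python
def stepped_clock(real_elapsed, leap_at=10.0):
    """Wall-clock reading (seconds) after `real_elapsed` real seconds.

    Hypothetical model: the clock runs normally, then is set back by
    one second once `leap_at` real seconds have passed.
    """
    if real_elapsed < leap_at:
        return real_elapsed
    return real_elapsed - 1.0


# Sample the simulated clock every half second of real time.
samples = [stepped_clock(i * 0.5) for i in range(25)]

# Differences between consecutive readings; a negative one means
# the program saw time going backwards.
deltas = [b - a for a, b in zip(samples, samples[1:])]
backwards = [d for d in deltas if d < 0]
print(backwards)
```

Any code that computes "end minus start" across that sample gets a negative duration, and any code that sorts events by timestamp gets them in the wrong order.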

This is a big pain in the grass 😛 for time-sensitive applications (e.g. clusters) and for those that rely on a monotonic clock (usually RDBMSs; e.g. older versions of Oracle crashed badly, dunno about newer ones).
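For code you control, one way to reduce the exposure is to measure intervals with a clock that is not affected by wall-clock adjustments. A minimal sketch in Python, where `timed` is a hypothetical helper and `time.monotonic()` is the standard library's monotonic clock; it does nothing for third-party software that reads the wall clock directly.

```python
import time


def timed(func, *args):
    """Run func(*args) and return (result, elapsed seconds).

    Uses time.monotonic(), which cannot go backwards and is not
    affected by system clock updates.
    """
    start = time.monotonic()
    result = func(*args)
    elapsed = time.monotonic() - start
    return result, elapsed


result, elapsed = timed(sum, range(1000))
print(result, elapsed >= 0)
```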

Many have developed their own approaches in an attempt to mitigate the bad effects of the leap second on their systems (e.g. Google), but none is trouble-free: potentially, even Google's approach could screw up.
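The general idea behind a "smear" style of mitigation is to spread the extra second over a long window instead of applying it all at once. The sketch below is a simplified linear version for illustration only, not Google's actual formula; `smeared_elapsed` and the 24-hour `window` are assumptions.

```python
def smeared_elapsed(real_elapsed, window=86400.0):
    """Seconds shown by a smeared clock after `real_elapsed` real seconds.

    Hypothetical linear smear: over `window` real seconds the clock
    advances only `window - 1` seconds, absorbing the leap second
    gradually. Outside the window it runs at normal speed.
    """
    if real_elapsed <= 0:
        return real_elapsed
    if real_elapsed >= window:
        return real_elapsed - 1.0
    return real_elapsed * (window - 1.0) / window


# One reading per hour of real time across the whole window.
readings = [smeared_elapsed(t) for t in range(0, 86401, 3600)]
never_steps_back = all(b > a for a, b in zip(readings, readings[1:]))
print(never_steps_back, readings[-1])
```

The clock never steps back, but for the whole window it disagrees slightly with anything that is not smeared in exactly the same way, which is one of the ways this kind of approach can still go wrong.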

I am now working to find an acceptable workaround that could work for us at $WORK. And yes, that may screw up as well.

So far, I have one client where I managed to avoid a step change, and which reduced its offset from ~1 second to 35 milliseconds in one hour. This may work with services hosted on a single server, but we still need an approach for clusters.
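As a back-of-the-envelope check on those numbers, the average correction rate over that hour can be worked out as below. This is only arithmetic on the figures above; it assumes nothing about how the client was configured, and the real correction was not necessarily linear.

```python
initial_offset = 1.0   # seconds ("~1 second")
final_offset = 0.035   # seconds (35 milliseconds)
duration = 3600.0      # seconds (one hour)

# Average correction rate in parts per million (microseconds per second).
average_rate_ppm = (initial_offset - final_offset) / duration * 1e6
print(round(average_rate_ppm, 1))
```

That comes out at roughly 268 ppm on average, i.e. the clock was running about 0.27 milliseconds per second off its normal rate while it caught up.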

I'll let you know how it develops.
