Message-ID: <3ae3aa420901070742t8639479qe52cdb615bf46237@mail.gmail.com>
Date: Wed, 7 Jan 2009 09:42:53 -0600
From: "Linas Vepstas" <linasvepstas@...il.com>
To: mayer@....isc.org
Cc: david@...g.hm, "Robert Hancock" <hancockr@...w.ca>,
"Ben Goodger" <goodgerster@...il.com>,
"Kyle Moffett" <kyle@...fetthome.net>,
MentalMooMan <slashdot@...eshallam.info>,
"David Newall" <davidn@...idnewall.com>,
linux-kernel@...r.kernel.org, ntpwg@...ts.ntp.isc.org,
"Travis Crump" <pretzalz@...hhouse.org>, burdell@...ntheinter.net,
"Nick Andrew" <nick@...k-andrew.net>,
"Jeffrey J. Kosowsky" <jeff@...owsky.org>
Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009
Thanks for the reply.
2009/1/7 Danny Mayer <mayer@....isc.org>:
> Linas Vepstas wrote:
>> 2009/1/6 Danny Mayer <mayer@....isc.org>:
>>> Why don't you tell us what the real problem is instead of telling us
>>> that you need TAI offset information?
>>
>> Currently, the Linux kernel keeps time in UTC. This means
>> that it must take special actions to tick twice when a leap
>> second comes by. Due to a (stupid) bug, some fraction
>> of linux systems crashed; this includes everything from
>> laptops to servers, to DVR's, to cell phones and cell
>> phone towers. There's now a fix for this.
>>
>> However, during the discussion, the idea came out that
>> maybe keeping UTC time in the kernel is just plain stupid.
>> So there's this idea floating around that maybe the kernel
>> should keep TAI time instead. The hope is that this will
>> reduce the complexity in the kernel, and push it out to
>> user space, "where it belongs" (to repeat a well-worn
>> mantra).
>>
>> However, *if* we were to kick UTC out of the kernel,
>> and push it to user-land, then, of course, there's a
>> different problem: how does the kernel know what the
>> correct TAI time is? As your reply makes abundantly
>> clear, NTP is not a good source for TAI information.
[...]
>> a discussion of a particular issue
>> that would arise if the kernel were to keep TAI -- if it did,
>> then user-space systems would need to have a reliable
>> source for leap-seconds. Since NTP does not
>> provide this, there was discussion about how that
>> could be worked around. This then led to the comment
>> that, "gee, wouldn't the right long-term solution be that
>> NTP provide TAI info?"
>
> NTP can provide leap-second information via an autokey protocol request,
> see Section 10.6 Leapseconds Values Message (LEAP)
> http://www.ietf.org/internet-drafts/draft-ietf-ntp-autokey-04.txt but
Yes, that looks like exactly what would be wanted. It would be nice
if such a message were available in the regular, non-encrypted protocol.
> that means you need to have autokey set up with another NTP server and
> that means adding infrastructure that you probably don't want and are
> not prepared to handle.
Heh. Yes, well, I still haven't figured out how to secure DNS. Yet clearly
this whole security mess must march on, and somehow the security
infrastructure must eventually become easy to install.
>> Clearly, it would be a lot of work to get the kernel to keep
>> TAI instead of UTC, so this is not, at this time, a "serious
>> proposal". But if it were possible, and all the various
>> little issues that result were solvable, then it does seem
>> like a better long-term solution.
>>
>
> This is a *lot* more complicated than you might think. If you are
> thinking of implementing this similarly to the way timezone information
> is added for display purposes, you need the whole list of leap seconds
> and when the change happened since you now have to look at a timestamp
> and see when it was and then apply all of the leapseconds up to that
> point in time and none of the leapseconds beyond that. In addition, you
> have legacy files that have UTC timestamps on them so you would need to
> distinguish between UTC (legacy) and TAI timestamps in the file system
> among other places (anywhere where a timestamp exists) and what would
> you do about database tables which contain timestamps? The list goes on.
Yes.
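To make the lookup described above concrete, here is a minimal sketch of
"apply all of the leap seconds up to that point and none beyond it". The
table is only an excerpt (three entries of the published TAI-UTC history),
and the names LEAP_TABLE and tai_minus_utc are made up for illustration;
this is not how any existing kernel or NTP code does it.

```python
from bisect import bisect_right
from datetime import datetime, timezone

# Excerpt of the TAI-UTC history: (UTC instant the offset takes effect,
# TAI-UTC in seconds). A real implementation would need the full list
# and a way to keep it up to date.
LEAP_TABLE = [
    (datetime(1999, 1, 1, tzinfo=timezone.utc), 32),
    (datetime(2006, 1, 1, tzinfo=timezone.utc), 33),
    (datetime(2009, 1, 1, tzinfo=timezone.utc), 34),
]
_STARTS = [start for start, _ in LEAP_TABLE]


def tai_minus_utc(utc_stamp):
    """Return TAI-UTC (seconds) in effect at the given UTC timestamp."""
    # Find the last table entry whose start is <= utc_stamp, so that only
    # leap seconds up to that point in time are applied.
    i = bisect_right(_STARTS, utc_stamp) - 1
    if i < 0:
        raise ValueError("timestamp predates this table excerpt")
    return LEAP_TABLE[i][1]
```

A timestamp from the last second of 2008 maps to an offset of 33, while
one from 2009-01-01 00:00:00 UTC maps to 34. The sketch does nothing about
the harder part of the objection: telling legacy UTC timestamps apart from
TAI ones.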
> I'd much rather you spend the time tackling the clock interrupt losses
> that many of our Linux users complain about. See:
> https://support.ntp.org/bin/view/Support/KnownOsIssues#Section_9.2.4.
> for some of the gorier details. I'm sure you don't really want us
> recommending that they set HZ=100 in the kernel to alleviate the problem.
Actually, this is rather sorely lacking in 'gory details'; rather, it's
a complaint that 'things don't work' with no discussion of the actual
problem. It would be much better if there were a link to any previous
discussions on LKML on this issue.
My knee-jerk reaction on reading about the lost-interrupts issue is that,
yes, setting HZ=100 and disabling ACPI is indeed a decent short-term
work-around (APIC is something completely different and not something
you can disable). The correct long-term solution would be to use real-time
kernels, which are designed to make sure that things like lost interrupts
never happen.
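As a rough illustration of why a lower HZ would help, here is a toy model,
not a description of the actual kernel timer code. It assumes that the
clock advances by one tick per serviced timer interrupt, that a pending
timer interrupt is latched only once, and that interrupts become blocked
just after a tick has been handled. The function names are made up.

```python
def lost_ticks(hz, blocked_us):
    """Ticks lost while interrupts are blocked for blocked_us microseconds.

    Toy model: every tick that fires during the blackout collapses into a
    single pending interrupt, so all but one of them are lost.
    """
    period_us = 1_000_000 // hz       # tick period in microseconds
    fired = blocked_us // period_us   # ticks that fired during the blackout
    return max(0, fired - 1)


def clock_lag_us(hz, blocked_us):
    """How far the tick-driven clock falls behind, in microseconds."""
    return lost_ticks(hz, blocked_us) * (1_000_000 // hz)
```

Under these assumptions, a 5 ms blackout loses 4 ticks (4 ms) at HZ=1000
but nothing at HZ=100, since the 10 ms tick period is longer than the
blackout.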
I have no idea what the status of real-time Linux is, whether it would now
have guarantees for timer ticks, and whether anything there would now
be mergeable into the mainline kernel.
--linas