linux-kernel - Re: /proc/stat btime accuracy problem


Message-ID: <BANLkTimuPrqN1euyOqAGm2m4Ea1PdbrzDQ@mail.gmail.com>
Date:	Wed, 1 Jun 2011 18:31:37 -0600
From:	Bjorn Helgaas <bhelgaas@...gle.com>
To:	john stultz <johnstul@...ibm.com>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: /proc/stat btime accuracy problem

On Wed, Jun 1, 2011 at 5:58 PM, john stultz <johnstul@...ibm.com> wrote:
> On Wed, 2011-06-01 at 17:35 -0600, Bjorn Helgaas wrote:
>> On Wed, Jun 1, 2011 at 4:35 PM, john stultz <johnstul@...ibm.com> wrote:
>> > On Wed, 2011-06-01 at 14:50 -0600, Bjorn Helgaas wrote:
>> >> timekeeping_init() basically does the following:
>> >>
>> >>     xtime = RTC
>> >>     if (arch implements read_boot_clock())
>> >>         wall_to_monotonic = -read_boot_clock()
>> >>     else
>> >>         wall_to_monotonic = -xtime
>> >>
>> >> So wall_to_monotonic records some approximation of the system boot
>> >> time, which is then used to derive the "btime" reported in /proc/stat.
>> >>
>> >> The problem I'm seeing is that xtime is updated on timer ticks, so
>> >> uninterruptible code, like kernel serial printk, makes us miss ticks,
>> >> so xtime falls behind the RTC.
>> >
>> > Huh. So this sort of issue was common back when we had tick-based
>> > timekeeping (in combination with troubled hardware), but with the
>> > current clocksource-based timekeeping, occasional lost ticks shouldn't
>> > really affect time.
>>
>> Makes sense.  Your presentation here was a great help:
>>   http://sr71.net/~jstultz/tod/ols-presentation-final.pdf
>>
>> > Can you explain a bit more about what kind of hardware this is happening
>> > on, and what clocksource is being used?
>>
>> Sure.  This is an x86 box.  Normally we're using the TSC clocksource,
>> and I don't think the issue happens after that.  I guess my
>> experimentation so far has been with uninterruptible time before we
>> register *any* clocksource (or at least before I see any "Switching to
>> clocksource" messages).
>
> Huh.
>
> So yea, if we are very early at boot, we're likely using the jiffies
> clocksource, which is basically a software-based tick counter, which
> would be prone to lost-ticks issues if irqs were disabled for too long.
>
> Do you know if this is a relatively new issue?
>
> My first instinct is "don't do that!" to whatever driver is disabling
> irqs for so long. Do you know what's actually causing these long irq off
> periods?
>
> I assume you're noticing this offset by seeing that CLOCK_REALTIME is
> off from the RTC right after boot? How severe is this? The RTC read is
> only second granular, so there's a fair amount of error (~1 second)
> possible right at boot, so this then must be many seconds worth of lost
> ticks to be noticeable, right?

I'm using 2.6.34, so not really new.  I think the major offender is
kernel serial printk, which is done in polled mode.  A *lot* of it,
e.g., 30+ seconds' worth.  I wonder if there's some reasonably clean
way to resync with the RTC, say at the time we register a clocksource
better than jiffies, or in clocksource_done_booting(), or something.
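
To make the failure mode concrete, here is a minimal userspace sketch in C.
It is not kernel code: sim_clock, sim_run(), sim_resync() and
sim_boot_drift_ms() are hypothetical names invented for illustration, the
10 ms tick period is an assumption, and the model simply drops every tick
that falls inside an irqs-off window. It only shows that a tick-driven wall
clock falls behind the RTC by roughly the irqs-off time, and that stepping
it to the RTC once (the resync idea above) removes that offset.

```c
/* Hypothetical userspace model of tick-driven timekeeping.
 * None of these names are kernel APIs. */
#define TICK_MS 10              /* assumed tick period (HZ=100) */

struct sim_clock {
    long long rtc_ms;           /* "hardware" wall time, always advances */
    long long xtime_ms;         /* software wall time, advances on ticks only */
};

/* Advance real time by ms. Ticks are accounted only while irqs are
 * enabled; with irqs disabled they are simply lost (a simplification). */
static void sim_run(struct sim_clock *c, long long ms, int irqs_enabled)
{
    c->rtc_ms += ms;
    if (irqs_enabled)
        c->xtime_ms += (ms / TICK_MS) * TICK_MS;
}

/* How far the software clock has fallen behind the RTC. */
static long long sim_drift_ms(const struct sim_clock *c)
{
    return c->rtc_ms - c->xtime_ms;
}

/* Sketch of the proposed fix: step the software clock to the RTC once,
 * e.g. at the point where a clocksource better than jiffies shows up. */
static void sim_resync(struct sim_clock *c)
{
    c->xtime_ms = c->rtc_ms;
}

/* Boot scenario: 2 s with ticks serviced, then irq_off_ms of polled
 * serial output with irqs disabled. Returns the resulting drift. */
static long long sim_boot_drift_ms(long long irq_off_ms)
{
    struct sim_clock c = { 1000000, 1000000 };  /* xtime = RTC at init */

    sim_run(&c, 2000, 1);
    sim_run(&c, irq_off_ms, 0);
    return sim_drift_ms(&c);
}
```

With 30 seconds of irqs-off output, sim_boot_drift_ms(30000) returns 30000,
i.e. the software clock is 30 seconds behind the RTC; calling sim_resync()
afterwards brings the drift back to 0.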

Bjorn