lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <AANLkTimP1F=GsnEqp-koTdy4jfaxzKpetC3Suo1Yidcm@mail.gmail.com>
Date:	Wed, 10 Nov 2010 15:56:10 -0500
From:	Andrew Lutomirski <luto@....edu>
To:	Thomas Gleixner <tglx@...utronix.de>
Cc:	Borislav Petkov <bp@...64.org>,
	Clemens Ladisch <clemens@...isch.de>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
	x86 <x86@...nel.org>,
	Jörg Rödel <joerg.roedel@....com>
Subject: Re: [FALSE ALARM] Re: HPET (?) related hangs and breakage in 2.6.35,36

On Wed, Nov 10, 2010 at 3:52 PM, Thomas Gleixner <tglx@...utronix.de> wrote:
> On Wed, 10 Nov 2010, Andrew Lutomirski wrote:
>
>> On Wed, Nov 10, 2010 at 1:50 PM, Borislav Petkov <bp@...64.org> wrote:
>> > On Wed, Nov 10, 2010 at 01:48:00PM -0500, Andrew Lutomirski wrote:
>> >> > Clocksource: tsc unstable (delta = -34355296774 ns)
>> >> > Switching: to clocksource hpet
>> >>
>> >> Please disregard -- this is a bug in nouveau (or drm) not hpet.  I'll
>> >> send a bug report to the maintainers.
>> >
>> > Interesting! Joerg was complaining about similar symptoms with .36 today
>> > too.
>>
>> Well, there is a clocksource sort-of-bug that could cause confusion:
>> when something totally unrelated to clocksources goes out to lunch,
>> the clocksource watchdog decides that the clocksource is unstable and
>> complains, steering everyone toward filing the wrong bug.
>
> How should the clocksource watchdog code know that something went to
> lunch? The fact that we need to monitor TSC at all is horrible enough,
> adding further heuristics to detect extended lunch breaks would be
> just a PITA.

Could the clocksource watchdog detect when it gets woken up (i.e. when
the hardware timer fires) instead of when it gets scheduled?  It is
internal to the timing code, after all...

Alternatively, maybe the watchdog could just compare the TSC timestamp
to the current value according to some other clock (PIT?  whatever
clockevent is in use?) instead of just using the time passed into the
delayed work in the first place.

>
> Maybe we could print a different warning when we see large negative
> deltas, which is the main indicator for the system being stuck for
> quite a time while TSC advances happily.

That would be an easy fix.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ