linux-kernel - Re: [PATCH] clocksource, prevent overflow in clocksource


Message-ID: <4F7CF094.5020201@us.ibm.com>
Date:	Wed, 04 Apr 2012 18:08:36 -0700
From:	John Stultz <johnstul@...ibm.com>
To:	Prarit Bhargava <prarit@...hat.com>
CC:	linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
	Salman Qazi <sqazi@...gle.com>, stable@...nel.org
Subject: Re: [PATCH] clocksource, prevent overflow in clocksource_cyc2ns

On 04/04/2012 11:33 AM, Prarit Bhargava wrote:
>> One idea might be to replace the cyc2ns w/ mult_frac in only the watchdog code.
>> I need to think on that some more (and maybe have you provide some debug output)
>> to really understand how that's solving the issue for you, but it would be able
>> to be done w/o affecting the other assumptions of the timekeeping core.
>>
> Hey John,
>
> After reading the initial part of your reply I was thinking about calling
> mult_frac() directly from the watchdog code as well.
>
> Here's some debug output I cobbled together to get an idea of how quickly the
> overflow was happening.
>
> [    5.435323] clocksource_watchdog: {0} cs tsc csfirst 227349443638728 mask
> 0xFFFFFFFFFFFFFFFF mult 797281036 shift 31
> [    5.444930] clocksource_watchdog: {0} wd hpet wdfirst 78332535 mask
> 0xFFFFFFFF mult 292935555 shift 22
>
> These, of course, are just the basic data from the clocksources tsc and hpet.

If I'm doing the math right, these are ~2.7 GHz CPUs?
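
As a rough check of that estimate: a clocksource converts cycles to nanoseconds as (cycles * mult) >> shift, so one cycle is worth mult / 2^shift ns and the counter frequency is the reciprocal. The sketch below is a userspace illustration only; the helper names ns_per_cycle() and freq_hz() are made up for this example and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Nanoseconds represented by one counter cycle: mult / 2^shift. */
double ns_per_cycle(uint32_t mult, uint32_t shift)
{
	assert(shift < 64);
	return (double)mult / (double)(1ULL << shift);
}

/* Counter frequency in Hz implied by a (mult, shift) pair. */
double freq_hz(uint32_t mult, uint32_t shift)
{
	assert(mult != 0);
	return 1e9 / ns_per_cycle(mult, shift);
}
```

With the quoted values, freq_hz(797281036, 31) comes out near 2.69 GHz for the tsc and freq_hz(292935555, 22) near 14.3 MHz for the hpet.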

So what kernel version are you using?

In trying to reproduce this locally against Linus' HEAD on a much
smaller system (single core + HT, 1.6 GHz), I got:
[    6.611366] clocksource_watchdog: {0} cs tsc csfirst 36177888648 mask 
ffffffffffffffff mult 10485747 shift 24
[    6.611596] clocksource_watchdog: {0} wd hpet wdfirst 169168400 mask 
ffffffff mult 2684354560 shift 26

Note the smaller shift values. Not too long ago the shift calculation
was adjusted to allow for longer periods between interrupts, so I
suspect you're on an older kernel.
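
To illustrate why the shift value matters here, the sketch below estimates how much elapsed time fits before the 64-bit product cycles * mult wraps. It assumes the overflow in question is the unsigned 64-bit wrap of that product; the helper names are hypothetical. Since mult is roughly ns_per_cycle * 2^shift, the horizon works out to about 2^(64 - shift) ns whatever the clock frequency.

```c
#include <assert.h>
#include <stdint.h>

/* Largest cycle delta whose product with mult still fits in 64 bits. */
uint64_t max_safe_cycles(uint32_t mult)
{
	assert(mult != 0);
	return UINT64_MAX / mult;
}

/* Elapsed time, in seconds, covered by that largest safe cycle delta. */
double overflow_horizon_seconds(uint32_t mult, uint32_t shift)
{
	assert(shift < 64);
	double ns = (double)max_safe_cycles(mult) * (double)mult
		    / (double)(1ULL << shift);
	return ns / 1e9;
}
```

For mult 797281036 with shift 31 this gives roughly 8.6 seconds, and for mult 10485747 with shift 24 roughly 1100 seconds (a bit over 18 minutes).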

Further, using your debug patch on my system, it was well beyond 10 
minutes before the debug overflow occurred.  And similarly I couldn't 
trip the watchdog trigger using sysrq-t (but again, only two threads 
here, so not nearly as much data to print as you have).
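
For reference, the mult_frac idea mentioned earlier in this thread amounts to converting the quotient and the remainder separately so that the intermediate product cannot wrap. The following is a userspace sketch under that assumption, not the patch under discussion; cyc2ns_naive() and cyc2ns_split() are hypothetical names, and the split form assumes shift <= 32.

```c
#include <assert.h>
#include <stdint.h>

/* Plain conversion: the 64-bit product can wrap for large cycle deltas. */
uint64_t cyc2ns_naive(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (cycles * mult) >> shift;
}

/*
 * Split conversion, in the style of mult_frac(cycles, mult, 1 << shift):
 * whole multiples of 2^shift and the remainder are scaled separately.
 * With shift <= 32, rem < 2^32 and mult < 2^32, so rem * mult fits in
 * 64 bits. The result can still overflow once the nanosecond value
 * itself no longer fits in 64 bits.
 */
uint64_t cyc2ns_split(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	assert(shift <= 32);
	uint64_t quot = cycles >> shift;
	uint64_t rem = cycles & ((1ULL << shift) - 1);

	return quot * mult + ((rem * mult) >> shift);
}
```

For small deltas both forms agree. For a delta of 2^35 cycles with mult 797281036 and shift 31, the naive product wraps while the split form still returns the expected nanosecond count.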

Could you verify that the issue you're seeing is still present w/
current mainline?  Please don't take this as me dismissing your
problem!  As I mentioned earlier there are some known issues w/ the
clocksource watchdog code. But I want to narrow down whether your problem
is currently present in mainline or only in older kernels, as that will
help us find the proper fix.

thanks
-john
