Date:	Wed, 20 Jan 2016 09:59:32 -0800
From:	John Stultz <john.stultz@...aro.org>
To:	Thomas Gleixner <tglx@...utronix.de>
Cc:	Jeff Merkey <linux.mdb@...il.com>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [BUG REPORT] ktime_get_ts64 causes Hard Lockup

On Wed, Jan 20, 2016 at 9:42 AM, Thomas Gleixner <tglx@...utronix.de> wrote:
> On Wed, 20 Jan 2016, John Stultz wrote:
>> Ehrm.  A more productive route in solving this might be to cap the
>> cycle delta we return from timekeeping_get_delta().
>>
>> We already do this in the CONFIG_DEBUG_TIMEKEEPING case, but adding a
>> simple check to the non-debug case should be doable w/o adding too
>> much overhead to this very hot path.
>>
>> Something like:
>> if (delta > tkr->clock->max_cycles)
>>     delta = tkr->clock->max_cycles;
>>
>> return delta;
>
> Well, you can make CONFIG_KDB select CONFIG_DEBUG_TIMEKEEPING.

True.  And turning on DEBUG_TIMEKEEPING is probably the easiest thing
for Jeff to try.

Though, there's still the same issue w/ paused VMs. Most of the design
for the timekeeping code has been that it can't properly function if
you block update_wall_time() calls, but it shouldn't kill the box.
With most clocksources, the issue is that the counter wraps and we lose
time. But in this case, with the TSC, it's the *very* large cycle delta
turning into an unexpectedly large nanosecond value.
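
To illustrate the cap discussed above, here is a rough userspace sketch
of a mult/shift cycles-to-nanoseconds conversion with and without the
clamp. It is not the kernel code: struct demo_clock and the demo_*
helpers are hypothetical names, and max_cycles here is simply assumed
to be the largest delta whose product with mult still fits in 64 bits.

```c
#include <stdint.h>

/* Hypothetical stand-in for the clocksource fields used in this sketch. */
struct demo_clock {
	uint32_t mult;       /* cycles -> ns multiplier */
	uint32_t shift;      /* cycles -> ns right shift */
	uint64_t max_cycles; /* assumed: largest delta that converts safely */
};

/*
 * Unclamped conversion: (delta * mult) >> shift.
 * For a very large delta the 64-bit product wraps, so the result
 * no longer corresponds to the elapsed time.
 */
static uint64_t demo_cyc2ns(const struct demo_clock *c, uint64_t delta)
{
	return (delta * c->mult) >> c->shift;
}

/* Same conversion, but with the delta capped at max_cycles first. */
static uint64_t demo_cyc2ns_capped(const struct demo_clock *c, uint64_t delta)
{
	if (delta > c->max_cycles)
		delta = c->max_cycles;

	return demo_cyc2ns(c, delta);
}
```

With the cap, the returned nanosecond value is bounded by what
max_cycles converts to, so time is lost after a long stall but the
caller never sees an out-of-range value.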

Hrm.. I do also wonder: the logarithmic accumulation chews through
large cycle deltas efficiently, but it does have some design limits,
so it might also hit the rails and take a while to spin accumulating
time with such large offsets.
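
As a rough illustration of that concern, the sketch below accumulates
an offset in power-of-two multiples of an interval and counts the loop
iterations. It is a simplified userspace model, not the kernel's
implementation: the demo_* names are hypothetical, and the max_shift
parameter is only an assumed stand-in for the design limits mentioned
above.

```c
#include <stdint.h>

struct demo_result {
	uint64_t intervals; /* whole intervals accumulated */
	uint64_t remainder; /* leftover offset, always < interval */
	unsigned int steps; /* loop iterations taken */
};

/* Integer log2 (floor); returns 0 for v == 0 or v == 1. */
static int demo_ilog2(uint64_t v)
{
	int n = 0;

	while (v >>= 1)
		n++;
	return n;
}

/*
 * Consume offset in chunks of (interval << shift), halving the chunk
 * whenever the remaining offset is too small for it. interval must be
 * non-zero. A small max_shift forces many more iterations.
 */
static struct demo_result demo_accumulate(uint64_t offset, uint64_t interval,
					  int max_shift)
{
	struct demo_result r = { 0, 0, 0 };
	int shift = demo_ilog2(offset) - demo_ilog2(interval);

	if (shift < 0)
		shift = 0;
	if (shift > max_shift)
		shift = max_shift;

	while (offset >= interval) {
		uint64_t chunk = interval << shift;

		if (offset >= chunk) {
			offset -= chunk;
			r.intervals += 1ULL << shift;
		}
		r.steps++;
		if (offset < chunk && shift > 0)
			shift--;
	}
	r.remainder = offset;
	return r;
}
```

In this model, an uncapped shift finishes in a number of steps roughly
proportional to log2(offset / interval), while a capped shift has to
spin through about offset / (interval << max_shift) iterations, which
is the kind of slowdown a huge offset could trigger.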

Jeff: Can you try the config option above and let me know if that
avoids the issue? And if not, can you provide some analysis of what
else is going on?

thanks
-john
