Message-ID: <4FF28CB1.7020304@us.ibm.com>
Date:	Mon, 02 Jul 2012 23:09:53 -0700
From:	John Stultz <johnstul@...ibm.com>
To:	John Stultz <johnstul@...ibm.com>
CC:	Linux Kernel <linux-kernel@...r.kernel.org>,
	Prarit Bhargava <prarit@...hat.com>, stable@...r.kernel.org,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH 0/3][RFC] Potential fix for leapsecond caused futex issue
 (v3)

On 07/02/2012 07:16 PM, John Stultz wrote:
> NOTE: Some reports have been of a hard hang right at or before
> the leapsecond. I've not been able to reproduce or diagnose
> this, so this fix does not likely address the reported hard
> hangs (unless they end up being connected to the futex/hrtimer
> issue). Please email lkml and me if you experienced this.

As noted above, I've seen some sporadic reports of hard hangs.
Some seem connected to the hrtimer problem, where ksoftirqd seems to go
crazy and cause NMI watchdog lockups, but others are less clear.

I wanted to provide a way both to stress the kernel's leapsecond code
and to let folks test their applications' robustness in the face of
leapsecond inconsistencies.

Attached is my first attempt at such a test.

It is designed to be run on a server, where it will schedule a
leapsecond every day at midnight GMT.  So every day, while it runs, the
server will see a leapsecond.  This allows the leapsecond code, as well
as any suspected timer-related lockups that might happen when the
leapsecond is scheduled, to be stressed.
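
For anyone who wants the gist without opening the attachment: one way
to ask the kernel for a leapsecond insertion is adjtimex() with the
STA_INS status bit. The following is only a minimal sketch under that
assumption, not the code from the attachment. The helper names
fill_leap_insert_request() and schedule_leap_insert() are made up for
illustration, and the adjtimex() call needs root (CAP_SYS_TIME).

```c
#include <string.h>
#include <sys/timex.h>

/*
 * Hypothetical helper: build an adjtimex() request that sets STA_INS.
 * With STA_INS set, the kernel inserts a leap second at the end of the
 * current UTC day. Note this overwrites the other writable status bits,
 * so it may disturb a running NTP daemon.
 */
void fill_leap_insert_request(struct timex *tx)
{
	memset(tx, 0, sizeof(*tx));
	tx->modes = ADJ_STATUS;   /* only the status word is being changed */
	tx->status = STA_INS;     /* request a leap second insertion */
}

/*
 * Hypothetical helper: submit the request.
 * Returns the kernel clock state (TIME_OK, TIME_INS, ...) or -1 on error.
 */
int schedule_leap_insert(void)
{
	struct timex tx;

	fill_leap_insert_request(&tx);
	return adjtimex(&tx);
}
```

Calling schedule_leap_insert() once per day, sometime before midnight
UTC, would be enough to get a leapsecond every day in this sketch.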

The test also outputs time samples right before, during and after the 
leapsecond is applied, so you can watch it happen.
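
To illustrate what such a time sample could look like, here is a rough
sketch that reads the kernel time and leapsecond state with a read-only
adjtimex() call (modes == 0, so no privileges are needed). The function
names are hypothetical, and the output format is my own choice here,
not necessarily what the attached test prints.

```c
#include <stdio.h>
#include <string.h>
#include <sys/timex.h>

/* Map the adjtimex() return value to a readable name. */
const char *time_state_name(int state)
{
	switch (state) {
	case TIME_OK:    return "TIME_OK";
	case TIME_INS:   return "TIME_INS";   /* insertion pending */
	case TIME_DEL:   return "TIME_DEL";   /* deletion pending */
	case TIME_OOP:   return "TIME_OOP";   /* leap second in progress */
	case TIME_WAIT:  return "TIME_WAIT";  /* leap second has occurred */
	case TIME_ERROR: return "TIME_ERROR"; /* clock not synchronized */
	default:         return "UNKNOWN";
	}
}

/* Print one sample: kernel time plus the current leapsecond state. */
int print_time_sample(void)
{
	struct timex tx;
	int state;

	memset(&tx, 0, sizeof(tx));   /* modes == 0 means "just read" */
	state = adjtimex(&tx);
	if (state < 0) {
		perror("adjtimex");
		return -1;
	}
	/* tv_usec holds nanoseconds when STA_NANO is set in status */
	printf("%lld sec + %ld %s  state=%s\n",
	       (long long)tx.time.tv_sec, (long)tx.time.tv_usec,
	       (tx.status & STA_NANO) ? "nsec" : "usec",
	       time_state_name(state));
	return state;
}
```

Polling print_time_sample() a few times per second around midnight UTC
should show the state move from TIME_INS to TIME_OOP and then
TIME_WAIT as the leapsecond is applied.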

Also, since once a day is a fairly low frequency, if you pass "-s" to
the test, it will jump the system time forward to 10 seconds before
the scheduled leapsecond for that day, allowing a leapsecond to occur
every ~13 seconds. This mode may cause application disruption, as it
also causes the system to advance a day every ~13 seconds.

The test additionally will note if it observes the hrtimer early 
expiration problem that was widely seen over the weekend.

Hopefully this will provide a mechanism to test and maintain the 
kernel's correct behaviour for these rare events, as well as allowing 
folks to get more comfortable with leapsecond behaviour and test how it 
might impact their applications.

If anyone who observed a hard hang is able to use this to reproduce the 
problem, I'd greatly like to hear about it.

Build instructions are in the test file.

thanks
-john

View attachment "leap-a-day.c" of type "text/x-csrc" (5226 bytes)
