Date:	Thu, 27 Nov 2014 17:56:37 -0500
From:	Dave Jones <davej@...hat.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Linux Kernel <linux-kernel@...r.kernel.org>,
	the arch/x86 maintainers <x86@...nel.org>,
	Don Zickus <dzickus@...hat.com>
Subject: Re: frequent lockups in 3.18rc4

On Thu, Nov 27, 2014 at 11:17:16AM -0800, Linus Torvalds wrote:
 > On Wed, Nov 26, 2014 at 2:57 PM, Dave Jones <davej@...hat.com> wrote:
 > >
 > > So 3.17 also has this problem.
 > > Good news I guess in that it's not a regression, but damn I really didn't
 > > want to have to go digging through the mists of time to find the last 'good' point.
 > 
 > So I'm looking at the watchdog code, and it seems racy wrt parking and startup.
 > 
 > In particular, it sets the high priority *after* starting the hrtimer,
 > and it goes back to SCHED_NORMAL *before* canceling the timer.
 > 
 > Which seems completely ass-backwards. And the smp_hotplug_thread stuff
 > explicitly enables preemption around the setup/cleanup/park/unpark
 > operations.
 > 
 > However, that would be an issue only if trinity might be doing things
 > that enable and disable the watchdog. And doing so under insane loads.
 > Even then it seems unlikely.
 > 
 > The insane loads you have. But even then, could a load average of 169
 > possibly delay running a non-RT process for 22 seconds? Doubtful.
 > 
 > But just in case: do you do cpu hotplug events (that will disable and
 > re-enable the watchdog process?).  Anything else that will park/unpark
 > the hotplug thread?

That's root-only iirc, and I'm not running trinity as root, so that
shouldn't be happening. There's also no sign of such behaviour in dmesg
when the problem occurs.
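
For reference, the ordering concern in the quoted text (priority raised only
after the hrtimer starts, and dropped before the timer is cancelled) can be
sketched as a toy userspace model. This is not the kernel's watchdog code:
struct wd_model, set_rt(), set_timer() and both ordering functions are
hypothetical names invented for illustration. It only counts the steps at
which the timer would be armed while the thread sits at normal priority.

```c
#include <stdbool.h>

/* Hypothetical model, not kernel code: two flags stand in for the
 * watchdog thread's scheduling class and its hrtimer state. */
struct wd_model {
    bool rt_prio;     /* true while the thread has its high (RT) priority */
    bool timer_armed; /* true while the hrtimer is running */
    int exposed;      /* steps at which the timer was armed without RT priority */
};

/* After every state change, record whether the timer is armed
 * while the thread is still (or again) at normal priority. */
static void check(struct wd_model *m)
{
    if (m->timer_armed && !m->rt_prio)
        m->exposed++;
}

static void set_rt(struct wd_model *m, bool on)
{
    m->rt_prio = on;
    check(m);
}

static void set_timer(struct wd_model *m, bool on)
{
    m->timer_armed = on;
    check(m);
}

/* Ordering as described in the quoted message: timer first and
 * priority second on enable; priority dropped before the timer
 * is cancelled on disable. */
int ordering_as_described(void)
{
    struct wd_model m = { false, false, 0 };

    set_timer(&m, true);  /* armed at normal priority */
    set_rt(&m, true);
    set_rt(&m, false);    /* back to normal priority, timer still armed */
    set_timer(&m, false);
    return m.exposed;
}

/* Assumed alternative: raise priority before arming the timer and
 * cancel the timer before dropping priority. */
int ordering_reversed(void)
{
    struct wd_model m = { false, false, 0 };

    set_rt(&m, true);
    set_timer(&m, true);
    set_timer(&m, false);
    set_rt(&m, false);
    return m.exposed;
}
```

In this model the described ordering leaves two exposed steps (one on enable,
one on disable) and the reversed ordering leaves none. Whether such a window
could account for the lockup reports is a separate question that the model
does not address.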

 > Quite frankly, I'm just grasping for straws here, but a lot of the
 > watchdog traces really have seemed spurious...

Agreed.

Currently leaving 3.16 running. 21hrs so far.

	Dave
