Message-ID: <1396290426.5261.80.camel@marge.simpson.net>
Date:	Mon, 31 Mar 2014 20:27:06 +0200
From:	Mike Galbraith <umgwanakikbuti@...il.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...e.hu>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched: update_rq_clock() must skip ONE update

On Mon, 2014-03-31 at 09:13 -0700, Linus Torvalds wrote: 
> On Sun, Mar 30, 2014 at 9:20 PM, Mike Galbraith
> <umgwanakikbuti@...il.com> wrote:
> >
> > Point of being verbose was to make sure it was clear exactly how this
> > harmless little bug selectively kills large IO boxen..
> 
> My point is that if you want it to be applied hours before I make a
> release, I need to be made aware of how critical it is.

Oh, I didn't cc you because I wanted it applied instantly as ultra
critical, only because the chain of events might be of interest.

It takes a lot of cycles to add up to NMI.  Those cycles exist with or
without the throttle being fooled into picking on watchdog.  How bad can
wakeup latency get with modprobe mptsas?  So bad that you don't even
need this little bug to _further_ incapacitate the watchdog?  Can the
wakeup latency do the job all by itself?  It's wakeup latency that is
being improperly attributed to watchdog in the trace data.

(then there's "is watchdog being subject to throttle a good idea")

> The data/commentary in the commit message made *zero* sense to me in
> that regards. It was just noise.

One of my sisters says I speak Martian, she may be right.  Looks clear
to me, but then I did the tracing, condensed the output and hastily
wrote the apparently useless words.. perhaps a tad too hastily.

I haven't yet received confirmation that this is the fix, so there may
be more to it, with this being only a part.  A huge interrupt hit at the
right time with no irq accounting enabled could properly trigger the
throttle.. but it'd be difficult to reliably hit such thin targets on
multiple CPUs.

-Mike
