Message-ID: <1425670188.2475.113.camel@j-VirtualBox>
Date:	Fri, 06 Mar 2015 11:29:48 -0800
From:	Jason Low <jason.low2@...com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Davidlohr Bueso <dave@...olabs.net>,
	Ingo Molnar <mingo@...nel.org>,
	Sasha Levin <sasha.levin@...cle.com>,
	Peter Zijlstra <peterz@...radead.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Dave Jones <davej@...emonkey.org.uk>, jason.low2@...com
Subject: Re: sched: softlockups in multi_cpu_stop

On Fri, 2015-03-06 at 11:05 -0800, Linus Torvalds wrote:
> On Fri, Mar 6, 2015 at 10:57 AM, Jason Low <jason.low2@...com> wrote:
> >
> > Right, the can_spin_on_owner() was originally added to the mutex
> > spinning code for optimization purposes, particularly so that we can
> > avoid adding the spinner to the OSQ only to find that it doesn't need to
> > spin. Whether this function returns a correct value should really only
> > affect performance, so yes, lockups due to this seem surprising.
> 
> Well, softlockups aren't about "correct behavior". They are about
> certain things not happening in a timely manner.
> 
> Clearly the mutex code now tries to hold on to the CPU too aggressively.
> 
> At some point people need to admit that busy-looping isn't always a
> good idea. Especially if
> 
>  (a) we could idle the core instead
> 
>  (b) the tuning has been done based on some special-purpose benchmark
> that is likely not realistic
> 
>  (c) we get reports from people that it causes problems.
> 
> In other words: Let's just undo that excessive busy-looping. The
> performance numbers were dubious to begin with. Real scalability comes
> from fixing the locking, not from trying to play games with the locks
> themselves. Particularly games that then cause problems.

Hi Linus,

Agreed, this is an issue we need to address. We're still trying to
figure out whether the change to rwsem_can_spin_on_owner() in commit
37e9562453b is really the one causing it.

For example, it looks like Ming recently found that another change in
the same patchset, commit b3fd4f03ca0b995 ("locking/rwsem: Avoid
deceiving lock spinners"), is causing lockups.

https://lkml.org/lkml/2015/3/6/521
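
For readers unfamiliar with the pre-check described in the quoted text, here is a minimal user-space sketch of the idea: decide whether spinning is worthwhile before queueing as a spinner. It is not the kernel code. The struct names, the sketch_can_spin_on_owner() helper, and the need_resched parameter are all hypothetical stand-ins, and allowing a spin when no owner is recorded is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel structures; not the real definitions. */
struct task {
    bool on_cpu;            /* true while the task is running on a CPU */
};

struct sketch_lock {
    struct task *owner;     /* NULL when no owner is recorded */
};

/*
 * Pre-check run before a waiter queues itself as a spinner.
 * Spinning is only worthwhile if the lock owner is currently running,
 * since a running owner is likely to release the lock soon.
 * need_resched models the waiter being asked to give up the CPU.
 */
static bool sketch_can_spin_on_owner(const struct sketch_lock *lock,
                                     bool need_resched)
{
    assert(lock != NULL);

    /* The waiter itself should yield, so do not start spinning. */
    if (need_resched)
        return false;

    /* The owner is known and is not running: spinning would waste CPU. */
    if (lock->owner != NULL && !lock->owner->on_cpu)
        return false;

    /* Owner is running, or unknown (assumption: allow spinning then). */
    return true;
}
```

A wrong answer from a check like this only changes whether the waiter spins or sleeps, which is why it is expected to affect performance rather than correctness. How the unknown-owner case is treated is a policy choice, and a permissive answer there lets waiters hold on to the CPU for longer.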
