Date:	Tue, 6 Jan 2009 08:40:40 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Ingo Molnar <mingo@...e.hu>
cc:	Peter Zijlstra <peterz@...radead.org>,
	Matthew Wilcox <matthew@....cx>,
	Andi Kleen <andi@...stfloor.org>,
	Chris Mason <chris.mason@...cle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	linux-btrfs <linux-btrfs@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Steven Rostedt <rostedt@...dmis.org>,
	Gregory Haskins <ghaskins@...ell.com>,
	Nick Piggin <npiggin@...e.de>
Subject: Re: [PATCH][RFC]: mutex: adaptive spin



On Tue, 6 Jan 2009, Linus Torvalds wrote:
> 
> So it should be renamed. Something like "task_is_oncpu()" or whatever.

Another complaint, which is tangentially related in that it actually 
concerns "current".

Right now, if some process deadlocks on a mutex, we get a hung process, but 
with a nice backtrace and hopefully other things (that don't need that 
lock) still continue to work.

But if I read it correctly, the adaptive spin code will instead just hang. 
Exactly because "task_is_current()" will also trigger for that case, and 
now you get an infinite loop, with the process spinning until it loses 
its own CPU, which obviously will never happen.

Yes, this is the behavior we get with spinlocks too, and yes, lock 
debugging will talk about it, but it's a regression. We've historically 
had a _lot_ more bad deadlocks on mutexes than we have had on spinlocks, 
exactly because mutexes can be held over much more complex code. So 
regressing on it and making it less debuggable is bad.

IOW, if we do this, then I think we need a

	BUG_ON(task == owner);

in the waiting slow-path. I realize the test already exists for the DEBUG 
case, but I think we just want it even for production kernels. Especially 
since we'd only ever need it in the slow-path.
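
A rough user-space sketch of that ordering, to make the failure mode 
concrete. Everything here is a hypothetical stand-in (struct task, struct 
toy_mutex, slowpath_decide(), the enum), not the code from the patch; 
task_is_oncpu() just borrows the rename suggested earlier in the thread. 
The point is only that the owner == task test has to come before the "is 
the owner running?" test, because the waiter itself is always running.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the kernel's task_struct or struct mutex. */
struct task {
	int cpu;		/* CPU the task runs on, or -1 if not running */
};

struct toy_mutex {
	struct task *owner;	/* NULL when the lock is free */
};

enum wait_action {
	WAIT_ACQUIRE,		/* lock is free: go take it */
	WAIT_SPIN,		/* owner is on a CPU: spin for a while */
	WAIT_SLEEP,		/* owner is not running: block */
	WAIT_SELF_DEADLOCK,	/* waiter already owns the lock */
};

static bool task_is_oncpu(const struct task *t)
{
	return t->cpu >= 0;
}

/* Decide what a waiter should do in the contended slow path. */
static enum wait_action slowpath_decide(const struct toy_mutex *lock,
					const struct task *task)
{
	const struct task *owner = lock->owner;

	if (owner == NULL)
		return WAIT_ACQUIRE;

	/*
	 * Stand-in for BUG_ON(task == owner). Without this, the test
	 * below is always true for a self-deadlock (the waiter is on a
	 * CPU by definition), so the spin would never terminate.
	 */
	if (owner == task)
		return WAIT_SELF_DEADLOCK;

	return task_is_oncpu(owner) ? WAIT_SPIN : WAIT_SLEEP;
}
```

In a real kernel the WAIT_SELF_DEADLOCK case would be the BUG_ON() itself 
rather than a return value; the enum is only there so the sketch can be 
exercised outside the kernel.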

			Linus