linux-kernel - Re: [PATCH][RFC]: mutex: adaptive spin

Message-ID: <alpine.LFD.2.00.0901060833060.3057@localhost.localdomain>
Date:	Tue, 6 Jan 2009 08:40:40 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Ingo Molnar <mingo@...e.hu>
cc:	Peter Zijlstra <peterz@...radead.org>,
	Matthew Wilcox <matthew@....cx>,
	Andi Kleen <andi@...stfloor.org>,
	Chris Mason <chris.mason@...cle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	linux-btrfs <linux-btrfs@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Steven Rostedt <rostedt@...dmis.org>,
	Gregory Haskins <ghaskins@...ell.com>,
	Nick Piggin <npiggin@...e.de>
Subject: Re: [PATCH][RFC]: mutex: adaptive spin

On Tue, 6 Jan 2009, Linus Torvalds wrote:
> 
> So it should be renamed. Something like "task_is_oncpu()" or whatever.

Another complaint, which is tangentially related in that it actually 
concerns "current".

Right now, if some process deadlocks on a mutex, we get a hung process, but 
with a nice backtrace and hopefully other things (that don't need that 
lock) still continue to work.

But if I read it correctly, the adaptive spin code will instead just hang. 
Exactly because "task_is_current()" will also trigger for that case, and 
now you get an infinite loop, with the process spinning until it loses 
its own CPU, which obviously will never happen.

Yes, this is the behavior we get with spinlocks too, and yes, lock 
debugging will talk about it, but it's a regression. We've historically 
had a _lot_ more bad deadlocks on mutexes than we have had on spinlocks, 
exactly because mutexes can be held over much more complex code. So 
regressing on it and making it less debuggable is bad.

IOW, if we do this, then I think we need a

	BUG_ON(task == owner);

in the waiting slow-path. I realize the test already exists for the DEBUG 
case, but I think we just want it even for production kernels. Especially 
since we'd only ever need it in the slow-path.
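
To illustrate the ordering argument, here is a minimal userspace sketch, not 
the kernel code. The names "struct task", "struct toy_mutex", "on_cpu" and 
"slowpath_decide()" are all hypothetical stand-ins invented for this example, 
and the WAIT_BUG result only marks the spot where BUG_ON(task == owner) would 
fire. The point it tries to show is that the self-ownership test has to come 
before the "is the owner running?" test, because a waiter that owns the lock 
is by definition running and would otherwise be told to spin forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct task {
	int  id;
	bool on_cpu;	/* roughly what a task_is_current()-style test reports */
};

/* Hypothetical stand-in for a mutex; owner is NULL when unlocked. */
struct toy_mutex {
	struct task *owner;
};

enum wait_action {
	WAIT_ACQUIRE,	/* lock is free, just take it */
	WAIT_SPIN,	/* owner is running on a CPU, so spin */
	WAIT_SLEEP,	/* owner is not running, so block */
	WAIT_BUG	/* self-deadlock: where BUG_ON(task == owner) would fire */
};

/* Decide what a waiter should do in the contended slow-path. */
enum wait_action slowpath_decide(const struct toy_mutex *lock,
				 const struct task *task)
{
	assert(lock != NULL && task != NULL);

	if (lock->owner == NULL)
		return WAIT_ACQUIRE;

	/*
	 * Must be checked before on_cpu: a waiter that already owns the
	 * lock is running right now, so the spin test below would be
	 * true forever.
	 */
	if (lock->owner == task)
		return WAIT_BUG;

	if (lock->owner->on_cpu)
		return WAIT_SPIN;

	return WAIT_SLEEP;
}
```

With the checks in this order, a task that tries to take a lock it already 
holds is reported immediately instead of spinning, and the extra comparison 
is only paid on the contended path.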

			Linus