linux-kernel - Re: [rfc] "fair" rw spinlocks

Message-ID: <m1y6lg7q3n.fsf@fess.ebiederm.org>
Date:	Sat, 05 Dec 2009 19:12:28 -0800
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...e.hu>,
	Christoph Hellwig <hch@...radead.org>,
	Nick Piggin <npiggin@...e.de>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Oleg Nesterov <oleg@...hat.com>
Subject: Re: [rfc] "fair" rw spinlocks

Linus Torvalds <torvalds@...ux-foundation.org> writes:

> On Mon, 30 Nov 2009, Thomas Gleixner wrote:
>> 
>> I'm aware of that. The number of places where we read_lock
>> tasklist_lock is 79 in 36 files right now. That's not a horrible task
>> to go through them one by one and do a case by case conversion with a
>> proper changelog. That would only leave the write_lock sites. 
>
> The write_lock sites should be fine, since just changing them to a 
> spinlock should be 100% semantically equivalent - except for the lack of 
> interrupt disable. And the lack of interrupt disable will result in a nice 
> big deadlock if some interrupt really does take the spinlock, which is 
> much easier to debug than a subtle race that would get the wrong read 
> value.
>
>> We can then either do the rw_lock to spin_lock conversion or keep the
>> rw_lock which has no readers anymore and behaves like a spinlock for a
>> transition time so reverts of one of the read_lock -> rcu patches
>> could be done to debug stuff.
>
> So as per the above, I wouldn't worry about the write lockers. Might as 
> well change it to a spinlock, since that's what it will act as. It's not 
> as if there is any chance that the spinlock code is subtly buggy.
>
> So the only reason to keep it as a rwlock would be if you decide to do the 
> read-locked cases one by one, and don't end up with all of them converted. 
> Which is a reasonable strategy too, of course. We don't _have_ to convert 
> them all - if the main problem is some starvation issue, it's sufficient 
> to convert just the main read-lock cases so that writers never get 
> starved.
>
> But converting it all would be nice, because that whole
>
> 	write_lock_irq(&tasklist_lock);
>
> to
>
> 	spin_lock(&tasklist_lock);
>
> conversion would likely be a measurable performance win. Both because 
> spinlocks are fundamentally faster (no atomic on unlock), and because you 
> get rid of the irq disable/enable. But in order to get there, you'd have 
> to convert _all_ the read-lockers, so you'd miss the opportunity to only 
> convert the easy cases.


Atomically sending a signal to every member of a process group is the
big fly in the ointment that I am aware of.  Last time I looked I could
not see how to convert it to RCU.

Fundamentally: "kill -KILL -pgrp" should be usable to kill all of
the processes in a process group, and "kill -KILL -1" should be usable
to kill everything except the sender and init.  The latter is something
I have seen in shutdown scripts on more than one occasion.
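
For concreteness, here is a userspace sketch of the group-wide semantics
described above.  It is an illustration only, not kernel code:
kill_group_demo() is a hypothetical helper and the 16-child limit is
arbitrary.  It forks children into a fresh process group and relies on a
single kill(-pgid, SIGKILL) reaching every member.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo helper: fork nchildren processes into one new
 * process group, send SIGKILL to the whole group with one kill() call,
 * and return how many children were terminated by SIGKILL.
 * Returns -1 if the setup could not be completed.
 */
int kill_group_demo(int nchildren)
{
	pid_t pids[16];
	pid_t pgid = 0;
	int created = 0, killed = 0, failed = 0;
	int i, status;

	if (nchildren < 1 || nchildren > 16)
		return -1;

	for (i = 0; i < nchildren; i++) {
		pid_t pid = fork();

		if (pid < 0) {
			failed = 1;
			break;
		}
		if (pid == 0) {
			for (;;)
				pause();	/* child: sleep until signalled */
		}
		if (created == 0)
			pgid = pid;	/* first child becomes group leader */
		/* The parent moves each child into the group itself. */
		if (setpgid(pid, pgid) < 0) {
			kill(pid, SIGKILL);	/* not in the group: kill directly */
			failed = 1;
		}
		pids[created++] = pid;
	}

	/* One call, negative pid: signal every member of the group. */
	if (created > 0 && kill(-pgid, SIGKILL) < 0)
		failed = 1;

	/* Reap the children and count those that died from SIGKILL. */
	for (i = 0; i < created; i++) {
		if (waitpid(pids[i], &status, 0) == pids[i] &&
		    WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL)
			killed++;
	}

	return failed ? -1 : killed;
}
```

The sketch only covers the easy case, where every child is already in
the group before the signal is sent.  The case that matters for this
thread is a member forking while the signal is being delivered.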

This is subtle in the sense that it won't show up in simple tests if
you get it wrong.

This is a pain because we occasionally signal a process group from
interrupt context.

The trouble, as I recall, is how to ensure that new processes see the signal.

Eric