linux-kernel - Re: [PATCH] bcache: Remove use of down/up_read_non

Message-ID: <20130821021900.GB4051@kmo-pixel>
Date:	Tue, 20 Aug 2013 19:19:00 -0700
From:	Kent Overstreet <kmo@...erainc.com>
To:	Steven Rostedt <rostedt@...dmis.org>
Cc:	LKML <linux-kernel@...r.kernel.org>, linux-bcache@...r.kernel.org,
	dm-devel@...hat.com, Christoph Hellwig <hch@...radead.org>,
	David Howells <dhowells@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	akpm@...ux-foundation.org
Subject: Re: [PATCH] bcache: Remove use of down/up_read_non_owner()

On Tue, Aug 20, 2013 at 11:16:02AM -0400, Steven Rostedt wrote:
> 
> The down/up_read_non_owner() is a nasty hack in the API of the rwsem
> operations. It was once removed, but then resurrected for use with
> bcache. Not only is the API an abomination to the rwsem API, it also
> prevents bcache from ever being compiled with PREEMPT_RT, as the RT
> kernel requires all rwsems to have an owner.
> 
> Instead of keeping the down/up_read_non_owner() around, it is better to
> modify the one user of it and have it do something a bit differently.
> 
> From reading the bcache code, the reason for the non_owner usage is
> that a request is made, and writers must wait for that request to
> finish before they can continue. But because the request is completed
> by another task, the rwsem can not be held for read and then released
> on completion.
> 
> Instead of abusing the rwsem code for this, I added a refcount
> "nr_requests" to the cached_dev structure as well as a "write_waiters"
> wait queue. When a request is to be made, the rwsem is still taken for
> read, but this time with an owner. The refcount is incremented and
> before exiting the function, the rwsem is released.
> 
> The writer will then take the rwsem for write, check the refcount, if
> it is not zero, it will release the rwsem, add itself to a wait_event()
> waiting for refcount to become zero, and then try again.
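
For illustration only, the scheme described in the quoted text can be sketched in userspace with pthreads. This is not the patch itself: struct dev_sketch and all function names are hypothetical, a mutex-protected int stands in for the atomic "nr_requests" refcount, and a condition variable stands in for the "write_waiters" wait queue.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

/*
 * Hypothetical userspace stand-in for the state the quoted patch adds to
 * cached_dev. Nothing here is kernel code.
 */
struct dev_sketch {
	pthread_rwlock_t rwlock;        /* stands in for the rwsem */
	pthread_mutex_t  count_lock;    /* protects nr_requests */
	pthread_cond_t   write_waiters; /* writers sleep here */
	int              nr_requests;   /* requests still in flight */
};

#define DEV_SKETCH_INIT { PTHREAD_RWLOCK_INITIALIZER, \
			  PTHREAD_MUTEX_INITIALIZER, \
			  PTHREAD_COND_INITIALIZER, 0 }

/* Submitter: take the lock for read, count the request, drop the lock. */
void request_start(struct dev_sketch *d)
{
	pthread_rwlock_rdlock(&d->rwlock);
	pthread_mutex_lock(&d->count_lock);
	d->nr_requests++;
	pthread_mutex_unlock(&d->count_lock);
	/* Released by the same thread that took it, so it has an owner. */
	pthread_rwlock_unlock(&d->rwlock);
}

/* Completion: may run on any thread; it only touches the counter. */
void request_done(struct dev_sketch *d)
{
	pthread_mutex_lock(&d->count_lock);
	assert(d->nr_requests > 0);
	if (--d->nr_requests == 0)
		pthread_cond_broadcast(&d->write_waiters);
	pthread_mutex_unlock(&d->count_lock);
}

/* Returns with the write lock held and no requests in flight. */
void writer_lock(struct dev_sketch *d)
{
	for (;;) {
		int busy;

		pthread_rwlock_wrlock(&d->rwlock);
		pthread_mutex_lock(&d->count_lock);
		busy = d->nr_requests != 0;
		pthread_mutex_unlock(&d->count_lock);
		if (!busy)
			return;

		/* Requests outstanding: back off, sleep until zero, retry. */
		pthread_rwlock_unlock(&d->rwlock);
		pthread_mutex_lock(&d->count_lock);
		while (d->nr_requests != 0)
			pthread_cond_wait(&d->write_waiters, &d->count_lock);
		pthread_mutex_unlock(&d->count_lock);
	}
}

void writer_unlock(struct dev_sketch *d)
{
	pthread_rwlock_unlock(&d->rwlock);
}
```

New requests cannot start while a writer holds the lock, so a zero count observed under the write lock means nothing is in flight. Note that every request touches the shared counter twice.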

I _really_ disagree with this approach.

I get that there's a problem, but the bcache code REALLY IS USING THE
RWSEM AS A LOCK; the answer isn't to open code the lock!

Apologies to Christoph for getting distracted and not responding when
you started to explain what the issues were for RT. I'm not really
convinced they're that insurmountable (one of the issues was debugging,
which the _non_owner() stuff always handled just fine), but I'll take it
on faith that this usage is incompatible with rwsems + the RT
functionality since I haven't actually had time to dig into it.

So assuming that's the case, IMO the sanest thing to do is make a new
type of lock - "rwsem_non_process" or somesuch - and use _that_ in
bcache. Not open coding the lock.

It can even live in the bcache code if we want since there currently
wouldn't be any other users, I don't really care. But open coding it?
Come on... it makes me wonder what other code in the kernel is open coding
locks because it couldn't release them in the same process context that
took them, for whatever reason.
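
As a rough userspace illustration of what a separate lock type with no owner could look like, here is a sketch using pthreads. Everything in it is an assumption: no "rwsem_non_process" type exists, the np_* function names are made up, and a real kernel version would look quite different. The point is only that the locking logic sits behind one interface and any thread may call the release functions.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical lock type: tracks holders by count, never by owner. */
struct rwsem_non_process {
	pthread_mutex_t lock;    /* protects the two fields below */
	pthread_cond_t  wait;    /* readers and writers sleep here */
	int             readers; /* number of read holds */
	int             writer;  /* 1 while held for write */
};

#define RWSEM_NON_PROCESS_INIT { PTHREAD_MUTEX_INITIALIZER, \
				 PTHREAD_COND_INITIALIZER, 0, 0 }

void np_down_read(struct rwsem_non_process *s)
{
	pthread_mutex_lock(&s->lock);
	while (s->writer)
		pthread_cond_wait(&s->wait, &s->lock);
	s->readers++;
	pthread_mutex_unlock(&s->lock);
}

/* May be called from a different thread than the one that took the hold. */
void np_up_read(struct rwsem_non_process *s)
{
	pthread_mutex_lock(&s->lock);
	assert(s->readers > 0);
	if (--s->readers == 0)
		pthread_cond_broadcast(&s->wait);
	pthread_mutex_unlock(&s->lock);
}

/* No writer preference here, so a steady stream of readers can starve writers. */
void np_down_write(struct rwsem_non_process *s)
{
	pthread_mutex_lock(&s->lock);
	while (s->writer || s->readers)
		pthread_cond_wait(&s->wait, &s->lock);
	s->writer = 1;
	pthread_mutex_unlock(&s->lock);
}

/* Returns 1 if the write hold was taken, 0 if the lock was busy. */
int np_down_write_trylock(struct rwsem_non_process *s)
{
	int ok;

	pthread_mutex_lock(&s->lock);
	ok = !s->writer && !s->readers;
	if (ok)
		s->writer = 1;
	pthread_mutex_unlock(&s->lock);
	return ok;
}

void np_up_write(struct rwsem_non_process *s)
{
	pthread_mutex_lock(&s->lock);
	assert(s->writer);
	s->writer = 0;
	pthread_cond_broadcast(&s->wait);
	pthread_mutex_unlock(&s->lock);
}
```

A submitter would call np_down_read() before issuing a request and the completion path would call np_up_read(), whichever thread it runs on.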

Also, nack this patch because it increases the number of atomic ops on
shared cachelines in our fast path. If it does end up being open coded,
I'll make a more efficient version.