linux-kernel - Re: Inode Lock Scalability V7 (was V6)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20101022044657.GA6899@amd>
Date:	Fri, 22 Oct 2010 15:46:57 +1100
From:	Nick Piggin <npiggin@...nel.dk>
To:	Al Viro <viro@...IV.linux.org.uk>
Cc:	Nick Piggin <npiggin@...nel.dk>,
	Dave Chinner <david@...morbit.com>,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: Inode Lock Scalability V7 (was V6)

On Fri, Oct 22, 2010 at 04:07:28AM +0100, Al Viro wrote:
> On Fri, Oct 22, 2010 at 01:34:44PM +1100, Nick Piggin wrote:
> 
> > > 	* walkers of the sb, wb and hash lists can grab ->i_lock at will;
> > > it nests inside their locks.
> > 
> > What about if it is going on or off multiple data structures while
> > the inode is live, like inode_lock can protect today. Such as putting
> > it on the hash and sb list.
> 
> Look at the code.  You are overengineering it.  We do *not* need a framework
> for messing with these lists in arbitrary ways.  Where would we need to
> do that to an inode we don't hold a reference to or had placed I_FREEING

Look, my point is that I believe it is an easier step to get from
inode_lock to i_lock, and then from there we can go wild.

What is your criteria for a particular lock ordering being "natural"
versus not? In almost all cases we have

[stuff with data structure] -> [stuff with inode]
and
[stuff with inode] -> [stuff with data structure]

So neither is inherently more natural, I think. So it comes down to
how the code fits together and works.

The difficulty with inode_lock breaking is not the data structures.
We know how to lock and modify them. The hardest part is verifying
that a particular inode has no new, unhandled concurrency introduced
to it (eg. the particular concurrency you pointed out in Dave's patch
just now). Agree?

So in that case, I think it is much more natural to be able to take
an inode and with i_lock, cover it from all icache state concurrency.
I object to it being called over engineering -- it's actually just a
big hammer on the inode, compared with fiddling with more complex
rules.

> on and would need i_lock held by caller?  Even assuming that we need to
> keep [present in hash, present on sb list] in sync (which I seriously doubt),
> we can bloody well grab both locks before i_lock.

I'm not saying there is. Most of the problems would be between a
particular inode state versus its membership on one of the lists.
However, with my patches, I *don't care* if there is an issue there
or not. It simply doesn't matter because it has the same protection
as inode_lock at that point.

If you want to micro optimise it, change lock orders around, and
open more concurrency, that is easily possible to do after my patches
lift inode_lock. If you do all the changes *before* inode_lock removal,
then it's not bisectable at all.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/