linux-kernel - Re: [patch 1/6] fs: icache RCU free inodes

Message-ID: <AANLkTimJCgsPqB9ihbScr1RdZ+XGk2tq7LZNfh109Skv@mail.gmail.com>
Date:	Tue, 9 Nov 2010 09:08:17 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Nick Piggin <npiggin@...nel.dk>, Al Viro <viro@...iv.linux.org.uk>,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [patch 1/6] fs: icache RCU free inodes

On Tue, Nov 9, 2010 at 8:21 AM, Eric Dumazet <eric.dumazet@...il.com> wrote:
>
> You can see problems using this fancy thing :
>
> - Need to use slab ctor() to not overwrite some sensitive fields of
> reused inodes.
>  (spinlock, next pointer)

Yes, the downside of using SLAB_DESTROY_BY_RCU is that you really
cannot initialize some fields in the allocation path, because they may
end up being still used while allocating a new (well, re-used) entry.

However, I think that in the long run we pretty much _have_ to do that
anyway, because the "free each inode separately with RCU" is a real
overhead (Nick reports 10-20% cost). So it just makes my skin crawl to
go that way. And I think SLAB_DESTROY_BY_RCU is the "normal" way to do
these kinds of things anyway, so I actually think it's "simpler", if
only because it's the common pattern.

(Put another way: it might not be less code, and it might have its own
subtle issues, but they are _shared_ subtle issues with all the other
SLAB_DESTROY_BY_RCU users, so we hopefully have a better understanding
of them)
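
To make the ctor constraint concrete, here is a minimal userspace sketch,
not kernel code. The names (demo_inode, demo_cache, demo_inode_ctor) are
hypothetical, and the free list stands in for a type-stable slab. The
lock and the chain pointer are set up once, when the memory first
becomes an object; the allocation path only writes fields that a stale
lockless reader can tolerate seeing change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an inode; NOT the kernel's struct inode. */
struct demo_inode {
    int lock;                     /* stand-in for a spinlock, 0 = unlocked */
    struct demo_inode *next;      /* hash-chain link a stale reader may follow */
    unsigned long ino;            /* identity, rewritten on every allocation */
    struct demo_inode *free_link; /* free-list link kept apart from 'next' */
    int ctor_calls;               /* instrumentation for this sketch only */
};

/* Hypothetical type-stable cache: freed objects stay demo_inodes. */
struct demo_cache {
    struct demo_inode *free_list;
};

/* Runs once per piece of memory, never on reuse. */
static void demo_inode_ctor(struct demo_inode *inode)
{
    inode->lock = 0;
    inode->next = NULL;
    inode->ctor_calls++;
}

static struct demo_inode *demo_inode_alloc(struct demo_cache *cache,
                                           unsigned long ino)
{
    struct demo_inode *inode = cache->free_list;

    if (inode) {
        /* Reuse: a reader may still be looking at this object. */
        cache->free_list = inode->free_link;
    } else {
        inode = calloc(1, sizeof(*inode));
        if (!inode)
            return NULL;
        demo_inode_ctor(inode);
    }
    /* Allocation path: deliberately leaves 'lock' and 'next' alone. */
    inode->ino = ino;
    return inode;
}

static void demo_inode_free(struct demo_cache *cache, struct demo_inode *inode)
{
    inode->free_link = cache->free_list;
    cache->free_list = inode;
}
```

Freeing an object and allocating again hands back the same memory with
its lock state untouched, which is exactly why a reader that raced with
the free can still take that lock safely.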

> - Fancy algo to detect an inode moved from one chain to another. Lookups
> should be able to detect and restart their loop.

So this is where I think we should just use locks unless we have hard
numbers to say that being clever is worth it.
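
For reference, one way the "moved to another chain" detection can be
done is to terminate each chain with a marker that encodes the bucket
number, in the spirit of the kernel's "nulls" lists. The sketch below is
a single-threaded userspace illustration with hypothetical names; a real
lockless walker would also need RCU-protected loads, which are omitted
here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical chain node. 'next' holds either a node pointer (low bit
 * clear, since nodes are at least 2-byte aligned) or an odd end marker. */
struct demo_node {
    uintptr_t next;
    unsigned long key;
};

/* The end-of-chain marker encodes which bucket the chain belongs to. */
static uintptr_t demo_make_nulls(unsigned long bucket)
{
    return ((uintptr_t)bucket << 1) | 1u;
}

static int demo_is_nulls(uintptr_t link)
{
    return (link & 1u) != 0;
}

static unsigned long demo_nulls_bucket(uintptr_t link)
{
    return (unsigned long)(link >> 1);
}

enum demo_walk_result { DEMO_FOUND, DEMO_NOT_FOUND, DEMO_RESTART };

/* Walk one chain. If the walk ends on another bucket's marker, some node
 * we followed was moved under us, so the caller must restart. */
static enum demo_walk_result demo_walk_once(uintptr_t first,
                                            unsigned long bucket,
                                            unsigned long key,
                                            struct demo_node **out)
{
    uintptr_t link = first;

    while (!demo_is_nulls(link)) {
        struct demo_node *n = (struct demo_node *)link;

        if (n->key == key) {
            *out = n;
            return DEMO_FOUND;
        }
        link = n->next;
    }
    return demo_nulls_bucket(link) == bucket ? DEMO_NOT_FOUND : DEMO_RESTART;
}
```

A caller would loop while the result is DEMO_RESTART. The extra marker
decoding and restart logic is the kind of cleverness a plain per-bucket
lock avoids.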

I do realize that some loads look up inodes directly, but at the same
time I really think that we should absolutely target the whole "RCU
path lookup" first. And that one has no inode lookup at all, it's just
a dentry->d_inode pointer dereference.
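
As a rough illustration of why the path-walk case needs no inode
lookup, the sketch below reads a d_inode-style pointer and then
validates the dentry instead of the inode. The sequence counter is an
assumption made for this example; the mail does not say which mechanism
verifies the dentry. All names are hypothetical and this is userspace
C11, not kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct demo_inode {
    unsigned long ino;
};

/* Hypothetical dentry: a sequence counter guards the inode pointer. */
struct demo_dentry {
    atomic_uint seq;                     /* even = stable, odd = being updated */
    struct demo_inode *_Atomic d_inode;
};

/* Reader: wait for an even (stable) sequence value and remember it. */
static unsigned demo_read_begin(struct demo_dentry *d)
{
    unsigned s;

    while ((s = atomic_load(&d->seq)) & 1u)
        ; /* an update is in progress */
    return s;
}

/* Reader: nonzero if the dentry changed since demo_read_begin(). */
static int demo_read_retry(struct demo_dentry *d, unsigned start)
{
    return atomic_load(&d->seq) != start;
}

/* Writer: bump to odd, update, bump back to even. */
static void demo_set_inode(struct demo_dentry *d, struct demo_inode *inode)
{
    atomic_fetch_add(&d->seq, 1);
    atomic_store(&d->d_inode, inode);
    atomic_fetch_add(&d->seq, 1);
}

/* The whole "lookup": one pointer load plus a dentry stability check. */
static struct demo_inode *demo_dentry_inode(struct demo_dentry *d)
{
    for (;;) {
        unsigned s = demo_read_begin(d);
        struct demo_inode *inode = atomic_load(&d->d_inode);

        if (!demo_read_retry(d, s))
            return inode;
    }
}
```

No hash chain is walked and no inode lock is taken; the only thing the
reader has to trust is that the dentry did not change underneath it.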

So let's not mix in NFSD loads into the discussion yet - it's a
separate thing, and if we want to make that whole code use RCU later,
that's fine. But let's really keep it "later", because it's not
_nearly_ as important as the path walking.

> - After a match, need to get a stable reference on inode (lock), then
> recheck the keys to make sure the target inode is the right one.

Again, this is only an issue for non-dentry lookup. For the dentry
case, we know that if the dentry still exists, then the inode still
exists. So we don't need to check a stable inode pointer if we just
verify the stability of the dentry - and we'll have to do that anyway
obviously.
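
For the non-dentry case, the "match, lock, recheck" sequence from the
quoted bullet can be sketched as follows. This is a userspace
illustration with hypothetical names, using a C11 atomic_flag as a
stand-in spinlock; a real lockless reader would also need RCU-protected
loads of the key and chain pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical hashed object that may be reused for another key. */
struct demo_obj {
    atomic_flag lock;      /* stand-in for a per-object spinlock */
    unsigned long ino;     /* lookup key */
    struct demo_obj *next; /* hash-chain link */
};

static void demo_obj_init(struct demo_obj *o, unsigned long ino,
                          struct demo_obj *next)
{
    atomic_flag_clear(&o->lock);
    o->ino = ino;
    o->next = next;
}

static void demo_lock(struct demo_obj *o)
{
    while (atomic_flag_test_and_set(&o->lock))
        ; /* spin until the flag was previously clear */
}

static void demo_unlock(struct demo_obj *o)
{
    atomic_flag_clear(&o->lock);
}

/* Step 1: optimistic match without holding any lock. */
static struct demo_obj *demo_find_unlocked(struct demo_obj *head,
                                           unsigned long ino)
{
    for (struct demo_obj *o = head; o; o = o->next)
        if (o->ino == ino)
            return o;
    return NULL;
}

/* Steps 2 and 3: pin the object, then recheck the key. Returns 1 with
 * the lock held if the object still matches, otherwise 0 with the lock
 * released so the caller can restart the lookup. */
static int demo_lock_and_recheck(struct demo_obj *o, unsigned long ino)
{
    demo_lock(o);
    if (o->ino == ino)
        return 1;
    demo_unlock(o);
    return 0;
}
```

The recheck is what catches an object that was freed and reused for a
different key between the unlocked match and the lock acquisition.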

So I really think that the dentry lookup is the thing that should
primarily drive this. And that will not in any way preclude us from
looking at the non-dentry case _later_, and worrying about the details
there at some later date.

In other words: let's bite off the complexity in small chunks. Let's
keep the inode lock approach for now for the actual inode lists and
hash lookups. I think they are almost entirely independent issues from
the dentry path.

                     Linus