Message-ID: <20130930194921.GS13318@ZenIV.linux.org.uk>
Date: Mon, 30 Sep 2013 20:49:22 +0100
From: Al Viro <viro@...IV.linux.org.uk>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Miklos Szeredi <miklos@...redi.hu>
Subject: Re: [rfc][possible solution] RCU vfsmounts
On Sun, Sep 29, 2013 at 07:10:47PM +0100, Al Viro wrote:
> FWIW, right now I'm reviewing the subset of fs code that can be hit in
> RCU mode.
OK... AFAICS, we are not too far from being able to handle RCU pathwalk
straying into fs in the middle of being shut down.
* There are 5 methods that can be called:
	->d_hash(...)
	->d_compare(...)
	->d_revalidate(..., LOOKUP_RCU | ...)
	->d_manage(..., true)
	->permission(..., MAY_NOT_BLOCK | MAY_EXEC)
Filesystem needs to be able to survive those during shutdown. The stuff
needed for that should _not_ be freed without synchronize_rcu() (or via
call_rcu()); usually ->s_fs_info is involved (when anything is needed
at all). In any case, we shouldn't allow rmmod without making sure that
everything in RCU mode has run out, but most of the filesystems have
rcu_barrier() in their module exit function anyway.
* __put_super() probably ought to delay actual freeing via
call_rcu(); might not be strictly necessary, but probably a good idea
anyway.
* shrink_dcache_for_umount() ought to use d_walk(), a la
shrink_dcache_parent().
Note that most of the filesystems don't have any of these methods or
don't look at anything outside of inode/dentry involved in RCU case.
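The synchronize_rcu() form of the rule above can be sketched as a small
userspace model. Everything named demo_* is hypothetical, and the
synchronize_rcu() here is a stub that only records that the wait happened,
not the kernel function. It also assumes no new RCU-mode caller can reach
the data once the wait has started. The point is only the ordering: wait
for RCU-mode callers first, free the ->s_fs_info data afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace stand-ins, NOT the kernel API: they only record ordering. */
static int readers_in_flight;	/* models RCU-mode pathwalks still running */
static bool grace_period_done;

static void synchronize_rcu(void)
{
	/* The real call blocks until every pre-existing reader has finished. */
	readers_in_flight = 0;
	grace_period_done = true;
}

/* Hypothetical fs-private superblock data, e.g. a name length limit. */
struct demo_sb_info {
	unsigned int name_len;
};

struct demo_super {
	struct demo_sb_info *s_fs_info;
};

/* Sketch of a ->put_super(): wait for readers, then free. */
static void demo_put_super(struct demo_super *sb)
{
	struct demo_sb_info *info = sb->s_fs_info;

	synchronize_rcu();		/* ->d_hash()/->d_compare() may still use info */
	assert(readers_in_flight == 0);	/* model: no reader is left at this point */
	sb->s_fs_info = NULL;
	free(info);			/* stands in for kfree() */
}
```

In a real filesystem the free would be kfree() and the wait would block for
a full grace period, so this form only fits paths that are allowed to sleep.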
Zoo:
* adfs: has the name length limit in fs-private part of superblock; used
by ->d_hash() and ->d_compare(). No other methods involved, synchronize_rcu()
before doing kfree() in adfs_put_super() will suffice.
* autofs4: wants fs-private part of superblock in ->d_manage().
synchronize_rcu() in autofs4_kill_sb() would do it, or we could delay
freeing that sucker via call_rcu() (in that case we want delayed
freeing in __put_super() as well).
* btrfs: wants btrfs_root_readonly(BTRFS_I(inode)->root) usable in
->permission(). Delayed freeing of struct btrfs_root, perhaps?
* cifs: wants nls, referred to from fs-private part of superblock.
->permission() wants fs-private part of superblock as well. Just
synchronize_rcu() before unload_nls() in cifs_umount()...
* fat: same situation as with cifs
* fuse: delayed freeing of struct fuse_conn? BTW, Miklos, just what is
	} else if (mask & (MAY_ACCESS | MAY_CHDIR)) {
		if (mask & MAY_NOT_BLOCK)
			return -ECHILD;
about, when we never pass such combinations? Oh, well...
* hpfs: similar to cifs and fat, only without use of nls (a homegrown table
of some sort).
* ncpfs: _probably_ similar to cifs et al., but there might be dragons.
* procfs: delayed freeing of pid_namespace?
* lustre: messy, haven't looked through that.
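The call_rcu() alternative floated above for autofs4 and __put_super() could
look roughly like this, again as a userspace model. The struct rcu_head,
call_rcu() and container_of() below are simplified stand-ins for the kernel
ones, and all demo_* names (including demo_grace_period_elapsed()) are
hypothetical. The freeing side returns at once, and the object stays valid
until the modelled grace period ends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's rcu_head/call_rcu(); NOT the real API. */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *);
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct rcu_head *pending;	/* callbacks waiting for a grace period */
static int objects_freed;

static void call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *))
{
	assert(func != NULL);
	head->func = func;
	head->next = pending;
	pending = head;			/* returns at once; nothing is freed yet */
}

/* Hypothetical helper: models the end of a grace period. */
static void demo_grace_period_elapsed(void)
{
	while (pending) {
		struct rcu_head *head = pending;

		pending = head->next;
		head->func(head);
	}
}

/* Hypothetical fs-private data with an embedded rcu_head. */
struct demo_sb_info {
	int flags;
	struct rcu_head rcu;
};

static void demo_free_sb_info(struct rcu_head *head)
{
	struct demo_sb_info *info = container_of(head, struct demo_sb_info, rcu);

	free(info);			/* stands in for kfree() */
	objects_freed++;
}

/* Sketch of a ->kill_sb() that does not block: queue the free and return. */
static void demo_kill_sb(struct demo_sb_info *info)
{
	call_rcu(&info->rcu, demo_free_sb_info);
}
```

Compared with synchronize_rcu(), this costs an rcu_head in the structure but
never makes the unmounting task wait.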
Overall, it looks doable.