Message-ID: <20190430040043.GH23075@ZenIV.linux.org.uk>
Date: Tue, 30 Apr 2019 05:00:43 +0100
From: Al Viro <viro@...iv.linux.org.uk>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>
Subject: Re: [RFC][PATCHSET] sorting out RCU-delayed stuff in
->destroy_inode()
On Mon, Apr 29, 2019 at 08:37:29PM -0700, Linus Torvalds wrote:
> On Mon, Apr 29, 2019, 20:09 Al Viro <viro@...iv.linux.org.uk> wrote:
>
> >
> > ... except that this callback can (and always could) get executed after
> > freeing struct super_block.
> >
>
> Ugh.
>
> That food looks nasty. Shouldn't the super block freeing wait for the
> filesystem to be all done instead? Do a rcu synchronization or something?
>
> Adding that pointer looks really wrong to me. I'd much rather delay the sb
> freeing. Is there some reason that can't be done that I'm missing?
Where would you put that synchronize_rcu()? Doing that before ->put_super()
is too early - inode references might be dropped in there. OTOH, doing
that after that point means that while struct super_block itself will be
there, any number of data structures hanging from it might not be.
So we are still very limited in what we can do inside a ->free_inode()
instance *and* we get a bunch of synchronize_rcu() calls for no good reason.
Note that for normal lockless accesses (lockless ->d_revalidate(), ->d_hash(),
etc.) we are just fine with having struct super_block freeing RCU-delayed
(along with any data structures we might need) - the superblock had
been seen at some point after we'd taken rcu_read_lock(), so its
freeing won't happen until we drop it. So we don't need synchronize_rcu()
for that.
Here the problem is that we are dealing with another RCU callback;
synchronize_rcu() would be needed for it, but it will only protect that
intermediate dereference of ->i_sb; any rcu-delayed stuff scheduled
from inside ->put_super() would not be ordered wrt ->free_inode().
And if we are doing that just for the sake of that one dereference,
we might as well do it before scheduling i_callback().
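To illustrate that last point, here is a toy userspace model, not kernel
code. It assumes one reading of "do it before scheduling i_callback()":
fetch the method pointer through ->i_sb while the superblock is still valid
and stash it in the inode. Every model_* name is made up for illustration,
and the singly linked list is only a stand-in for call_rcu() and the grace
period.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only: none of this is the kernel's real code. */

struct inode;

struct super_operations {
	void (*free_inode)(struct inode *);
};

struct super_block {
	const struct super_operations *s_op;
};

struct inode {
	struct super_block *i_sb;
	void (*free_inode)(struct inode *);	/* copied before deferral */
	struct inode *next_deferred;		/* stand-in for an rcu_head */
};

static struct inode *deferred_head;	/* stand-in for the RCU callback list */
static int inodes_freed;

static void model_free_inode(struct inode *inode)
{
	inodes_freed++;
	free(inode);
}

static const struct super_operations model_ops = {
	.free_inode = model_free_inode,
};

/* Models destroy_inode(): dereference ->i_sb now, while sb is still valid. */
static void model_destroy_inode(struct inode *inode)
{
	inode->free_inode = inode->i_sb->s_op->free_inode;
	inode->next_deferred = deferred_head;
	deferred_head = inode;
}

/* Models i_callback() running after a grace period: never touches ->i_sb. */
static void model_run_deferred(void)
{
	while (deferred_head) {
		struct inode *inode = deferred_head;

		deferred_head = inode->next_deferred;
		inode->free_inode(inode);
	}
}

/* Frees the superblock before the deferred callback runs. */
static int model_demo(void)
{
	struct super_block *sb = malloc(sizeof(*sb));
	struct inode *inode = malloc(sizeof(*inode));

	if (!sb || !inode) {
		free(sb);
		free(inode);
		return -1;
	}
	inodes_freed = 0;
	sb->s_op = &model_ops;
	inode->i_sb = sb;

	model_destroy_inode(inode);	/* callback queued */
	free(sb);			/* superblock goes away first */
	model_run_deferred();		/* callback never looks at sb */
	return inodes_freed;
}
```

In this model the callback stays safe no matter when the superblock is
freed, because the only dereference of ->i_sb happens before the inode is
queued. It says nothing about what the ->free_inode() instance itself may
touch; that limitation is the same as described above.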
PS: we *are* guaranteed that the module will still be there
(unregister_filesystem() does synchronize_rcu(), and rcu_barrier() is done
before kmem_cache_destroy() in the assorted exit_foo_fs()).