[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <83baef7ddeaf9d60885933683eeaff8511eff10e.camel@themaw.net>
Date: Mon, 14 Jun 2021 15:16:51 +0800
From: Ian Kent <raven@...maw.net>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Tejun Heo <tj@...nel.org>, Eric Sandeen <sandeen@...deen.net>,
Fox Chen <foxhlchen@...il.com>,
Brice Goglin <brice.goglin@...il.com>,
Rick Lindsley <ricklind@...ux.vnet.ibm.com>,
David Howells <dhowells@...hat.com>,
Miklos Szeredi <miklos@...redi.hu>,
Marcelo Tosatti <mtosatti@...hat.com>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Carlos Maiolino <cmaiolino@...hat.com>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v6 5/7] kernfs: use i_lock to protect concurrent inode
updates
On Mon, 2021-06-14 at 14:52 +0800, Ian Kent wrote:
> On Mon, 2021-06-14 at 09:32 +0800, Ian Kent wrote:
> > On Sat, 2021-06-12 at 01:45 +0000, Al Viro wrote:
> > > On Wed, Jun 09, 2021 at 04:51:22PM +0800, Ian Kent wrote:
> > > > The inode operations .permission() and .getattr() use the
> > > > kernfs
> > > > node
> > > > write lock but all that's needed is to keep the rb tree stable
> > > > while
> > > > updating the inode attributes as well as protecting the update
> > > > itself
> > > > against concurrent changes.
> > >
> > > Huh? Where does it access the rbtree at all? Confused...
> > >
> > > > diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
> > > > index 3b01e9e61f14e..6728ecd81eb37 100644
> > > > --- a/fs/kernfs/inode.c
> > > > +++ b/fs/kernfs/inode.c
> > > > @@ -172,6 +172,7 @@ static void kernfs_refresh_inode(struct
> > > > kernfs_node *kn, struct inode *inode)
> > > > {
> > > > struct kernfs_iattrs *attrs = kn->iattr;
> > > >
> > > > + spin_lock(&inode->i_lock);
> > > > inode->i_mode = kn->mode;
> > > > if (attrs)
> > > > /*
> > > > @@ -182,6 +183,7 @@ static void kernfs_refresh_inode(struct
> > > > kernfs_node *kn, struct inode *inode)
> > > >
> > > > if (kernfs_type(kn) == KERNFS_DIR)
> > > > set_nlink(inode, kn->dir.subdirs + 2);
> > > > + spin_unlock(&inode->i_lock);
> > > > }
> > >
> > > Even more so - just what are you serializing here? That code
> > > synchronizes inode
> > > metadata with those in kernfs_node. Suppose you've got two
> > > threads
> > > doing
> > > ->permission(); the first one gets through kernfs_refresh_inode()
> > > and
> > > goes into
> > > generic_permission(). No locks are held, so
> > > kernfs_refresh_inode()
> > > from another
> > > thread can run in parallel with generic_permission().
> > >
> > > If that's not a problem, why two kernfs_refresh_inode() done in
> > > parallel would
> > > be a problem?
> > >
> > > Thread 1:
> > > permission
> > > done refresh, all locks released now
> > > Thread 2:
> > > change metadata in kernfs_node
> > > Thread 2:
> > > permission
> > > goes into refresh, copying metadata into inode
> > > Thread 1:
> > > generic_permission()
> > > No locks in common between the last two operations, so
> > > we generic_permission() might see partially updated metadata.
> > > Either we don't give a fuck (in which case I don't understand
> > > what purpose does that ->i_lock serve) *or* we need the exclusion
> > > to cover a wider area.
> >
> > This didn't occur to me, obviously.
> >
> > It seems to me this can happen with the original code too although
> > using a mutex might reduce the likelihood of it happening.
> >
> > Still ->permission() is meant to be a read-only function so the VFS
> > shouldn't need to care about it.
> >
> > Do you have any suggestions on how to handle this.
> > Perhaps the only way is to ensure the inode is updated only in
> > functions that are expected to do this.
>
> IIRC Greg and Tejun weren't averse to adding a field to the
> struct kernfs_iattrs, but there were concerns about increasing
> memory usage.
>
> Because of this I think the best way to handle this would be to
> broaden the scope of the i_lock to cover the generic calls in
> kernfs_iop_getattr() and kernfs_iop_permission(). The only other
> call to kernfs_refresh_inode() is at inode initialization and
> then only for I_NEW inodes so that should be ok. Also both
> generic_permission() and generic_fillattr() are reading from the
> inode so not likely to need to take the i_lock any time soon (is
> this a reasonable assumption Al?).
>
> Do you think this is a sensible way to go Al?
Unless of course we don't care about taking a lock here at all,
Greg, Tejun?
Ian
Powered by blists - more mailing lists