lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 01 Jun 2022 16:28:49 +0200
From:   Christian Schoenebeck <linux_oss@...debyte.com>
To:     Tyler Hicks <tyhicks@...ux.microsoft.com>
Cc:     Eric Van Hensbergen <ericvh@...il.com>,
        Latchesar Ionkov <lucho@...kov.net>,
        Dominique Martinet <asmadeus@...ewreck.org>,
        Jianyong Wu <jianyong.wu@....com>,
        v9fs-developer@...ts.sourceforge.net, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/5] 9p: Fix refcounting during full path walks for fid lookups

On Dienstag, 31. Mai 2022 16:28:29 CEST Tyler Hicks wrote:
> On 2022-05-30 19:14:43, Christian Schoenebeck wrote:
> > On Freitag, 27. Mai 2022 01:59:59 CEST Tyler Hicks wrote:
> > > Decrement the refcount of the parent dentry's fid after walking
> > > each path component during a full path walk for a lookup. Failure to do
> > > so can lead to fids that are not clunked until the filesystem is
> > > 
> > > unmounted, as indicated by this warning:
> > >  9pnet: found fid 3 not clunked
> > 
> > That explains why I saw so many fids not being clunked with recent Linux
> > kernel versions while doing some 9p protocol debugging with QEMU recently.
> 
> In addition to this refcounting bug, there's another one that I noticed
> while running fstests. My series does not fix it and I haven't had a
> chance to look into it more. The generic/531 test triggers it.
> 
>  https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/tree/tests/generic/5
> 31
> > > The improper refcounting after walking resulted in open(2) returning
> > > -EIO on any directories underneath the mount point when using the virtio
> > > transport. When using the fd transport, there's no apparent issue until
> > > the filesytem is unmounted and the warning above is emitted to the logs.
> > 
> > Actually I never saw that open() = -EIO error. Do you have a reproducer?
> 
> The reproducer that I have is binary only (fairly large and runs a bunch
> of different tests) and is used to regression test the Windows Subsystem
> for Linux 2 (WSL2) host <-> guest filesystem sharing. Now that I think
> about it, I'm not sure if the open() = -EIO error happens with other 9p
> servers.
> 
> I can try to tease out the exact sequence of filesystem operations from
> this test binary but it might take me a bit. It looks like it has to do
> with switching UIDs, which could make sense because different users may
> not be connected to the filesystem yet (the conditional block that does
> p9_client_attach() and v9fs_fid_add()).
> 
> > > In some cases, the user may not yet be attached to the filesystem and a
> > > new root fid, associated with the user, is created and attached to the
> > > root dentry before the full path walk is performed. Increment the new
> > > root fid's refcount to two in that situation so that it can be safely
> > > decremented to one after it is used for the walk operation. The new fid
> > > will still be attached to the root dentry when
> > > v9fs_fid_lookup_with_uid() returns so a final refcount of one is
> > > correct/expected.
> > > 
> > > Fixes: 6636b6dcc3db ("9p: add refcount to p9_fid struct")
> > > Cc: stable@...r.kernel.org
> > > Signed-off-by: Tyler Hicks <tyhicks@...ux.microsoft.com>
> > > ---
> > > 
> > >  fs/9p/fid.c | 17 +++++------------
> > >  1 file changed, 5 insertions(+), 12 deletions(-)
> > > 
> > > diff --git a/fs/9p/fid.c b/fs/9p/fid.c
> > > index 79df61fe0e59..5a469b79c1ee 100644
> > > --- a/fs/9p/fid.c
> > > +++ b/fs/9p/fid.c
> > > @@ -152,7 +152,7 @@ static struct p9_fid
> > > *v9fs_fid_lookup_with_uid(struct
> > > dentry *dentry, const unsigned char **wnames, *uname;
> > > 
> > >  	int i, n, l, clone, access;
> > >  	struct v9fs_session_info *v9ses;
> > > 
> > > -	struct p9_fid *fid, *old_fid = NULL;
> > > +	struct p9_fid *fid, *old_fid;
> > > 
> > >  	v9ses = v9fs_dentry2v9ses(dentry);
> > >  	access = v9ses->flags & V9FS_ACCESS_MASK;
> > > 
> > > @@ -194,13 +194,12 @@ static struct p9_fid
> > > *v9fs_fid_lookup_with_uid(struct
> > > dentry *dentry, if (IS_ERR(fid))
> > > 
> > >  			return fid;
> > > 
> > > +		refcount_inc(&fid->count);
> > > 
> > >  		v9fs_fid_add(dentry->d_sb->s_root, fid);
> > >  	
> > >  	}
> > >  	/* If we are root ourself just return that */
> > > 
> > > -	if (dentry->d_sb->s_root == dentry) {
> > > -		refcount_inc(&fid->count);
> > > +	if (dentry->d_sb->s_root == dentry)
> > > 
> > >  		return fid;
> > > 
> > > -	}
> > 
> > Hmm, wouldn't it then be possible that the root fid is returned with
> > refcount being 2 here?
> 
> Yes and I think that's correct. One refcount taken for adding the root
> fid to the root dentry and another refcount taken for the original
> purpose of the lookup.
> 
> Reverting this portion of the change and re-testing with the reproducer
> triggers a refcount underflow.

Right, I still have some knowledge gaps in the kernel's 9p code base. I was 
actually rather confused about p9_client_clunk() which I just realized 
actually does the refcount decrement and then conditionally sends out the 
'Tclunk' message on refcount zero only.

So yes, it looks fine to me:

Reviewed-by: Christian Schoenebeck <linux_oss@...debyte.com>

> > >  	/*
> > >  	
> > >  	 * Do a multipath walk with attached root.
> > >  	 * When walking parent we need to make sure we
> > > 
> > > @@ -212,6 +211,7 @@ static struct p9_fid
> > > *v9fs_fid_lookup_with_uid(struct
> > > dentry *dentry, fid = ERR_PTR(n);
> > > 
> > >  		goto err_out;
> > >  	
> > >  	}
> > > 
> > > +	old_fid = fid;
> > > 
> > >  	clone = 1;
> > >  	i = 0;
> > >  	while (i < n) {
> > > 
> > > @@ -221,15 +221,8 @@ static struct p9_fid
> > > *v9fs_fid_lookup_with_uid(struct
> > > dentry *dentry, * walk to ensure none of the patch component change
> > > 
> > >  		 */
> > >  		
> > >  		fid = p9_client_walk(fid, l, &wnames[i], clone);
> > > 
> > > +		p9_client_clunk(old_fid);
> > > 
> > >  		if (IS_ERR(fid)) {
> > > 
> > > -			if (old_fid) {
> > > -				/*
> > > -				 * If we fail, clunk fid which are
> > 
> > mapping
> > 
> > > -				 * to path component and not the last
> > 
> > component
> > 
> > > -				 * of the path.
> > > -				 */
> > > -				p9_client_clunk(old_fid);
> > > -			}
> > > 
> > >  			kfree(wnames);
> > >  			goto err_out;
> > >  		
> > >  		}
> > 
> > So this is the actual fix mentioned in the commit log. Makes sense.
> 
> I think the refcount_inc() change for the root fid is an important and
> required part of the fix.
> 
> > Nitpicking: Wouldn't it be a bit cleaner to set old_fid solely within the
> > while loop and just before overwriting fid? And as we now have bumped to
> > -std=C11, probably making old_fid a local variable within loop scope only?
> 
> You're right that it would be cleaner for the purposes of this single
> patch. In a followup patch in this series, I start tracking the root fid
> with a root_fid variable and that requires "old_fid = root_fid" before
> we enter the loop and then "old_fid = fid" inside of the loop.

s/while/for/ would do the trick I guess. Not a big deal though.

Best regards,
Christian Schoenebeck


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ