[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z03fpnNYHjuKox0E@tissot.1015granger.net>
Date: Mon, 2 Dec 2024 11:26:14 -0500
From: Chuck Lever <chuck.lever@...cle.com>
To: lilingfeng@...weicloud.com
Cc: "zhangjian (CG)" <zhangjian496@...wei.com>,
Li Lingfeng <lilingfeng@...weicloud.com>, jlayton@...nel.org,
neilb@...e.de, okorniev@...hat.com, kolga@...app.com,
Dai.Ngo@...cle.com, tom@...pey.com, Trond.Myklebust@...app.com,
linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org,
yukuai1@...weicloud.com, houtao1@...wei.com, yi.zhang@...wei.com,
yangerkun@...wei.com, lilingfeng3@...wei.com
Subject: Re: [PATCH] nfsd: set acl_access/acl_default after getting successful
On Wed, Nov 27, 2024 at 07:37:42PM -0800, Rick Macklem wrote:
> On Wed, Nov 27, 2024 at 7:14 PM Rick Macklem <rick.macklem@...il.com> wrote:
> >
> > On Wed, Nov 27, 2024 at 5:18 PM zhangjian (CG) <zhangjian496@...wei.com> wrote:
> > >
> > > there is one case when disk error cause get_inode_acl(inode,
> > > ACL_TYPE_DEFAULT) return EIO,
> > > resp->acl_access will not be null. posix_acl_release(resp->acl_default)
> > > will trigger this warning.
> > >
> > >
> > > > If getting acl_default fails, acl_access and acl_default will be released
> > > > simultaneously. However, acl_access will still retain a pointer pointing
> > > > to the released posix_acl, which will trigger a WARNING in
> > > > nfs3svc_release_getacl like this:
> > > >
> > > > ------------[ cut here ]------------
> > > > refcount_t: underflow; use-after-free.
> > > > WARNING: CPU: 26 PID: 3199 at lib/refcount.c:28
> > > > refcount_warn_saturate+0xb5/0x170
> > > > Modules linked in:
> > > > CPU: 26 UID: 0 PID: 3199 Comm: nfsd Not tainted
> > > > 6.12.0-rc6-00079-g04ae226af01f-dirty #8
> > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > > > 1.16.1-2.fc37 04/01/2014
> > > > RIP: 0010:refcount_warn_saturate+0xb5/0x170
> > > > Code: cc cc 0f b6 1d b3 20 a5 03 80 fb 01 0f 87 65 48 d8 00 83 e3 01 75
> > > > e4 48 c7 c7 c0 3b 9b 85 c6 05 97 20 a5 03 01 e8 fb 3e 30 ff <0f> 0b eb
> > > > cd 0f b6 1d 8a3
> > > > RSP: 0018:ffffc90008637cd8 EFLAGS: 00010282
> > > > RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83904fde
> > > > RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88871ed36380
> > > > RBP: ffff888158beeb40 R08: 0000000000000001 R09: fffff520010c6f56
> > > > R10: ffffc90008637ab7 R11: 0000000000000001 R12: 0000000000000001
> > > > R13: ffff888140e77400 R14: ffff888140e77408 R15: ffffffff858b42c0
> > > > FS: 0000000000000000(0000) GS:ffff88871ed00000(0000)
> > > > knlGS:0000000000000000
> > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > CR2: 0000562384d32158 CR3: 000000055cc6a000 CR4: 00000000000006f0
> > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > Call Trace:
> > > > <TASK>
> > > > ? refcount_warn_saturate+0xb5/0x170
> > > > ? __warn+0xa5/0x140
> > > > ? refcount_warn_saturate+0xb5/0x170
> > > > ? report_bug+0x1b1/0x1e0
> > > > ? handle_bug+0x53/0xa0
> > > > ? exc_invalid_op+0x17/0x40
> > > > ? asm_exc_invalid_op+0x1a/0x20
> > > > ? tick_nohz_tick_stopped+0x1e/0x40
> > > > ? refcount_warn_saturate+0xb5/0x170
> > > > ? refcount_warn_saturate+0xb5/0x170
> > > > nfs3svc_release_getacl+0xc9/0xe0
> > > > svc_process_common+0x5db/0xb60
> > > > ? __pfx_svc_process_common+0x10/0x10
> > > > ? __rcu_read_unlock+0x69/0xa0
> > > > ? __pfx_nfsd_dispatch+0x10/0x10
> > > > ? svc_xprt_received+0xa1/0x120
> > > > ? xdr_init_decode+0x11d/0x190
> > > > svc_process+0x2a7/0x330
> > > > svc_handle_xprt+0x69d/0x940
> > > > svc_recv+0x180/0x2d0
> > > > nfsd+0x168/0x200
> > > > ? __pfx_nfsd+0x10/0x10
> > > > kthread+0x1a2/0x1e0
> > > > ? kthread+0xf4/0x1e0
> > > > ? __pfx_kthread+0x10/0x10
> > > > ret_from_fork+0x34/0x60
> > > > ? __pfx_kthread+0x10/0x10
> > > > ret_from_fork_asm+0x1a/0x30
> > > > </TASK>
> > > > Kernel panic - not syncing: kernel: panic_on_warn set ...
> > > >
> > > > Clear acl_access/acl_default first and set both of them only when both
> > > > ACLs are successfully obtained.
> > > >
> > > > Fixes: a257cdd0e217 ("[PATCH] NFSD: Add server support for NFSv3 ACLs.")
> > > > Signed-off-by: Li Lingfeng <lilingfeng@...weicloud.com>
> > > > ---
> > > > fs/nfsd/nfs3acl.c | 14 ++++++++------
> > > > 1 file changed, 8 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/fs/nfsd/nfs3acl.c b/fs/nfsd/nfs3acl.c
> > > > index 5e34e98db969..17579a032a5b 100644
> > > > --- a/fs/nfsd/nfs3acl.c
> > > > +++ b/fs/nfsd/nfs3acl.c
> > > > @@ -29,10 +29,12 @@ static __be32 nfsd3_proc_getacl(struct svc_rqst *rqstp)
> > > > {
> > > > struct nfsd3_getaclargs *argp = rqstp->rq_argp;
> > > > struct nfsd3_getaclres *resp = rqstp->rq_resp;
> > > > - struct posix_acl *acl;
> > > > + struct posix_acl *acl = NULL, *dacl = NULL;
> > > > struct inode *inode;
> > > > svc_fh *fh;
> > > >
> > > > + resp->acl_access = NULL;
> > > > + resp->acl_default = NULL;
> > (A) These two lines fix the bug, without other changes needed, I think...
> Oops, I was wrong w.r.t. this. These two lines need to be repeated after the
> posix_acl_relase() calls under "fail:".
> > > > fh = fh_copy(&resp->fh, &argp->fh);
> > > > resp->status = fh_verify(rqstp, &resp->fh, 0, NFSD_MAY_NOP);
> > > > if (resp->status != nfs_ok)
> > > > @@ -56,19 +58,19 @@ static __be32 nfsd3_proc_getacl(struct svc_rqst *rqstp)
> > > > resp->status = nfserrno(PTR_ERR(acl));
> > > > goto fail;
> > > > }
> > > > - resp->acl_access = acl;
> > Because you deleted this line...
> > > > }
> > > > if (resp->mask & (NFS_DFACL|NFS_DFACLCNT)) {
> > > > /* Check how Solaris handles requests for the Default ACL
> > > > of a non-directory! */
> > > > - acl = get_inode_acl(inode, ACL_TYPE_DEFAULT);
> > > > - if (IS_ERR(acl)) {
> > > > - resp->status = nfserrno(PTR_ERR(acl));
> > > > + dacl = get_inode_acl(inode, ACL_TYPE_DEFAULT);
> > > > + if (IS_ERR(dacl)) {
> > > > + resp->status = nfserrno(PTR_ERR(dacl));
> > > > goto fail;
> > The goto fail here will not release the access acl, if I read the code
> > correctly.
> > > > }
> > > > - resp->acl_default = acl;
> > > > }
> > > >
> > > > + resp->acl_access = acl;
> > > > + resp->acl_default = dacl;
> > > > /* resp->acl_{access,default} are released in nfs3svc_release_getacl. */
> > > > out:
> > > > return rpc_success;
> > >
> > Actually, all that is needed to fix the bug is adding the two lines
> > that initialize
> > them both NUL, I think.. I marked that change (A) above.
> Nope, I was wrong w.r.t. this part. You either need to set
> resp->acl_access = acl;
> resp->acl_default = dacl;
> after the posix_acl_relase() calls or switch to using the local
> acl and dacl variables for these posix_acl_release calls and stick
> with what you did above w.r.t. resp->acl_access and resp->acl_default.
>
> Anyhow, I think the case I noted above where get_inode_acl(inode,
> ACL_TYPE_DEFAULT)
> fails will not release acl with your patch.
>
> rick
Howdy -
This one didn't make it into v6.13 because there are some
outstanding (and ambiguous, at least to me) review comments. Can you
address the comments, update the patch, and post it again?
--
Chuck Lever
Powered by blists - more mailing lists