linux-kernel - Re: [PATCH 2/2] nfs: handle failure of get_nfs_open

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <3452e3fea7c250fc0816aa1b55f579849e056283.camel@kernel.org>
Date: Mon, 21 Apr 2025 08:00:18 -0400
From: Jeff Layton <jlayton@...nel.org>
To: Li Lingfeng <lilingfeng3@...wei.com>, trondmy@...nel.org,
 anna@...nel.org, 	bcodding@...hat.com
Cc: linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org, 
	yukuai1@...weicloud.com, houtao1@...wei.com, yi.zhang@...wei.com, 
	yangerkun@...wei.com, lilingfeng@...weicloud.com
Subject: Re: [PATCH 2/2] nfs: handle failure of get_nfs_open_context

On Mon, 2025-04-21 at 09:56 +0800, Li Lingfeng wrote:
> 在 2025/4/19 20:34, Jeff Layton 写道:
> > On Sat, 2025-04-19 at 16:53 +0800, Li Lingfeng wrote:
> > > During initialization of unlockdata or lockdata structures, if acquiring
> > > the nfs_open_context fails, the current operation must be aborted to
> > > ensure the nfs_open_context remains valid after initialization completes.
> > > This is critical because both lock and unlock release callbacks
> > > dereference the nfs_open_context - an invalid context could lead to null
> > > pointer dereference.
> > > 
> > > Fixes: faf5f49c2d9c ("NFSv4: Make NFS clean up byte range locks asynchronously")
> > > Fixes: a5d16a4d090b ("NFSv4: Convert LOCK rpc call into an asynchronous RPC call")
> > > Signed-off-by: Li Lingfeng <lilingfeng3@...wei.com>
> > > ---
> > >   fs/nfs/nfs4proc.c | 22 +++++++++++++++-------
> > >   1 file changed, 15 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > > index 9f5689c43a50..d76cf0f79f9c 100644
> > > --- a/fs/nfs/nfs4proc.c
> > > +++ b/fs/nfs/nfs4proc.c
> > > @@ -7075,24 +7075,27 @@ static struct nfs4_unlockdata *nfs4_alloc_unlockdata(struct file_lock *fl,
> > >   	struct nfs4_state *state = lsp->ls_state;
> > >   	struct inode *inode = state->inode;
> > >   	struct nfs_lock_context *l_ctx;
> > > +	struct nfs_open_context *open_ctx;
> > >   
> > >   	p = kzalloc(sizeof(*p), GFP_KERNEL);
> > >   	if (p == NULL)
> > >   		return NULL;
> > >   	l_ctx = nfs_get_lock_context(ctx);
> > > -	if (!IS_ERR(l_ctx)) {
> > > +	if (!IS_ERR(l_ctx))
> > >   		p->l_ctx = l_ctx;
> > > -	} else {
> > > -		kfree(p);
> > > -		return NULL;
> > > -	}
> > > +	else
> > > +		goto out_free;
> > > +	/* Ensure we don't close file until we're done freeing locks! */
> > > +	open_ctx = get_nfs_open_context(ctx);
> > > 
> > > 
> > Sorry for the confusion. Now that I look more closely, I think I was
> > wrong before.
> > 
> > This can't fail, because the caller holds a reference to ctx, so the
> > refcount must be non-zero. Instead of this patch, could you add a
> > comment in there to that effect to make this clear in the future?
> Hi Jeff,
> 
> Thank you for the feedback.
> Adding a comment instead of this patch may be better.
> 
> However, I’d like to seek your guidance on a broader question: For
> scenarios where an error condition currently cannot occur but would lead
> to severe consequences (e.g., NULL pointer dereference, data corruption)
> if it ever did happen (e.g., due to future code changes or bugs), do you
> recommend proactively adding error handling as a defensive measure?
> 
> My rationale:
> Current code: No code path triggers this condition today --> Handling
> code would be "dead" for now.
> Future risks: If a bug introduced later allows the condition to occur,
> silent failure or crashes could result.
> Is there a kernel/dev policy on such preemptive safeguards? Or should we
> address these only when the triggering scenarios materialize?
> 
> Your insight would help me align with the project’s practices.
> Thanks in advance!
> 

There is no firm policy here. We just have to make a judgment call in
these situations.

In general, we don't want to litter the code with a lot of conditionals
or BUG_ONs/WARN_ONs for cases that can really never happen, as that
might slow things down for little benefit, and it makes the code less
readable. OTOH, being proactive about catching errors is a good thing,
so if there is any chance that things could change in the future, it's
good to have a warning about it.

In this particular case, given that we have to hold a reference in
order to pass a pointer to the ctx in the first place, there is little
value in doing (e.g.) WARN_ON(!open_ctx), as that should really never
happen.



> Best regards,
> Lingfeng
> > 
> > 
> > > +	if (open_ctx)
> > > +		p->ctx = open_ctx;
> > > +	else
> > > +		goto out_free;
> > If we did decide to keep the error handling however, this would leak
> > l_ctx. That reference would also need to be put if open_ctx was NULL
> > here.
> > 
> > >   	p->arg.fh = NFS_FH(inode);
> > >   	p->arg.fl = &p->fl;
> > >   	p->arg.seqid = seqid;
> > >   	p->res.seqid = seqid;
> > >   	p->lsp = lsp;
> > > -	/* Ensure we don't close file until we're done freeing locks! */
> > > -	p->ctx = get_nfs_open_context(ctx);
> > >   	locks_init_lock(&p->fl);
> > >   	locks_copy_lock(&p->fl, fl);
> > >   	p->server = NFS_SERVER(inode);
> > > @@ -7100,6 +7103,9 @@ static struct nfs4_unlockdata *nfs4_alloc_unlockdata(struct file_lock *fl,
> > >   	nfs4_stateid_copy(&p->arg.stateid, &lsp->ls_stateid);
> > >   	spin_unlock(&state->state_lock);
> > >   	return p;
> > > +out_free:
> > > +	kfree(p);
> > > +	return NULL;
> > >   }
> > >   
> > >   static void nfs4_locku_release_calldata(void *data)
> > > @@ -7327,6 +7333,8 @@ static struct nfs4_lockdata *nfs4_alloc_lockdata(struct file_lock *fl,
> > >   	p->lsp = lsp;
> > >   	p->server = server;
> > >   	p->ctx = get_nfs_open_context(ctx);
> > > +	if (!p->ctx)
> > > +		goto out_free_seqid;
> > >   	locks_init_lock(&p->fl);
> > >   	locks_copy_lock(&p->fl, fl);
> > >   	return p;

-- 
Jeff Layton <jlayton@...nel.org>