[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <197b15028aa942d2812b1746aff453c4e791aa00.camel@kernel.org>
Date: Wed, 18 Jun 2025 07:50:33 -0400
From: Jeff Layton <jlayton@...nel.org>
To: chenxiaosong@...nxiaosong.com, chuck.lever@...cle.com, neilb@...e.de,
okorniev@...hat.com, Dai.Ngo@...cle.com, tom@...pey.com
Cc: linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org,
huhai@...inos.cn, ChenXiaoSong <chenxiaosong@...inos.cn>
Subject: Re: [RFC PATCH] nfsd: convert the nfsd_users to atomic_t
On Wed, 2025-06-18 at 18:41 +0800, chenxiaosong@...nxiaosong.com wrote:
> From: ChenXiaoSong <chenxiaosong@...inos.cn>
>
> Before commit 38f080f3cd19 ("NFSD: Move callback_wq into struct nfs4_client"),
> we had a null-ptr-deref in nfsd4_probe_callback() (Link[1]):
>
> nfsd: last server has exited, flushing export cache
> NFSD: starting 90-second grace period (net f0000030)
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> ...
> Call trace:
> __queue_work+0xb4/0x558
> queue_work_on+0x88/0x90
> nfsd4_probe_callback+0x4c/0x58 [nfsd]
> NFSD: starting 90-second grace period (net f0000030)
> nfsd4_probe_callback_sync+0x20/0x38 [nfsd]
> nfsd4_init_conn.isra.57+0x8c/0xa8 [nfsd]
> nfsd4_create_session+0x5b8/0x718 [nfsd]
> nfsd4_proc_compound+0x4c0/0x710 [nfsd]
> nfsd_dispatch+0x104/0x248 [nfsd]
> svc_process_common+0x348/0x808 [sunrpc]
> svc_process+0xb0/0xc8 [sunrpc]
> nfsd+0xf0/0x160 [nfsd]
> kthread+0x134/0x138
> ret_from_fork+0x10/0x18
> Code: aa1c03e0 97ffffba aa0003e2 b5000780 (f9400262)
> SMP: stopping secondary CPUs
> Starting crashdump kernel...
> Bye!
>
> One of the cases is:
>
> task A (cpu 1) | task B (cpu 2) | task C (cpu 3)
> ---------------------|----------------------|---------------------------------
> nfsd_startup_generic | nfsd_startup_generic |
> nfsd_users == 0 | nfsd_users == 0 |
> nfsd_users++ | nfsd_users++ |
> nfsd_users == 1 | |
> ... | |
> callback_wq == xxx | |
> ---------------------|----------------------|---------------------------------
> | | nfsd_shutdown_generic
> | | nfsd_users == 1
> | | --nfsd_users
> | | nfsd_users == 0
> | | ...
> | | callback_wq == xxx
> | | destroy_workqueue(callback_wq)
> ---------------------|----------------------|---------------------------------
> | nfsd_users == 1 |
> | ... |
> | callback_wq == yyy |
>
> After commit 38f080f3cd19 ("NFSD: Move callback_wq into struct nfs4_client"),
> this issue no longer occurs, but we should still convert the nfsd_users
> to atomic_t to prevent other similar issues.
>
> Link[1]: https://chenxiaosong.com/en/nfs/en-null-ptr-deref-in-nfsd4_probe_callback.html
> Co-developed-by: huhai <huhai@...inos.cn>
> Signed-off-by: huhai <huhai@...inos.cn>
> Signed-off-by: ChenXiaoSong <chenxiaosong@...inos.cn>
> ---
> fs/nfsd/nfssvc.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> index 9b3d6cff0e1e..08b1f9ebdc2a 100644
> --- a/fs/nfsd/nfssvc.c
> +++ b/fs/nfsd/nfssvc.c
> @@ -270,13 +270,13 @@ static int nfsd_init_socks(struct net *net, const struct cred *cred)
> return 0;
> }
>
> -static int nfsd_users = 0;
> +static atomic_t nfsd_users = ATOMIC_INIT(0);
>
> static int nfsd_startup_generic(void)
> {
> int ret;
>
> - if (nfsd_users++)
> + if (atomic_fetch_inc(&nfsd_users))
> return 0;
>
> ret = nfsd_file_cache_init();
> @@ -291,13 +291,13 @@ static int nfsd_startup_generic(void)
> out_file_cache:
> nfsd_file_cache_shutdown();
> dec_users:
> - nfsd_users--;
> + atomic_dec(&nfsd_users);
> return ret;
> }
>
> static void nfsd_shutdown_generic(void)
> {
> - if (--nfsd_users)
> + if (atomic_dec_return(&nfsd_users))
> return;
>
> nfs4_state_shutdown();
Isn't nfsd_users protected by the nfsd_mutex? It looks like it's held
in all of the places this counter is accessed.
--
Jeff Layton <jlayton@...nel.org>
Powered by blists - more mailing lists