lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <197b15028aa942d2812b1746aff453c4e791aa00.camel@kernel.org>
Date: Wed, 18 Jun 2025 07:50:33 -0400
From: Jeff Layton <jlayton@...nel.org>
To: chenxiaosong@...nxiaosong.com, chuck.lever@...cle.com, neilb@...e.de, 
	okorniev@...hat.com, Dai.Ngo@...cle.com, tom@...pey.com
Cc: linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org,
 huhai@...inos.cn,  ChenXiaoSong <chenxiaosong@...inos.cn>
Subject: Re: [RFC PATCH] nfsd: convert the nfsd_users to atomic_t

On Wed, 2025-06-18 at 18:41 +0800, chenxiaosong@...nxiaosong.com wrote:
> From: ChenXiaoSong <chenxiaosong@...inos.cn>
> 
> Before commit 38f080f3cd19 ("NFSD: Move callback_wq into struct nfs4_client"),
> we had a null-ptr-deref in nfsd4_probe_callback() (Link[1]):
> 
>  nfsd: last server has exited, flushing export cache
>  NFSD: starting 90-second grace period (net f0000030)
>  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
>  ...
>  Call trace:
>   __queue_work+0xb4/0x558
>   queue_work_on+0x88/0x90
>   nfsd4_probe_callback+0x4c/0x58 [nfsd]
>  NFSD: starting 90-second grace period (net f0000030)
>   nfsd4_probe_callback_sync+0x20/0x38 [nfsd]
>   nfsd4_init_conn.isra.57+0x8c/0xa8 [nfsd]
>   nfsd4_create_session+0x5b8/0x718 [nfsd]
>   nfsd4_proc_compound+0x4c0/0x710 [nfsd]
>   nfsd_dispatch+0x104/0x248 [nfsd]
>   svc_process_common+0x348/0x808 [sunrpc]
>   svc_process+0xb0/0xc8 [sunrpc]
>   nfsd+0xf0/0x160 [nfsd]
>   kthread+0x134/0x138
>   ret_from_fork+0x10/0x18
>  Code: aa1c03e0 97ffffba aa0003e2 b5000780 (f9400262)
>  SMP: stopping secondary CPUs
>  Starting crashdump kernel...
>  Bye!
> 
> One of the cases is:
> 
>     task A (cpu 1)    |   task B (cpu 2)     |   task C (cpu 3)
>  ---------------------|----------------------|---------------------------------
>  nfsd_startup_generic | nfsd_startup_generic |
>    nfsd_users == 0    |  nfsd_users == 0     |
>    nfsd_users++       |  nfsd_users++        |
>    nfsd_users == 1    |                      |
>    ...                |                      |
>    callback_wq == xxx |                      |
>  ---------------------|----------------------|---------------------------------
>                       |                      | nfsd_shutdown_generic
>                       |                      |   nfsd_users == 1
>                       |                      |   --nfsd_users
>                       |                      |   nfsd_users == 0
>                       |                      |   ...
>                       |                      |   callback_wq == xxx
>                       |                      |   destroy_workqueue(callback_wq)
>  ---------------------|----------------------|---------------------------------
>                       |  nfsd_users == 1     |
>                       |  ...                 |
>                       |  callback_wq == yyy  |
> 
> After commit 38f080f3cd19 ("NFSD: Move callback_wq into struct nfs4_client"),
> this issue no longer occurs, but we should still convert the nfsd_users
> to atomic_t to prevent other similar issues.
> 
> Link[1]: https://chenxiaosong.com/en/nfs/en-null-ptr-deref-in-nfsd4_probe_callback.html
> Co-developed-by: huhai <huhai@...inos.cn>
> Signed-off-by: huhai <huhai@...inos.cn>
> Signed-off-by: ChenXiaoSong <chenxiaosong@...inos.cn>
> ---
>  fs/nfsd/nfssvc.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> index 9b3d6cff0e1e..08b1f9ebdc2a 100644
> --- a/fs/nfsd/nfssvc.c
> +++ b/fs/nfsd/nfssvc.c
> @@ -270,13 +270,13 @@ static int nfsd_init_socks(struct net *net, const struct cred *cred)
>  	return 0;
>  }
>  
> -static int nfsd_users = 0;
> +static atomic_t nfsd_users = ATOMIC_INIT(0);
>  
>  static int nfsd_startup_generic(void)
>  {
>  	int ret;
>  
> -	if (nfsd_users++)
> +	if (atomic_fetch_inc(&nfsd_users))
>  		return 0;
>  
>  	ret = nfsd_file_cache_init();
> @@ -291,13 +291,13 @@ static int nfsd_startup_generic(void)
>  out_file_cache:
>  	nfsd_file_cache_shutdown();
>  dec_users:
> -	nfsd_users--;
> +	atomic_dec(&nfsd_users);
>  	return ret;
>  }
>  
>  static void nfsd_shutdown_generic(void)
>  {
> -	if (--nfsd_users)
> +	if (atomic_dec_return(&nfsd_users))
>  		return;
>  
>  	nfs4_state_shutdown();

Isn't nfsd_users protected by the nfsd_mutex? It looks like it's held
in all of the places this counter is accessed.

-- 
Jeff Layton <jlayton@...nel.org>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ