Message-Id: <4E2EB304-EE09-424C-9939-48A9BE1C539A@oracle.com>
Date:   Tue, 1 Dec 2020 11:07:17 -0500
From:   Chuck Lever <chuck.lever@...cle.com>
To:     Yi Wang <wang.yi59@....com.cn>
Cc:     Trond Myklebust <trond.myklebust@...merspace.com>,
        Anna Schumaker <anna.schumaker@...app.com>,
        Bruce Fields <bfields@...ldses.org>,
        Linux NFS Mailing List <linux-nfs@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        xue.zhihong@....com.cn, wang.liang82@....com.cn,
        Cheng Lin <cheng.lin130@....com.cn>
Subject: Re: [PATCH] nfs_common: need lock during iterate through the list

Hello!

> On Dec 1, 2020, at 7:06 AM, Yi Wang <wang.yi59@....com.cn> wrote:
> 
> From: Cheng Lin <cheng.lin130@....com.cn> 
> 
> If an element is deleted from the list while the list is being
> iterated, the iteration can fall into an endless loop.
> 
> kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [nfsd:17137]
> 
> PID: 17137  TASK: ffff8818d93c0000  CPU: 4   COMMAND: "nfsd" 
>     [exception RIP: __state_in_grace+76]
>     RIP: ffffffffc00e817c  RSP: ffff8818d3aefc98  RFLAGS: 00000246
>     RAX: ffff881dc0c38298  RBX: ffffffff81b03580  RCX: ffff881dc02c9f50
>     RDX: ffff881e3fce8500  RSI: 0000000000000001  RDI: ffffffff81b03580
>     RBP: ffff8818d3aefca0   R8: 0000000000000020   R9: ffff8818d3aefd40
>     R10: ffff88017fc03800  R11: ffff8818e83933c0  R12: ffff8818d3aefd40
>     R13: 0000000000000000  R14: ffff8818e8391068  R15: ffff8818fa6e4000
>     CS: 0010  SS: 0018
>  #0 [ffff8818d3aefc98] opens_in_grace at ffffffffc00e81e3 [grace]
>  #1 [ffff8818d3aefca8] nfs4_preprocess_stateid_op at ffffffffc02a3e6c [nfsd]
>  #2 [ffff8818d3aefd18] nfsd4_write at ffffffffc028ed5b [nfsd]
>  #3 [ffff8818d3aefd80] nfsd4_proc_compound at ffffffffc0290a0d [nfsd]
>  #4 [ffff8818d3aefdd0] nfsd_dispatch at ffffffffc027b800 [nfsd]
>  #5 [ffff8818d3aefe08] svc_process_common at ffffffffc02017f3 [sunrpc]
>  #6 [ffff8818d3aefe70] svc_process at ffffffffc0201ce3 [sunrpc]
>  #7 [ffff8818d3aefe98] nfsd at ffffffffc027b117 [nfsd]
>  #8 [ffff8818d3aefec8] kthread at ffffffff810b88c1
>  #9 [ffff8818d3aeff50] ret_from_fork at ffffffff816d1607
> 
> The troublesome element:
> crash> lock_manager ffff881dc0c38298
> struct lock_manager {
>   list = {
>     next = 0xffff881dc0c38298,
>     prev = 0xffff881dc0c38298
>   },
>   block_opens = false
> }
> 
> Signed-off-by: Cheng Lin <cheng.lin130@....com.cn> 
> Signed-off-by: Yi Wang <wang.yi59@....com.cn> 
> ---
>  fs/nfs_common/grace.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/nfs_common/grace.c b/fs/nfs_common/grace.c
> index b73d9dd37..26f2a50ec 100644
> --- a/fs/nfs_common/grace.c
> +++ b/fs/nfs_common/grace.c
> @@ -69,10 +69,14 @@ __state_in_grace(struct net *net, bool open)
>      if (!open)
>          return !list_empty(grace_list);
>   
> +    spin_lock(&grace_lock);
>      list_for_each_entry(lm, grace_list, list) {
> -        if (lm->block_opens)
> +        if (lm->block_opens) {
> +            spin_unlock(&grace_lock);
>              return true;
> +        }
>      }
> +    spin_unlock(&grace_lock);
>      return false;
>  }
>   
> --
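
For readers following the thread, here is a minimal userspace sketch of the
failure mode and of the locking pattern in the quoted patch. It rests on
several assumptions: a pthread mutex stands in for the kernel spinlock, the
list helpers are hand-written stand-ins for the kernel's <linux/list.h>, and
grace_start(), grace_end(), opens_blocked() and walk_reaches_head() are
hypothetical names that do not appear in fs/nfs_common/grace.c. The
lock_manager dump above shows next and prev both pointing at the node itself,
which is the state list_del_init() leaves an entry in; a walker already
standing on such a node never gets back to the list head.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's circular, doubly linked struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Userspace mirror of the structure shown in the crash dump. */
struct lock_manager {
    struct list_head list;
    bool block_opens;
};

static struct list_head grace_list = { &grace_list, &grace_list };
/* Assumption: a mutex replaces the kernel spinlock in this sketch. */
static pthread_mutex_t grace_lock = PTHREAD_MUTEX_INITIALIZER;

static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink the entry and leave it pointing at itself. */
static void list_del_init(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = entry;
    entry->prev = entry;
}

/* Hypothetical writer paths: both modify the list under the lock. */
void grace_start(struct lock_manager *lm)
{
    assert(lm != NULL);
    pthread_mutex_lock(&grace_lock);
    list_add(&lm->list, &grace_list);
    pthread_mutex_unlock(&grace_lock);
}

void grace_end(struct lock_manager *lm)
{
    assert(lm != NULL);
    pthread_mutex_lock(&grace_lock);
    list_del_init(&lm->list);
    pthread_mutex_unlock(&grace_lock);
}

/* Reader path: hold the lock for the whole walk, with a single unlock. */
bool opens_blocked(void)
{
    bool blocked = false;
    struct list_head *pos;

    pthread_mutex_lock(&grace_lock);
    for (pos = grace_list.next; pos != &grace_list; pos = pos->next) {
        struct lock_manager *lm = (struct lock_manager *)
            ((char *)pos - offsetof(struct lock_manager, list));
        if (lm->block_opens) {
            blocked = true;
            break;
        }
    }
    pthread_mutex_unlock(&grace_lock);
    return blocked;
}

/*
 * Demonstration only: follow next pointers from a stale cursor without
 * the lock, giving up after max_steps. An unlinked, self-linked node
 * never leads back to the head, which is the endless loop in the report.
 */
bool walk_reaches_head(const struct list_head *pos, int max_steps)
{
    while (max_steps-- > 0) {
        if (pos == &grace_list)
            return true;
        pos = pos->next;
    }
    return false;
}
```

Calling grace_end() on an entry and then walk_reaches_head() on that same
entry returns false, while opens_blocked() still terminates because removal
cannot happen in the middle of its locked walk. Using a result flag with one
unlock is a stylistic choice for this sketch; the quoted patch unlocks before
each return instead.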

This looks most closely related to NFSD, so I've applied it to
my NFSD tree for the next merge window. I've also added this tag:

Fixes: c87fb4a378f9 ("lockd: NLM grace period shouldn't block NFSv4 opens")

You can find it in the cel-next topic branch in my kernel repo:

git://git.linux-nfs.org/projects/cel/cel-2.6.git

Incidentally, the e-mail encoding mangled the white space and I
don't see the e-mail showing up on lore.kernel.org. I applied it
by hand since it was small, but this should be addressed for
future patches so our patch handling infrastructure can deal
properly with your submissions. Thanks!


--
Chuck Lever


