Date:   Tue, 10 Oct 2017 07:03:36 -0700
From:   "tj@...nel.org" <tj@...nel.org>
To:     Trond Myklebust <trondmy@...marydata.com>
Cc:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "lorenzo.pieralisi@....com" <lorenzo.pieralisi@....com>,
        "linux-nfs@...r.kernel.org" <linux-nfs@...r.kernel.org>,
        "jiangshanlai@...il.com" <jiangshanlai@...il.com>,
        "bfields@...ldses.org" <bfields@...ldses.org>,
        "anna.schumaker@...app.com" <anna.schumaker@...app.com>,
        "jlayton@...chiereds.net" <jlayton@...chiereds.net>
Subject: Re: net/sunrpc: v4.14-rc4 lockdep warning

Hello, Trond.

On Mon, Oct 09, 2017 at 06:32:13PM +0000, Trond Myklebust wrote:
> On Mon, 2017-10-09 at 19:17 +0100, Lorenzo Pieralisi wrote:
> > I have run into the lockdep warning below while running v4.14-rc3/rc4
> > on an ARM64 defconfig Juno dev board - reporting it to check whether
> > it is a known/genuine issue.
> > 
> > Please let me know if you need further debug data or need some
> > specific tests.
> > 
> > [    6.209384] ======================================================
> > [    6.215569] WARNING: possible circular locking dependency detected
> > [    6.221755] 4.14.0-rc4 #54 Not tainted
> > [    6.225503] ------------------------------------------------------
> > [    6.231689] kworker/4:0H/32 is trying to acquire lock:
> > [    6.236830]  ((&task->u.tk_work)){+.+.}, at: [<ffff0000080e64cc>] process_one_work+0x1cc/0x3f0
> > [    6.245472] 
> >                but task is already holding lock:
> > [    6.251309]  ("xprtiod"){+.+.}, at: [<ffff0000080e64cc>] process_one_work+0x1cc/0x3f0
> > [    6.259158] 
> >                which lock already depends on the new lock.
> > 
> > [    6.267345] 
> >                the existing dependency chain (in reverse order) is:
..
> Adding Tejun and Lai, since this looks like a workqueue locking issue.

The report looks a bit cryptic, but it's warning about the following case.

1. Memory pressure is high and the rescuer kicks in for the xprtiod
   workqueue.  There are no other kworkers serving the workqueue.

2. The rescuer runs the xprt_destroy path and ends up calling
   cancel_work_sync() on a work item which is queued on xprtiod.

3. The work item is pending on the same workqueue and, assuming that
   memory pressure doesn't let up (let's say reclaim is trying to
   kick out nfs pages), the only way it can get executed is by the
   rescuer, which is itself waiting for the work item - an A-B-A
   deadlock.
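
The shape of the three steps above can be sketched in userspace.  This
is only an analogy, not kernel code: a Python thread pool stands in for
the workqueue, its single worker stands in for the rescuer, and the
names simulate(), destroy_path() and pending_item() are made up for
illustration.  It models just the "sole worker waits on an item queued
behind itself" part and does not reproduce cancel_work_sync()
semantics.  A timeout is used so the demo reports the stall instead of
hanging.

```python
import concurrent.futures as cf


def simulate(workers=1, wait_timeout=0.5):
    """Userspace analogy of a worker waiting on an item queued on its own pool.

    workers=1 models "no other kworkers serving the workqueue".
    Returns "completed" if the inner item ran while we waited,
    or "would deadlock" if the wait timed out.
    """
    pool = cf.ThreadPoolExecutor(max_workers=workers)
    try:
        def pending_item():
            # Stands in for the work item pending on the same queue.
            return "ran"

        def destroy_path():
            # Running on a pool worker: queue an item on the same pool,
            # then block until it finishes.
            inner = pool.submit(pending_item)
            try:
                inner.result(timeout=wait_timeout)
                return "completed"
            except cf.TimeoutError:
                # The only worker is this thread, so nothing could run
                # the inner item while we were waiting for it.
                return "would deadlock"

        outer = pool.submit(destroy_path)
        return outer.result(timeout=wait_timeout + 5)
    finally:
        # After the timed-out wait returns, the worker is free again
        # and drains the queue, so shutdown does not hang.
        pool.shutdown(wait=True)


if __name__ == "__main__":
    print("one worker: ", simulate(workers=1))
    print("two workers:", simulate(workers=2))
```

With a single worker the wait can only time out; with a second worker
available the queued item gets picked up and the wait finishes, which
is why the scenario depends on the rescuer being the only thing
serving the workqueue.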

Thanks.

-- 
tejun
