lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20160916202404.GE26371@htj.duckdns.org>
Date:   Fri, 16 Sep 2016 16:24:04 -0400
From:   Tejun Heo <tj@...nel.org>
To:     Jiri Slaby <jslaby@...e.cz>
Cc:     Dmitry Vyukov <dvyukov@...gle.com>,
        Marcel Holtmann <marcel@...tmann.org>,
        Gustavo Padovan <gustavo@...ovan.org>,
        Johan Hedberg <johan.hedberg@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        linux-bluetooth <linux-bluetooth@...r.kernel.org>,
        netdev <netdev@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        syzkaller <syzkaller@...glegroups.com>,
        Kostya Serebryany <kcc@...gle.com>,
        Alexander Potapenko <glider@...gle.com>,
        Eric Dumazet <edumazet@...gle.com>,
        Takashi Iwai <tiwai@...e.com>
Subject: Re: net/bluetooth: workqueue destruction WARNING in
 hci_unregister_dev

Hello,

On Tue, Sep 13, 2016 at 08:14:40PM +0200, Jiri Slaby wrote:
> I assume Dmitry sees the same what I am still seeing, so I reported this
> some time ago:
> https://lkml.org/lkml/2016/3/21/492
> 
> This warning is trigerred there and still occurs with "HEAD":
>   (pwq != wq->dfl_pwq) && (pwq->refcnt > 1)
> and the state dump is in the log empty too:
> destroy_workqueue: name='hci0' pwq=ffff88006b5c8f00
> wq->dfl_pwq=ffff88006b5c9b00 pwq->refcnt=2 pwq->nr_active=0 delayed_works:
>   pwq 13:
>  cpus=2-3 node=1 flags=0x4 nice=-20 active=0/1
>     in-flight: 2669:wq_barrier_func

Hmmm... I think it could be from rescuer holding reference on the pwq.
Both cases have WQ_MEM_RECLAIM and it could be that rescuer was still
in flight (even without work items pending) when the sanity checks
were done.  The following patch moves the sanity checks after rescuer
destruction.  Dmitry, Jiri, can you please see whether the warning
goes away with this patch?

Thanks.

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 984f6ff..e8046a1 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4042,8 +4042,7 @@ void destroy_workqueue(struct workqueue_struct *wq)
 			}
 		}
 
-		if (WARN_ON((pwq != wq->dfl_pwq) && (pwq->refcnt > 1)) ||
-		    WARN_ON(pwq->nr_active) ||
+		if (WARN_ON(pwq->nr_active) ||
 		    WARN_ON(!list_empty(&pwq->delayed_works))) {
 			mutex_unlock(&wq->mutex);
 			show_workqueue_state();
@@ -4080,6 +4079,7 @@ void destroy_workqueue(struct workqueue_struct *wq)
 		for_each_node(node) {
 			pwq = rcu_access_pointer(wq->numa_pwq_tbl[node]);
 			RCU_INIT_POINTER(wq->numa_pwq_tbl[node], NULL);
+			WARN_ON((pwq != wq->dfl_pwq) && (pwq->refcnt != 1));
 			put_pwq_unlocked(pwq);
 		}
 
@@ -4089,6 +4089,7 @@ void destroy_workqueue(struct workqueue_struct *wq)
 		 */
 		pwq = wq->dfl_pwq;
 		wq->dfl_pwq = NULL;
+		WARN_ON(pwq->refcnt != 1);
 		put_pwq_unlocked(pwq);
 	}
 }

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ