Message-ID: <CACT4Y+b+9Ubsbfs58nHAraG+7iXRBkrn=XyCTPDJLike7w-mjQ@mail.gmail.com>
Date: Tue, 22 Mar 2016 13:32:09 +0100
From: Dmitry Vyukov <dvyukov@...gle.com>
To: Jiri Slaby <jslaby@...e.cz>
Cc: Tejun Heo <tj@...nel.org>, Marcel Holtmann <marcel@...tmann.org>,
	Gustavo Padovan <gustavo@...ovan.org>,
	Johan Hedberg <johan.hedberg@...il.com>,
	"David S. Miller" <davem@...emloft.net>,
	linux-bluetooth <linux-bluetooth@...r.kernel.org>,
	netdev <netdev@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	syzkaller <syzkaller@...glegroups.com>,
	Kostya Serebryany <kcc@...gle.com>,
	Alexander Potapenko <glider@...gle.com>,
	Sasha Levin <sasha.levin@...cle.com>,
	Eric Dumazet <edumazet@...gle.com>, Takashi Iwai <tiwai@...e.com>
Subject: Re: net/bluetooth: workqueue destruction WARNING in hci_unregister_dev

On Tue, Mar 22, 2016 at 9:09 AM, Jiri Slaby <jslaby@...e.cz> wrote:
> On 03/21/2016, 04:58 PM, Jiri Slaby wrote:
>> Hello,
>>
>> On 03/18/2016, 09:52 PM, Tejun Heo wrote:
>>> On Thu, Mar 17, 2016 at 01:00:13PM +0100, Jiri Slaby wrote:
>>>>>> I have not done that yet, but today, I see:
>>>>>> destroy_workqueue: name='req_hci0' pwq=ffff88002f590300
>>>>>> wq->dfl_pwq=ffff88002f591e00 pwq->refcnt=2 pwq->nr_active=0 delayed_works:
>>>>>>    pwq 12: cpus=0-1 node=0 flags=0x4 nice=-20 active=0/1
>>>>>>      in-flight: 18568:wq_barrier_func
>>>>>
>>>>> So, this means that there's flush_work() racing against workqueue
>>>>> destruction, which can't be safe. :(
>>>>
>>>> But I cannot trigger the WARN_ONs in the attached patch, so I am
>>>> confused how this can happen :(. (While I am still seeing the destroy
>>>> WARNINGs.)
>>>
>>> So, no operations should be in progress when destroy_workqueue() is
>>> called.  If somebody was flushing a work item, the flush call must
>>> have returned before destroy_workqueue() was invoked, which doesn't
>>> seem to be the case here.  Can you trigger BUG_ON() or sysrq-t when
>>> the above triggers?  There must be a task which is flushing a work
>>> item there and it shouldn't be difficult to pinpoint what's going on
>>> from it.
>>
>> The output of sysrq-t is here (> 200k), but I cannot see anything
>> suspicious in it:
>> http://www.fi.muni.cz/~xslaby/sklad/panics/jctl.txt
>
> Hmm, so it seems I cannot reproduce with this hunk:
> --- a/net/bluetooth/hci_core.c
> +++ b/net/bluetooth/hci_core.c
> @@ -3139,10 +3139,10 @@ void hci_unregister_dev(struct hci_dev *hdev)
>  	list_del(&hdev->list);
>  	write_unlock(&hci_dev_list_lock);
>
> -	hci_dev_do_close(hdev);
> -
>  	cancel_work_sync(&hdev->power_on);
>
> +	hci_dev_do_close(hdev);
> +
>  	if (!test_bit(HCI_INIT, &hdev->flags) &&
>  	    !hci_dev_test_flag(hdev, HCI_SETUP) &&
>  	    !hci_dev_test_flag(hdev, HCI_CONFIG)) {
>
>
>
> I cannot explain why though. I do not see how it matters in this
> particular case...
>
> Dmitry, could you apply it too? But I don't know how often you see the
> warning.

I've seen it only several times in several months, so I don't think it
will be helpful.