Date:	Mon, 2 Apr 2012 10:44:43 +0200
From:	David Herrmann <dh.herrmann@...glemail.com>
To:	Alexander Holler <holler@...oftware.de>
Cc:	Andrei Emeltchenko <andrei.emeltchenko.news@...il.com>,
	linux-bluetooth@...r.kernel.org, linux-kernel@...r.kernel.org,
	"Gustavo F. Padovan" <padovan@...fusion.mobi>
Subject: Re: bluetooth: fix deadlock on device reset and power down

Hi Andrei and Alexander

On Mon, Apr 2, 2012 at 10:29 AM, Alexander Holler <holler@...oftware.de> wrote:
> On 02.04.2012 08:55, Andrei Emeltchenko wrote:
>> Hi Alexander,
>>
>> On Sat, Mar 31, 2012 at 03:23:38PM +0200, Alexander Holler wrote:
>>> I've experienced a deadlock on shutdown using kernel 3.3 and tracked
>>> it down. Because I'm not very familiar with the bluetooth stack I'm
>>> not sure if the below patch is correct, but it fixed the problem
>>> here.
>>
>> Could you please attach a deadlock dump?
>>
>>>
>>> Commit 09fd0de5bd8f8ef3317e5365f92f1a13dcd89aa9 introduced a deadlock:
>>>
>>> bluetoothd calls ioctl HCIDEVDOWN
>>>     hci_sock_ioctl()
>>>         hci_dev_close()
>>>             hci_dev_do_close()
>>>                 hci_dev_lock(hdev);
>>>                 inquiry_cache_flush();
>>>                 hci_conn_hash_flush();
>>>                     hci_conn_del()
>>>                         cancel_delayed_work_sync()
>>>                             hci_conn_timeout()
>>>                                 hci_dev_lock(hdev); /* DEADLOCK */
>>
>> I am actually not sure that hci_conn_timeout locks hdev. Why do you think
>> so?
>
> By reading the source, printk and suffering through the deadlock. It's
> especially painful when using a bt-keyboard and systemd, because
> systemd tries 4 times (~ some minutes) to kill bluetoothd before it
> marks the service as failed and finally continues to shut down.

hci_conn_timeout does lock the device. See the source. But the problem
here is actually a race-condition, too. The do_close() code locks the
device and then cancels all workqueues in a synchronous manner.
However, the hci_conn_timeout work might get started just before
calling cancel_delayed_work_sync(). The proper fix would probably be
releasing the lock before calling "cancel_delayed_work_sync()".
However, then we need to make sure that the work is not restarted
while we do not have the lock.
I think we recently introduced some flag that is set while closing a
device. How about checking that in hci_conn_timeout before acquiring
the lock?
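
To make the ordering concrete, here is a minimal user-space sketch of
that idea. It uses pthreads, not the kernel workqueue API, and every
name in it (fake_hdev, closing, fake_conn_timeout, fake_dev_do_close,
run_close_demo) is hypothetical. pthread_join() only stands in for
cancel_delayed_work_sync(). This is not a patch, just an illustration
of "set a flag, wait for the work without holding the lock, then lock
and flush":

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct hci_dev; not the kernel structure. */
struct fake_hdev {
	pthread_mutex_t lock;   /* plays the role of hci_dev_lock() */
	atomic_bool closing;    /* assumed "device is closing" flag */
	int conn_count;         /* connections still in the hash */
	int timeouts_run;       /* how often the timeout did real work */
};

/* Stand-in for the hci_conn_timeout() work item. */
static void *fake_conn_timeout(void *arg)
{
	struct fake_hdev *hdev = arg;

	/* Check the flag before touching the lock: bail out on close. */
	if (atomic_load(&hdev->closing))
		return NULL;

	pthread_mutex_lock(&hdev->lock);
	/* Re-check under the lock, the flag may have been set meanwhile. */
	if (!atomic_load(&hdev->closing))
		hdev->timeouts_run++;
	pthread_mutex_unlock(&hdev->lock);
	return NULL;
}

/* Stand-in for hci_dev_do_close(). */
static void fake_dev_do_close(struct fake_hdev *hdev, pthread_t work)
{
	/* 1. Mark the device as closing so the work stops doing anything. */
	atomic_store(&hdev->closing, true);

	/* 2. Wait for the work WITHOUT holding the lock. Waiting here
	 *    with the lock held is the deadlock from the call trace. */
	pthread_join(work, NULL);

	/* 3. Only now take the lock and flush the connections. */
	pthread_mutex_lock(&hdev->lock);
	hdev->conn_count = 0;
	pthread_mutex_unlock(&hdev->lock);
}

/* Runs one close against one pending timeout; returns the number of
 * connections left afterwards, or -1 if the thread could not start. */
int run_close_demo(void)
{
	struct fake_hdev hdev = { .conn_count = 1 };
	pthread_t work;

	pthread_mutex_init(&hdev.lock, NULL);
	atomic_init(&hdev.closing, false);

	if (pthread_create(&work, NULL, fake_conn_timeout, &hdev) != 0)
		return -1;

	fake_dev_do_close(&hdev, work);
	pthread_mutex_destroy(&hdev.lock);
	return hdev.conn_count;
}
```

Whether the timeout gets to run before the flag is set depends on
scheduling, but in both cases the close path finishes, because it never
waits for the work while holding the lock. Call run_close_demo() from a
small main() and build with -pthread to try it.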

> Just try to kill bluetoothd while a bt-mouse or bt-keyboard is connected.

Reproducible, indeed.

> But I have to admit that my patch is likely the wrong solution, as I
> think it will introduce some race conditions. Anyway, I prefer to live
> with them (the race conditions) instead of the deadlock. So for
> inclusion into the kernel a proper solution is needed.
> But as I already said, I'm not familiar with the bt-stack and don't
> know about the locking strategies inside the stack, so it's hard for
> me to find my way through the source.

Yes, your fix introduces races. We need to hold the lock there!
Applying your fix would introduce harder-to-trace bugs even during
runtime, so we need to fix this properly.

> Regards,
>
> Alexander

Thanks
David