Message-ID: <917e1e3b-094f-e594-c1a2-8b97fb5195fd@I-love.SAKURA.ne.jp>
Date: Sun, 5 Feb 2023 02:09:40 +0900
From: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To: Alan Stern <stern@...land.harvard.edu>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"Rafael J. Wysocki" <rafael@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
USB list <linux-usb@...r.kernel.org>,
Hillf Danton <hdanton@...a.com>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: Converting dev->mutex into dev->spinlock ?

On 2023/02/05 1:27, Alan Stern wrote:
> On Sun, Feb 05, 2023 at 01:12:12AM +0900, Tetsuo Handa wrote:
>> On 2023/02/05 0:34, Alan Stern wrote:
>>>> A few examples:
>>>>
>>>> https://syzkaller.appspot.com/bug?extid=2d6ac90723742279e101
>>>
>>> It's hard to figure out what's wrong from looking at the syzbot report.
>>> What makes you think it is connected with dev->mutex?
>>>
>>> At first glance, it seems that the ath6kl driver is trying to flush a
>>> workqueue while holding a lock or mutex that is needed by one of the
>>> jobs in the workqueue. That's obviously never going to work, no matter
>>> what sort of lockdep validation gets used.
>>
>> That lock is exactly dev->mutex where lockdep validation is disabled.
>> If lockdep validation on dev->mutex were not disabled, we could catch the
>> possibility of a deadlock before khungtaskd reports a real deadlock as a
>> hung task.
>>
>> Lockdep validation on dev->mutex being disabled is really annoying, and
>> I want to enable lockdep validation on dev->mutex; that is the purpose of
>> the "drivers/core: Remove lockdep_set_novalidate_class() usage" patch.
>
>> Even if it is always safe to acquire a child device's lock while holding
>> the parent's lock, disabling lockdep checks completely on device's lock is
>> not safe.
>
> I understand the problem you want to solve, and I understand that it
> can be frustrating. However, I do not believe you will be able to
> solve this problem.

That amounts to declaring that driver developers may take it for granted that
driver callback functions can behave as if dev->mutex is not held.

Some developers test their changes with lockdep enabled and believe that their
changes are correct because lockdep did not complain.
https://syzkaller.appspot.com/bug?extid=9ef743bba3a17c756174 is an example.

We should somehow update the driver core code so that lockdep checks can stay
enabled on dev->mutex.
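
To make the pattern discussed in this thread concrete, here is a minimal
userspace sketch of flushing a queued job while holding the lock that the job
needs. This is not kernel code and not taken from any driver: struct model_lock,
work_fn(), flush_would_complete() and the callback_*() helpers are hypothetical
names made up for illustration, and a trylock stands in for the point where a
real mutex_lock() would block forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device lock such as dev->mutex. */
struct model_lock {
	bool held;
};

/* Take the lock if it is free; report failure instead of blocking. */
static bool model_trylock(struct model_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void model_unlock(struct model_lock *l)
{
	assert(l->held);	/* unlocking a free lock is a bug in the model */
	l->held = false;
}

/* A queued job that needs the device lock in order to finish. */
static bool work_fn(struct model_lock *dev_lock)
{
	if (!model_trylock(dev_lock))
		return false;	/* a real mutex_lock() would sleep here */
	model_unlock(dev_lock);
	return true;
}

/* Models a flush: the caller cannot proceed until the job has finished. */
static bool flush_would_complete(struct model_lock *dev_lock)
{
	return work_fn(dev_lock);
}

/* Callback that flushes while still holding the lock (the hung case). */
static bool callback_flush_under_lock(struct model_lock *dev_lock)
{
	bool done;

	if (!model_trylock(dev_lock))
		return false;
	done = flush_would_complete(dev_lock);	/* job cannot take the lock */
	model_unlock(dev_lock);
	return done;
}

/* Callback that drops the lock before flushing (the safe ordering). */
static bool callback_flush_after_unlock(struct model_lock *dev_lock)
{
	if (!model_trylock(dev_lock))
		return false;
	model_unlock(dev_lock);
	return flush_would_complete(dev_lock);
}
```

In this model, callback_flush_under_lock() returns false on a free lock because
the job can never acquire the lock, while callback_flush_after_unlock() returns
true. The first ordering is the kind of dependency I would like lockdep to be
able to flag on dev->mutex before khungtaskd reports a hung task.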