[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y+oHvi5f+tDMPR31@hirez.programming.kicks-ass.net>
Date: Mon, 13 Feb 2023 10:49:50 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Alan Stern <stern@...land.harvard.edu>
Cc: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
syzkaller <syzkaller@...glegroups.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"Rafael J. Wysocki" <rafael@...nel.org>,
Ingo Molnar <mingo@...hat.com>,
Boqun Feng <boqun.feng@...il.com>,
LKML <linux-kernel@...r.kernel.org>,
USB list <linux-usb@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Hillf Danton <hdanton@...a.com>
Subject: Re: [PATCH RFC] drivers/core: Replace lockdep_set_novalidate_class()
with unique class keys
On Sat, Feb 11, 2023 at 04:41:11PM -0500, Alan Stern wrote:
> Lockdep is blind to the dev->mutex field of struct device, owing to
> the fact that these mutexes are assigned to lockdep's "novalidate"
> class. Commit 1704f47b50b5 ("lockdep: Add novalidate class for
> dev->mutex conversion") did this because the hierarchical nature of
> the device tree makes it impossible in practice to determine whether
> acquiring one of these mutexes is safe or might lead to a deadlock.
>
> Unfortunately, this means that lockdep is unable to help diagnose real
> deadlocks involving these mutexes when they occur in testing [1] [2]
> or in actual use, or to detect bad locking patterns that might lead to
> a deadlock. We would like to obtain as much of lockdep's benefits as
> possible without generating a flood of false positives -- which is
> what happens if one naively removes these mutexes from the
> "novalidate" class.
>
> Accordingly, as a middle ground the mutex in each non-static struct
> device will be placed in its own unique locking class. This approach
> gives up some of lockdep's advantages (for example, all devices having
> a particular bus_type or device_type might reasonably be put into the
> same locking class), but it should at least allow us to gain the
> benefit of some of lockdep's capabilities.
>
> Link: https://syzkaller.appspot.com/bug?extid=2d6ac90723742279e101 [1]
> Link: https://syzkaller.appspot.com/bug?extid=2e39bc6569d281acbcfb [2]
> Link: https://lore.kernel.org/all/28a82f50-39d5-a45f-7c7a-57a66cec0741@I-love.SAKURA.ne.jp/
> Suggested-by: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
> Signed-off-by: Alan Stern <stern@...land.harvard.edu>
> CC: Peter Zijlstra <peterz@...radead.org>
> CC: Ingo Molnar <mingo@...hat.com>
> CC: Boqun Feng <boqun.feng@...il.com>
My main worry when adding a ton of classes like this is the
combinatorics (dynamic classes make this more trivial, but it's entirely
possible with just static classes too).
As an example, we used to have a static class per cpu runqueue, now,
given migration takes two runqueue locks (per locking rules in cpu
order -- source and dest runqueue etc..) we got O(n^2) combinations of
class orderings, which lead to a giant graph, which led to both
performance and memory usage issues.
>From this was born the subclass, which reduced the whole thing to a
static ordering of runqueue-1 goes inside runqueue-0.
Similar combinatoric issues have cropped up in other places from time to
time, typically solved by using a different annotation.
Now, given the whole device thing is more or less a tree, hierarchical
locking should limit that, the only worry is that sibling locking while
holding parent thing. If siblings are of different classes, things will
both complain and create combinatorics again.
Powered by blists - more mailing lists