Message-ID: <CAHk-=wiEdcjdceNdfVGPEcbSmAKh_rjtBSy5_Z3Yyx2GFEgLFA@mail.gmail.com>
Date: Wed, 29 Mar 2023 15:52:53 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: LKML <linux-kernel@...r.kernel.org>,
"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
Zhang Rui <rui.zhang@...el.com>
Subject: Re: [BUG v6.3-rc4+] WARNING: CPU: 0 PID: 1 at drivers/thermal/thermal_sysfs.c:879 cooling_device_stats_setup+0xac/0xc0
On Wed, Mar 29, 2023 at 1:58 PM Steven Rostedt <rostedt@...dmis.org> wrote:
>
> In preparation to adding my patch that checks for some kinds of bugs in
> trace events, I decided to run it on the Linus's latest branch, to see if
> there's any other trace events that may cause issues. But instead I hit
> this unrelated bug. Looks to be triggering an added lockdep_assert() on
> boot up.
So I think that lockdep assert is likely bogus.
It was added in commit 790930f44289 ("thermal: core: Introduce
thermal_cooling_device_update()") but the reason I say it's bogus is
that I don't think it has ever been tested:
> static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
> {
>         lockdep_assert_held(&cdev->lock); <<<---- line 879
Yeah, so cooling_device_stats_setup() is called from two places:

 - thermal_cooling_device_setup_sysfs()
 - thermal_cooling_device_stats_reinit()

and that first place is when that cdev is created, before it's
registered anywhere. It's not locked in that case, and yes, the
lockdep_assert_held() will trigger.
As far as I can tell it will always trigger on that path, so this
lockdep_assert_held() has never been tested with lockdep enabled.
The "stats_reinit" case seems to also be called from only one place
(thermal_cooling_device_update()), and that path does indeed hold the
cdev->lock.
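
To make the two call paths concrete, here is a minimal userspace sketch. It is not the kernel code: struct cdev_model, the model_* functions, the lock_held flag and the assert_failures counter are all hypothetical stand-ins, and a pthread mutex with a boolean replaces the kernel mutex and lockdep's tracking. It only illustrates which path would satisfy the assertion.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct thermal_cooling_device. */
struct cdev_model {
	pthread_mutex_t lock;
	bool lock_held;		/* crude substitute for lockdep's held-lock tracking */
	int assert_failures;	/* how often lockdep would have warned */
};

static void model_lock(struct cdev_model *c)
{
	pthread_mutex_lock(&c->lock);
	c->lock_held = true;
}

static void model_unlock(struct cdev_model *c)
{
	assert(c->lock_held);	/* sanity check: only unlock what we locked */
	c->lock_held = false;
	pthread_mutex_unlock(&c->lock);
}

/* Models cooling_device_stats_setup(): counts a failure instead of WARNing. */
static void model_stats_setup(struct cdev_model *c)
{
	if (!c->lock_held)
		c->assert_failures++;
}

/* Models the creation path: the cdev is fresh and nobody holds its lock. */
static void model_setup_sysfs(struct cdev_model *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->lock_held = false;
	c->assert_failures = 0;
	model_stats_setup(c);
}

/* Models the update path: the caller holds the lock around the reinit. */
static void model_update(struct cdev_model *c)
{
	model_lock(c);
	model_stats_setup(c);
	model_unlock(c);
}
```

Calling model_setup_sysfs() on a fresh struct bumps assert_failures once, while a later model_update() leaves the counter unchanged, which mirrors the creation path tripping the assertion and the update path passing it.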
That lockdep could be made happy by having
thermal_cooling_device_setup_sysfs() create that device with the cdev
lock held. I guess that's easy enough, although somewhat annoyingly
there is no "mutex_init_locked()", you have to actually do
"mutex_init()" followed by a "mutex_lock()". And obviously unlock it
after doing the setup_sysfs().
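
The init-then-lock ordering described above can be sketched in userspace as follows. This is only an illustration under stated assumptions, not a proposed patch: struct cdev_sketch and the sketch_* functions are hypothetical, pthread_mutex_init() and pthread_mutex_lock() stand in for mutex_init() and mutex_lock(), and a plain assert() on a boolean stands in for lockdep_assert_held().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the cooling device. */
struct cdev_sketch {
	pthread_mutex_t lock;
	bool lock_held;		/* stand-in for lockdep's view of the lock */
	bool stats_ready;
};

/* Stand-in for cooling_device_stats_setup() with its assertion. */
static void sketch_stats_setup(struct cdev_sketch *c)
{
	assert(c->lock_held);	/* plays the role of lockdep_assert_held() */
	c->stats_ready = true;
}

/*
 * Stand-in for the creation path with the lock taken around setup.
 * There is no "init locked" helper, so it is init followed by lock.
 */
static void sketch_setup_sysfs_locked(struct cdev_sketch *c)
{
	c->stats_ready = false;

	pthread_mutex_init(&c->lock, NULL);
	pthread_mutex_lock(&c->lock);
	c->lock_held = true;

	sketch_stats_setup(c);

	c->lock_held = false;
	pthread_mutex_unlock(&c->lock);
}
```

With this ordering the assertion in the setup function holds on the creation path too, and the lock is released again before the object is used by anyone else.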
But I question whether the lockdep test should be done at all. I find
it distasteful that it was added with absolutely zero testing.
Linus