Message-ID: <3ba7a573-9343-4ed4-b805-1fb8db8df2c6@gmx.de>
Date: Sat, 15 Nov 2025 21:07:24 +0100
From: Armin Wolf <W_Armin@....de>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: Daniel Lezcano <daniel.lezcano@...aro.org>,
Zhang Rui <rui.zhang@...el.com>, Lukasz Luba <lukasz.luba@....com>,
Hans de Goede <hansg@...nel.org>,
Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-acpi@...r.kernel.org
Subject: Re: [PATCH RFC 0/8] thermal: core: Allow setting the parent device of
thermal zone/cooling devices
On 14.11.25 at 21:10, Rafael J. Wysocki wrote:
> CC list trimmed and I'd rather not use such an extensive one if I were you.
>
> On Fri, Nov 14, 2025 at 1:13 PM Rafael J. Wysocki <rafael@...nel.org> wrote:
>> On Fri, Nov 14, 2025 at 4:24 AM Armin Wolf <W_Armin@....de> wrote:
>>> Drivers registering thermal zone/cooling devices are currently unable
>>> to tell the thermal core what parent device the new thermal zone/
>>> cooling device should have, potentially causing issues with suspend
>>> ordering
>> Do you have any examples of this?
> Especially for thermal zones.
The device core suspends child devices before parent devices in order to avoid
child devices accessing an already suspended parent device. Since thermal zone
and cooling devices have no parent, they could potentially be suspended after the
device they physically belong to.
I said "potentially" because currently the thermal subsystem handles suspend/resume
using a PM notifier, something that prevents the above problem from occurring. We
should however eventually migrate to dev_pm_ops for that, so the device core needs
to know about parent-child dependencies between thermal zone/cooling devices and their
respective parent devices.
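
To illustrate the ordering rule, here is a minimal userspace sketch (not kernel
code, and not how the device core is actually implemented). struct model_dev,
model_add_child() and model_suspend_order() are hypothetical names that exist
only for this example; the walk simply emits every child before its parent.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_CHILDREN 4

/* Hypothetical stand-in for struct device; not a kernel type. */
struct model_dev {
	const char *name;
	struct model_dev *children[MODEL_MAX_CHILDREN];
	size_t nr_children;
};

/* Link child under parent; returns 0 on success, -1 if parent is full. */
int model_add_child(struct model_dev *parent, struct model_dev *child)
{
	assert(parent != NULL && child != NULL);
	if (parent->nr_children >= MODEL_MAX_CHILDREN)
		return -1;
	parent->children[parent->nr_children++] = child;
	return 0;
}

/*
 * Post-order walk: every child is appended to out[] before its parent,
 * modelling the "children suspend first" rule. Returns the new count.
 */
size_t model_suspend_order(const struct model_dev *dev,
			   const char **out, size_t count, size_t max)
{
	size_t i;

	assert(dev != NULL);
	for (i = 0; i < dev->nr_children; i++)
		count = model_suspend_order(dev->children[i], out, count, max);
	if (count < max)
		out[count++] = dev->name;
	return count;
}
```

A device that was never linked under a parent does not show up in the walk
starting at that parent, so nothing constrains when it gets suspended relative
to it. That is the situation thermal zone and cooling devices are in today.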
>>> and making it impossible for user space applications to
>>> associate a given thermal zone device with its parent device.
>>>
>>> This patch series aims to fix this issue by extending the functions
>>> used to register thermal zone/cooling devices to also accept a parent
>>> device pointer. The first six patches convert all functions used for
>>> registering cooling devices, while the functions used for registering
>>> thermal zone devices are converted by the remaining two patches.
>>>
>>> I tested this series on various devices containing (among others):
>>> - ACPI thermal zones
>>> - ACPI processor devices
>>> - PCIe cooling devices
>>> - Intel Wifi card
>>> - Intel powerclamp
>>> - Intel TCC cooling
>>>
>>> I also compile-tested the remaining affected drivers, however I would
>>> still be happy if the relevant maintainers (especially those of the
>>> mellanox ethernet switch driver) could take a quick glance at the
>>> code and verify that I am using the correct device as the parent
>>> device.
>>>
>>> This work is also necessary for extending the ACPI thermal zone driver
>>> to support the _TZD ACPI object in the future.
>> Can you please elaborate a bit here?
>>
>> _TZD is a list of devices that belong to the given thermal zone, so
>> how is it connected to the thermal zone parent?
The ACPI thermal zone driver currently matches cooling devices by accessing their
private drvdata and checking if it is a pointer to the correct ACPI device. This
works well enough for ACPI fans and processors, but will likely not work for other
cooling devices (like batteries). Such cooling devices are supposed to be listed
by the _TZD ACPI object, so we need a more generic matching algorithm before adding
support for said ACPI object.
I was thinking of modifying the ACPI thermal zone driver to instead use the ACPI handle
of the parent device for matching cooling devices. This would solve the problem described
above.
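
A rough userspace sketch of the matching idea, under the assumption that each
cooling device has a parent pointer and that the parent carries an opaque
firmware handle. sketch_handle, struct sketch_parent, struct sketch_cdev and
sketch_cdev_matches() are hypothetical names for illustration only; in the
kernel the handle lookup would presumably be something along the lines of
ACPI_HANDLE(cdev->device.parent), but that is not part of this series yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opaque handle, standing in for an ACPI handle. */
typedef void *sketch_handle;

/* Hypothetical parent device carrying a firmware handle (NULL if none). */
struct sketch_parent {
	sketch_handle handle;
};

/* Hypothetical cooling device; parent is NULL when none was set. */
struct sketch_cdev {
	const char *type;
	struct sketch_parent *parent;
};

/*
 * Return true if the handle of cdev's parent appears in the list of
 * handles associated with the thermal zone. No driver-private data is
 * inspected, so the check does not depend on the cooling device type.
 */
bool sketch_cdev_matches(const struct sketch_cdev *cdev,
			 const sketch_handle *handles, size_t nr_handles)
{
	size_t i;

	assert(cdev != NULL);
	if (!cdev->parent || !cdev->parent->handle)
		return false;
	for (i = 0; i < nr_handles; i++) {
		if (handles[i] == cdev->parent->handle)
			return true;
	}
	return false;
}
```

The point of the sketch is only that the comparison goes through the parent
device instead of the drvdata, which is why the parent pointer needs to be
populated first.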
>>> Signed-off-by: Armin Wolf <W_Armin@....de>
>>> ---
>>> Armin Wolf (8):
>>> thermal: core: Allow setting the parent device of cooling devices
>>> thermal: core: Set parent device in thermal_of_cooling_device_register()
>>> ACPI: processor: Stop creating "device" sysfs link
>>> ACPI: fan: Stop creating "device" sysfs link
>>> ACPI: video: Stop creating "device" sysfs link
>>> thermal: core: Set parent device in thermal_cooling_device_register()
>>> ACPI: thermal: Stop creating "device" sysfs link
> This will kind of break things because user space may rely on those, may it not?
The driver core will create the "device" sysfs link for us as soon as we populate the
parent device pointer of the thermal zone/cooling device. So user space applications
relying on those links should continue to work.
I even tested this on my devices.
>>> thermal: core: Allow setting the parent device of thermal zone devices
> For this last change, you need to define what it means for a thermal
> zone to have a parent device. In particular, in what way would a
> thermal zone depend on its parent?
1. A thermal zone should be suspended before the parent device is suspended. For this the
device core needs to know the parent device of a given thermal zone device.
2. User space applications can determine the physical device behind a given thermal zone device
if we enable the device core to automatically create a "device" sysfs link.
Other than that all current thermal zone devices do not depend on the parent device pointer
(because currently it is always NULL).
>> I can only see the first three patches in the series ATM as per
>>
>> https://lore.kernel.org/linux-pm/20251114-thermal-device-v1-0-d8b442aae38b@gmx.de/T/#r605b23f2e27e751d8406e7949dad6f5b5b112067
> That's probably because of the excessive CC list.
Yes, that was my mistake. I will prune the CC list and resend the series. Do you think that I should include
all maintainers of the affected drivers or only the subsystem maintainers?
Thanks,
Armin Wolf