Message-ID: <93448681-8a0b-f565-1a98-a8607ff37488@linaro.org>
Date:   Tue, 20 Jun 2023 20:35:00 +0200
From:   Daniel Lezcano <daniel.lezcano@...aro.org>
To:     "Rafael J. Wysocki" <rafael@...nel.org>
Cc:     linux-pm@...r.kernel.org, thierry.reding@...il.com,
        Amit Kucheria <amitk@...nel.org>,
        Zhang Rui <rui.zhang@...el.com>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 4/8] thermal/core: Update the generic trip points


Hi Rafael,

Thanks for the comments.


On 20/06/2023 13:28, Rafael J. Wysocki wrote:
> On Thu, May 25, 2023 at 4:02 PM Daniel Lezcano
> <daniel.lezcano@...aro.org> wrote:
>>
>> At this point, the generic trip points rework allows creating a
>> thermal zone with a fixed number of trip points. This usage satisfies
>> almost all of the existing drivers.
>>
>> A few remaining drivers have a mechanism where the firmware updates
>> the trip points. But there is no such update mechanism for the generic
>> trip points, thus those drivers can not be converted to the generic
>> approach.
>>
>> This patch provides a function 'thermal_zone_trips_update()' allowing
>> to change the trip points of a thermal zone.
>>
>> At the same time, with the logic the trip points array is passed as a
>> parameter to the thermal zone at creation time, we make our own
>> private trip points array by copying the one passed as parameter.
> 
> So the design seems to require the caller to create a new array of
> trip points and pass it to thermal_zone_trips_update(), so it can
> replace the zone's trips array with it.
> 
> If only one trip point changes and there are multiple defined, this is
> rather not efficient.

This update is only for replacing the current trip array when one or 
several trip points are added or removed. We can see that in one or two 
drivers only.

This function is supposed to be called rarely (and I doubt there is 
really a lot of hardware sending notifications to add/remove trip points).

For changing a trip point property like its temperature or its 
hysteresis, we keep the usual set_trip_point() function.
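
As a rough illustration of that split, here is a minimal user-space
sketch of the in-place path, where one property of one trip point
changes and the array itself is left alone. The names trip_model,
zone_model and zone_model_set_trip_temp() are hypothetical stand-ins
for this example only, not the kernel's structures or API, and the
locking the core would do is left out.

```c
/* Hypothetical user-space stand-ins, not the kernel's structures. */
struct trip_model {
	int temperature;	/* millidegrees Celsius */
	int hysteresis;		/* millidegrees Celsius */
};

struct zone_model {
	struct trip_model *trips;
	int num_trips;
};

/*
 * Change the temperature of a single trip point in place.
 * The array and the number of trip points are left untouched.
 * Returns 0 on success, -1 on an invalid zone or index.
 */
static int zone_model_set_trip_temp(struct zone_model *tz, int index,
				    int temperature)
{
	if (!tz || !tz->trips || index < 0 || index >= tz->num_trips)
		return -1;

	tz->trips[index].temperature = temperature;
	return 0;
}
```

Only the addressed entry is written, so nothing has to be rebuilt when
a single temperature or hysteresis value changes.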

> Do you want to prevent the core from using stale trip points this way?
>   If so, it should be stated here.

No, that will be a side effect. We can set this point aside; it will be 
addressed in a cleanup series after everything is converted to the 
generic trip points.


> Moreover, at least in the cases when num_trips doesn't change, it
> might be more efficient to walk the new trips[] array and only copy
> the ones that have changed over their old versions.

IMO, that is over-engineered, especially since the optimization would 
only benefit a very few drivers and extremely rare use cases.


> I am also not sure if this interface is going to be convenient from
> the user's perspective, especially if the trips get sorted by the core
> (in the future).  They would need to recreate the entire trips array
> every time from scratch, even if only one trip point changes, which
> means quite a bit of churn for thermal zones with many trip points.

Actually, the driver is not supposed to deal with the array. It can 
create the array on the stack, pass it to the 
thermal_zone_device_register_with_trips() function and forget about it.

The trip points array should not be used by a driver anymore.
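
To make the ownership model concrete, here is a minimal user-space
sketch of a registration function that makes its own private copy of
the array it is given. Everything in it (trip_model, zone_model,
zone_model_register(), zone_model_unregister()) is a hypothetical
stand-in for illustration; it is not the code of
thermal_zone_device_register_with_trips().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space stand-ins, not the kernel's structures. */
struct trip_model {
	int temperature;	/* millidegrees Celsius */
	int hysteresis;		/* millidegrees Celsius */
};

struct zone_model {
	struct trip_model *trips;	/* private copy owned by the zone */
	int num_trips;
};

/*
 * Copy the caller's array into storage owned by the zone, so the
 * caller may build the array on its stack and drop it afterwards.
 * Returns 0 on success, -1 on bad arguments or allocation failure.
 */
static int zone_model_register(struct zone_model *tz,
			       const struct trip_model *trips, int num_trips)
{
	if (!tz || !trips || num_trips <= 0)
		return -1;

	tz->trips = malloc((size_t)num_trips * sizeof(*trips));
	if (!tz->trips)
		return -1;

	memcpy(tz->trips, trips, (size_t)num_trips * sizeof(*trips));
	tz->num_trips = num_trips;
	return 0;
}

/* Release the private copy. */
static void zone_model_unregister(struct zone_model *tz)
{
	free(tz->trips);
	tz->trips = NULL;
	tz->num_trips = 0;
}
```

After registration, later changes to the caller's array have no effect
on the zone, which is why the driver has no reason to keep it around.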


> It might be better to allow them to update trips in place and notify
> the core about the change, all under the zone lock to prevent the core
> from using trips simultaneously.

I'm not sure I understand. The core code is called through this 
function and takes the lock.

> And arguably, changing num_trips would be questionable due to the
> sysfs consistency reasons mentioned below.

[ ... ]

>> Note, no code has been found where the trip points update leads to a
>> refresh of the trip points in sysfs, so it is very unlikely the number
>> of trip points changes. However, for the sake of consistency it would
>> be nicer to have the trip points being refreshed in sysfs also, but
>> that could be done in a separate set of changes.
> 
> So at this point user space has already enumerated the trip points, so
> it may fail if some of them go away or it may not be able to use any
> new trip points appearing in sysfs.

Yes, that is why I think adding or removing trip points was never 
really supported. I would be very curious to see a platform with such a 
feature.

> For this reason, until there is a way to notify user space about the
> need to re-enumerate trip points (and user space indicates support for
> it), the only trip point property that may change in sysfs is the
> temperature.

User space can be notified when there is a change with:

THERMAL_GENL_EVENT_TZ_TRIP_CHANGE
THERMAL_GENL_EVENT_TZ_TRIP_ADD
THERMAL_GENL_EVENT_TZ_TRIP_DELETE

The last two are not implemented today, but that could be done later, 
as it would be a new feature.

Let me summarize the situation:

  - The trip point crossing events are not correctly detected because of 
how they are handled and we need to sort them out.

  - In order to sort them out, we need to convert the drivers to the 
generic trip points and remove all those get_trip_* | set_trip_* ops

  - Almost all the drivers are converted except the ACPI thermal and the 
intel_soc_dts_iosf drivers which are blocking the feature

  - The ACPI thermal driver can potentially add or remove a trip point 
but we are not sure that can happen

  - We need to decouple the trip id from the array index for the ACPI 
thermal driver

The generic trip points change is a big chunk of work and I would like 
to have some progress to fix the trip crossing detection along with the 
removal of the resulting dead code.

Given there may be no real use for changing the number of trip points, 
can we keep it simple and keep the proposed change, but force the 
number of trip points to stay the same?

We can then improve the existing code if it is really needed.
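
A minimal user-space sketch of what "same number of trip points" could
look like is below. The names (trip_model, zone_model,
zone_model_trips_update(), MODEL_MAX_TRIPS) and the -1 error value are
assumptions made for this example only; this is not the code from the
patch, and the zone lock is only indicated by a comment.

```c
#include <string.h>

#define MODEL_MAX_TRIPS 8	/* arbitrary bound for this sketch */

/* Hypothetical user-space stand-ins, not the kernel's structures. */
struct trip_model {
	int temperature;	/* millidegrees Celsius */
	int hysteresis;		/* millidegrees Celsius */
};

struct zone_model {
	struct trip_model trips[MODEL_MAX_TRIPS];
	int num_trips;
};

/*
 * Replace the zone's trip points with the caller's array, but only
 * when the number of trip points is unchanged. Returns 0 on success,
 * -1 if the arguments are invalid or the count differs.
 */
static int zone_model_trips_update(struct zone_model *tz,
				   const struct trip_model *trips,
				   int num_trips)
{
	if (!tz || !trips)
		return -1;

	/* Refuse to add or remove trip points. */
	if (num_trips != tz->num_trips)
		return -1;

	/* In the kernel, the zone lock would be held around this copy. */
	memcpy(tz->trips, trips, (size_t)num_trips * sizeof(*trips));
	return 0;
}
```

With the count fixed, what user space has already enumerated in sysfs
stays valid, and only the values behind each trip point can change.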


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
