Message-ID: <1903691.tdWV9SEqCh@rjwysocki.net>
Date: Tue, 13 Aug 2024 16:23:35 +0200
From: "Rafael J. Wysocki" <rjw@...ysocki.net>
To: Linux PM <linux-pm@...r.kernel.org>
Cc: LKML <linux-kernel@...r.kernel.org>, Lukasz Luba <lukasz.luba@....com>,
 Daniel Lezcano <daniel.lezcano@...aro.org>, Zhang Rui <rui.zhang@...el.com>,
 Peter Kästle <peter@...e.net>
Subject: [PATCH v1 0/4] thermal: gov_bang_bang: Prevent cooling devices from getting
 stuck in the "on" state

Hi Everyone,

After changes made in 6.10, the Bang-bang governor has an initialization problem
on systems where cooling devices start in the "on" state, but the thermal zone
temperature stays below the corresponding trip points.

Namely, the Bang-bang governor implements only a .trip_crossed() callback, which
runs only when a trip point is crossed.  If the zone temperature is always below
the trip point, that callback will never be invoked.  Consequently, if a cooling
device bound to that trip point starts in the "on" state, the governor has no
chance to change its state to "off".
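
To illustrate the problem, here is a minimal user-space sketch.  It is not the
kernel code: the toy_zone structure and the toy_*() functions are hypothetical
stand-ins invented for this example, and hysteresis is left out for brevity.
It only models a zone whose sole governor hook runs when the temperature moves
across the trip point.

```c
#include <stdbool.h>

/* Hypothetical, simplified model; not the kernel's data structures. */
struct toy_zone {
	int temp;	/* current zone temperature, millidegrees C */
	int last_temp;	/* temperature seen at the previous update */
	int trip_temp;	/* trip point threshold */
	bool fan_on;	/* state of the cooling device bound to the trip */
	int crossings;	/* number of times the crossing callback ran */
};

/* Stand-in for a .trip_crossed()-style callback. */
static void toy_trip_crossed(struct toy_zone *z, bool crossed_up)
{
	z->fan_on = crossed_up;
	z->crossings++;
}

/* Stand-in for a zone update: the callback runs only on an actual crossing. */
static void toy_zone_update(struct toy_zone *z, int new_temp)
{
	bool was_above = z->last_temp >= z->trip_temp;
	bool is_above = new_temp >= z->trip_temp;

	z->temp = new_temp;
	if (was_above != is_above)
		toy_trip_crossed(z, is_above);
	z->last_temp = new_temp;
}
```

If such a zone starts with fan_on set and every temperature passed to
toy_zone_update() is below trip_temp, the callback never runs and fan_on stays
set, which is the situation described above.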

This currently happens in the acerhdf driver, but it may well happen elsewhere,
so I think that it needs to be addressed in the thermal subsystem.

It can be addressed by adding a .manage() callback to the Bang-bang governor,
which is done in patch [3/4].  That callback will be invoked every time
__thermal_zone_device_update() runs, not just when a trip is crossed, so it
can adjust the states of the cooling devices to the thermal zone temperature.
However, after it has run once, it is pure overhead because the states of
cooling devices only need to be fixed up once (modulo some special situations
like system resume).

That's addressed in patch [4/4], which uses governor_data to record whether or
not the states of the cooling devices need to be adjusted.
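
The idea can be sketched in user space as follows.  This is an illustration
under stated assumptions, not the code from the patches: sketch_zone,
sketch_gov_data and the sketch_*() functions are hypothetical names made up
for this example, and the real governor differs in detail (it handles
hysteresis and multiple cooling devices, for instance).

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-zone governor state (cf. governor_data). */
struct sketch_gov_data {
	bool needs_fixup;	/* true until cooling states have been synced */
	int fixups_done;	/* number of times the fix-up actually ran */
};

struct sketch_zone {
	int temp;		/* current zone temperature */
	int trip_temp;		/* trip point threshold */
	bool fan_on;		/* state of the bound cooling device */
	struct sketch_gov_data gov;
};

/* Stand-in for a .manage()-style callback, invoked on every zone update. */
static void sketch_manage(struct sketch_zone *z)
{
	if (!z->gov.needs_fixup)
		return;		/* cheap early exit once the states are in sync */

	/* Make the cooling device state match the current temperature. */
	z->fan_on = z->temp >= z->trip_temp;
	z->gov.needs_fixup = false;
	z->gov.fixups_done++;
}

/* Stand-in for events (such as system resume) that invalidate the states. */
static void sketch_request_fixup(struct sketch_zone *z)
{
	z->gov.needs_fixup = true;
}

/* Stand-in for a zone update that always calls the manage hook. */
static void sketch_zone_update(struct sketch_zone *z, int new_temp)
{
	z->temp = new_temp;
	sketch_manage(z);
}
```

With needs_fixup initially set, the first sketch_zone_update() call turns off a
fan that started in the "on" state below the trip point, and later calls return
early until sketch_request_fixup() sets the flag again.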

Patches [1-2/4] are preliminary.  IMV it is better to make these changes
separately, both for clarity and in case they turn out to have an unexpected
functional effect.

Overall, this series is a fix candidate for 6.11-rc because the change in
behavior addressed by it can be regarded as a regression with respect to 6.9.

Unfortunately, it affects this series:

https://lore.kernel.org/linux-pm/114901234.nniJfEyVGO@rjwysocki.net/

which will need to be reordered and rebased (slightly), but because I've dropped
one broken patch from it already, it will need to be changed anyway.

Thanks!
