lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2bc5a902-acde-526a-11a5-2357d899916c@linaro.org>
Date:   Thu, 9 Apr 2020 12:29:35 +0200
From:   Daniel Lezcano <daniel.lezcano@...aro.org>
To:     Andrzej Pietrasiewicz <andrzej.p@...labora.com>,
        linux-pm@...r.kernel.org
Cc:     Zhang Rui <rui.zhang@...el.com>,
        "Rafael J . Wysocki" <rjw@...ysocki.net>,
        Len Brown <lenb@...nel.org>, Jiri Pirko <jiri@...lanox.com>,
        Ido Schimmel <idosch@...lanox.com>,
        "David S . Miller" <davem@...emloft.net>,
        Peter Kaestle <peter@...e.net>,
        Darren Hart <dvhart@...radead.org>,
        Andy Shevchenko <andy@...radead.org>,
        Support Opensource <support.opensource@...semi.com>,
        Amit Kucheria <amit.kucheria@...durent.com>,
        Shawn Guo <shawnguo@...nel.org>,
        Sascha Hauer <s.hauer@...gutronix.de>,
        Pengutronix Kernel Team <kernel@...gutronix.de>,
        Fabio Estevam <festevam@...il.com>,
        NXP Linux Team <linux-imx@....com>,
        Allison Randal <allison@...utok.net>,
        Enrico Weigelt <info@...ux.net>,
        Gayatri Kammela <gayatri.kammela@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-acpi@...r.kernel.org, netdev@...r.kernel.org,
        platform-driver-x86@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org, kernel@...labora.com
Subject: Re: [RFC 0/8] Stop monitoring disabled devices

On 07/04/2020 19:49, Andrzej Pietrasiewicz wrote:
> The current kernel behavior is to keep polling the thermal zone devices
> regardless of their current mode. This is not desired, as all such "disabled"
> devices are meant to be handled by userspace,> so polling them makes no sense.

Thanks for proposing these changes.

I've been (quickly) through the series and the description below. I have
the feeling the series makes more complex while the current code which
would deserve a cleanup.

Why not first:

 - Add a 'mode' field in the thermal zone device
 - Kill all set/get_mode callbacks in the drivers which are duplicated code.
 - Add a function:

 enum thermal_device_mode thermal_zone_get_mode( *tz)
 {
	...
	if (tz->ops->get_mode)
		return tz->ops->get_mode();

	return tz->mode;
 }


 int thermal_zone_set_mode(..*tz, enum thermal_device_mode mode)
 {
	...
	if (tz->ops->set_mode)
		return tz->ops->set_mode(tz, mode);

	tz->mode = mode;

	return 0;
 }

 static inline thermal_zone_enable(... *tz)
 {
	thermal_zone_set_mode(tz, THERMAL_DEVICE_ENABLED);
 }

 static inline thermal_zone_disable(... *tz) {
	thermal_zone_set_mode(tz, THERMAL_DEVICE_DISABLED);
 }

And then when the code is consolidated, use the mode to enable/disable
the polling and continue killing the duplicated code in of-thermal.c and
anywhere else.


> There was an attempt to solve this issue:
> 
> https://lkml.org/lkml/2018/2/26/498
> 
> and it ultimately has not succeeded:
> 
> https://lkml.org/lkml/2018/2/27/910
> 
> This is a new attempt addressing all the relevant drivers, and I have
> identified them with:
> 
> $ git grep "thermal_zone_device_ops" | grep "= {" | cut -f1 -d: | sort | uniq
> 
> The idea is to modify thermal_zone_device_update() and monitor_thermal_zone()
> in such a way that they stop polling a disabled device. To do decide what to
> do they should call ->get_mode() operation of the specialized thermal zone
> device in question (e.g. drivers/acpi/thermal.c's). But here comes problem:
> sometimes a thermal zone device must be initially disabled and becomes enabled
> only after its sensors appear on the system. If such thermal zone's
> ->get_mode() /* in the context of thermal_zone_device_update() or
> monitor_thermal_zone() */ is called _before_ the sensors are available, it will
> be reported as "disabled" and consequently polling it will be ceased. This is
> a change in behavior from userspace's perspective.
> 
> To solve the above described problem I want to introduce the third mode of a
> thermal_zone_device: initial. The idea is that when the device is in its
> initial mode, then its polling will be handled as it is now. This is a good
> thing: should the temperature be just about hitting the critical treshnold
> early during the boot process, it might be too late if we wait for the
> userspace to run to save the system from overheating. The initial mode should
> be reported in sysfs as "enabled" to keep the userspace interface intact.
> From the initial mode there will be two possible transitions: to enabled or
> disabled mode, but there will be no transition back to initial. If the
> transition is from initial to enabled, then keep polling. If the transition is
> from initial to disabled, then stop polling. If the transition is from enabled
> to disabled, then stop polling. The transition from disabled to enabled must
> be handled in a special way: there must be a mandatory call to
> monitor_thermal_zone(), otherwise the polling will not start. If this
> transition is triggeted from sysfs, then it can be easily handled at the
> thermal framework level. However, if drivers call their own ->set_mode()
> operation then they must also call "monitor_thermal_zone()" afterwards.
> The latter being a sensible thing anyway, so perhaps all/most of the drivers
> in question do. The plan for implementation is this:
> 
> - ensure ALL users use symbolic enum names (THERMAL_DEVICE_DISABLED,
> THERMAL_DEVICE_ENABLED) for thermal device mode rather than the numeric
> values of enum thermal_device_mode elements
> - add THERMAL_DEVICE_INITIAL to the said enum making its value 0 (so that
> kzalloc() results in the initial state)
> - modify thermal zone device's mode_show() (thermal framework level) so that
> it reports "enabled" for THERMAL_DEVICE_INITIAL
> - modify thermal zone device's mode_store() (thermal framework level) so that
> it calls monitor_thermal_zone() upon mode change
> - modify ALL thermal drivers so that their code is prepared to return
> THERMAL_DEVICE_INITIAL before they call thermal_zone_device_register(); when
> the invocation of the latter completes then polling is expected to be started
> - verify ALL drivers which call their own ->set_mode() to ensure they do call
> monitor_thermal_zone() afterwards
> - modify thermal_zone_device_update() and monitor_thermal_zone() so that they
> cancel polling for disabled thermal zone devices (but not for those in
> THERMAL_DEVICE_INITIAL mode)
> 
> This RFC series does all the above steps in more or less that order.
> 
> I kindly ask for comments/suggestions/improvements.
> 
> Rebased onto v5.6.
> 
> Andrzej Pietrasiewicz (8):
>   thermal: int3400_thermal: Statically initialize
>     .get_mode()/.set_mode() ops
>   thermal: Properly handle mode values in .set_mode()
>   thermal: Store thermal mode in a dedicated enum
>   thermal: core: Introduce THERMAL_DEVICE_INITIAL
>   thermal: core: Monitor thermal zone after mode change
>   thermal: Set initial state to THERMAL_DEVICE_INITIAL
>   thermal: of: Monitor thermal zone after enabling it
>   thermal: Stop polling DISABLED thermal devices
> 
>  drivers/acpi/thermal.c                        | 28 +++++-----
>  .../ethernet/mellanox/mlxsw/core_thermal.c    | 11 +++-
>  drivers/platform/x86/acerhdf.c                | 17 ++++--
>  drivers/thermal/da9062-thermal.c              |  2 +-
>  drivers/thermal/imx_thermal.c                 |  5 +-
>  .../intel/int340x_thermal/int3400_thermal.c   | 24 ++++-----
>  .../thermal/intel/intel_quark_dts_thermal.c   |  6 ++-
>  drivers/thermal/of-thermal.c                  |  9 +++-
>  drivers/thermal/thermal_core.c                | 52 ++++++++++++++++++-
>  drivers/thermal/thermal_core.h                |  2 +
>  drivers/thermal/thermal_sysfs.c               | 12 +++--
>  include/linux/thermal.h                       |  3 +-
>  12 files changed, 123 insertions(+), 48 deletions(-)
> 


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ