Date:   Mon, 30 Nov 2020 08:57:55 +0100
From:   Daniel Lezcano <daniel.lezcano@...aro.org>
To:     Kai-Heng Feng <kai.heng.feng@...onical.com>, rui.zhang@...el.com,
        amitk@...nel.org
Cc:     "open list:THERMAL" <linux-pm@...r.kernel.org>,
        open list <linux-kernel@...r.kernel.org>,
        Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
Subject: Re: [PATCH 1/3] thermal: core: Add indication for userspace usage


[Added Srinivas]

On 28/11/2020 18:54, Kai-Heng Feng wrote:
> We are seeing thermal shutdown on Intel based mobile workstations, the
> shutdown happens during the first trip handle in
> thermal_zone_device_register():
> kernel: thermal thermal_zone15: critical temperature reached (101 C), shutting down
> 
> However, we shouldn't do a thermal shutdown here, since
> 1) We may want to use a dedicated daemon, Intel's thermald in this case,
> to handle thermal shutdown.
> 
> 2) For ACPI based system, _CRT doesn't mean shutdown unless it's inside
> ThermalZone. ACPI Spec, 11.4.4 _CRT (Critical Temperature):
> "... If this object it present under a device, the device’s driver
> evaluates this object to determine the device’s critical cooling
> temperature trip point. This value may then be used by the device’s
> driver to program an internal device temperature sensor trip point."
> 
> So a "critical trip" here merely means we should take a more aggressive
> cooling method.

Well, actually, the spec states just before that:

"This object, when defined under a thermal zone, returns the critical
temperature at which OSPM must shutdown the system".

That is what the thermal subsystem does, no?

> So add an indication to let thermal core know it should leave thermal
> device to userspace to handle.

You may want to check the 'HOT' trip point and then use the notification
mechanism to get notified in userspace and take action from there (e.g.
offline some CPUs).
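
A minimal userspace sketch of the first half of that suggestion, i.e.
locating the 'hot' trip point of a zone through sysfs. It assumes the usual
trip_point_N_type / trip_point_N_temp attributes in the zone directory
(e.g. /sys/class/thermal/thermal_zone15); the helper name find_trip_temp()
is hypothetical, and the notification handling itself is left out:

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return the temperature (in millidegrees Celsius)
 * of the first trip point of type 'want_type' found in 'zone_dir', or -1
 * if the zone has no such trip point.
 */
static long find_trip_temp(const char *zone_dir, const char *want_type)
{
	char path[512], type[32];
	long temp;
	int i;

	for (i = 0; ; i++) {
		FILE *f;

		snprintf(path, sizeof(path), "%s/trip_point_%d_type",
			 zone_dir, i);
		f = fopen(path, "r");
		if (!f)
			return -1;	/* no more trip points */
		if (fscanf(f, "%31s", type) != 1)
			type[0] = '\0';
		fclose(f);

		if (strcmp(type, want_type))
			continue;

		snprintf(path, sizeof(path), "%s/trip_point_%d_temp",
			 zone_dir, i);
		f = fopen(path, "r");
		if (!f)
			return -1;
		if (fscanf(f, "%ld", &temp) != 1)
			temp = -1;
		fclose(f);
		return temp;
	}
}
```

A daemon could call find_trip_temp(zone, "hot") once at startup and then
compare the temperature it gets notified about against that value before
deciding what to do.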

> Signed-off-by: Kai-Heng Feng <kai.heng.feng@...onical.com>
> ---
>  drivers/thermal/thermal_core.c | 3 +++
>  include/linux/thermal.h        | 2 ++
>  2 files changed, 5 insertions(+)
> 
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index c6d74bc1c90b..6561e3767529 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -1477,6 +1477,9 @@ thermal_zone_device_register(const char *type, int trips, int mask,
>  			goto unregister;
>  	}
>  
> +	if (tz->tzp && tz->tzp->userspace)
> +		thermal_zone_device_disable(tz);
> +
>  	mutex_lock(&thermal_list_lock);
>  	list_add_tail(&tz->node, &thermal_tz_list);
>  	mutex_unlock(&thermal_list_lock);
> diff --git a/include/linux/thermal.h b/include/linux/thermal.h
> index d07ea27e72a9..e8e8fac78fc8 100644
> --- a/include/linux/thermal.h
> +++ b/include/linux/thermal.h
> @@ -247,6 +247,8 @@ struct thermal_zone_params {
>  	 */
>  	bool no_hwmon;
>  
> +	bool userspace;
> +
>  	int num_tbps;	/* Number of tbp entries */
>  	struct thermal_bind_params *tbp;
>  
> 


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
