Message-ID: <20250309121324.29633-1-john.madieu.xa@bp.renesas.com>
Date: Sun,  9 Mar 2025 13:13:20 +0100
From: John Madieu <john.madieu.xa@...renesas.com>
To: geert+renesas@...der.be,
	niklas.soderlund+renesas@...natech.se,
	conor+dt@...nel.org,
	krzk+dt@...nel.org,
	robh@...nel.org,
	rafael@...nel.org,
	daniel.lezcano@...aro.org
Cc: magnus.damm@...il.com,
	claudiu.beznea.uj@...renesas.com,
	devicetree@...r.kernel.org,
	john.madieu@...il.com,
	rui.zhang@...el.com,
	linux-kernel@...r.kernel.org,
	linux-renesas-soc@...r.kernel.org,
	biju.das.jz@...renesas.com,
	linux-pm@...r.kernel.org,
	John Madieu <john.madieu.xa@...renesas.com>
Subject: [RFC PATCH 0/3] thermal: Add CPU hotplug cooling driver

MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This patch series introduces a new thermal cooling driver that implements CPU
hotplug-based thermal management. The driver dynamically takes CPUs offline
during thermal excursions to reduce power consumption and prevent overheating,
while maintaining system stability by keeping at least one CPU online. 

1- Problem Statement

Modern SoCs require robust thermal management to prevent overheating under heavy
workloads. Existing cooling mechanisms like frequency scaling may not always
provide sufficient thermal relief, especially in multi-core systems where
per-core thermal contributions can be significant. 

2- Solution Overview 

The driver:

 - Integrates with the Linux thermal framework as a cooling device  
 - Registers per-CPU cooling devices that respond to thermal trip points  
 - Uses CPU hotplug operations to reduce thermal load  
 - Maintains system stability by never taking the boot CPU offline,
   regardless of which CPUs are listed as cooling devices
 - Implements proper state tracking and cleanup

Key Features:   

 - Dynamic CPU online/offline management based on thermal thresholds  
 - Device tree-based configuration via thermal zones and trip points  
 - Hysteresis support through thermal governor interactions  
 - Safe handling of CPU state transitions during module load/unload  
 - Compatibility with existing thermal management frameworks
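
The boot-CPU rule described above can be illustrated with a small user-space
sketch. This is not code from the series: `struct plug_cdev`,
`plug_set_state()`, `plug_count_online()` and the `BOOT_CPU` value are
hypothetical names chosen for illustration, and the sketch assumes a two-state
cooling device (0 = online, 1 = offline) that simply rejects a request to
offline the boot CPU. The real driver may handle that case differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BOOT_CPU 0 /* assumption: CPU0 is the boot CPU */

/* Hypothetical per-CPU cooling device: state 0 = online, 1 = offline. */
struct plug_cdev {
	unsigned int cpu;
	bool online;
};

/*
 * Apply a requested cooling state to one CPU.
 * Returns 0 on success, -1 if the request is rejected.
 */
static int plug_set_state(struct plug_cdev *cdev, unsigned long state)
{
	assert(cdev != NULL);

	if (state > 1)
		return -1;	/* only two states exist in this sketch */
	if (state == 1 && cdev->cpu == BOOT_CPU)
		return -1;	/* never take the boot CPU offline */

	cdev->online = (state == 0);
	return 0;
}

/* Count the CPUs that are still online. */
static size_t plug_count_online(const struct plug_cdev *cdevs, size_t n)
{
	size_t i, online = 0;

	for (i = 0; i < n; i++)
		if (cdevs[i].online)
			online++;
	return online;
}
```

With four CPUs, requesting state 1 on every device in this model leaves the
boot CPU as the only one online, which is the invariant the series describes.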

Testing    

 - Verified on Renesas RZ/G3E platforms with multi-core CPU configurations  
 - Validated thermal response using emulated temperatures (emul_temp)
 - Confirmed proper interaction with other cooling devices
 - Verified support for 'plug' type trace events
 - Tested with step_wise governor
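
For readers who want to reproduce the emul_temp part of the testing, here is a
minimal sketch of a helper that writes an emulated temperature to a thermal
zone. The helper name `write_emul_temp()` is hypothetical, and the path is
passed in by the caller because the zone index depends on the platform. It
assumes the kernel was built with CONFIG_THERMAL_EMULATION and that the caller
has permission to write the attribute.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write an emulated temperature, in millidegrees Celsius, to an
 * emul_temp attribute. Returns 0 on success, -1 on failure.
 */
static int write_emul_temp(const char *path, int millicelsius)
{
	FILE *f;
	int ok;

	assert(path != NULL);

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", millicelsius) > 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

A typical call would look like
`write_emul_temp("/sys/class/thermal/thermal_zone0/emul_temp", 111000)`, where
`thermal_zone0` is only an assumed zone index. Writing 0 normally returns the
zone to real sensor readings.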

As the 'hot' type is already used for user space notification, I've chosen
'plug' for this new type. Suggestions on this are welcome. Here is an example
of a 'thermal-zones' node that integrates the 'plug' type:

```
thermal-zones {
	cpu-thermal {
		polling-delay = <1000>;
		polling-delay-passive = <250>;
		thermal-sensors = <&tsu>;

		cooling-maps {
			map0 {
				trip = <&target>;
				cooling-device = <&cpu0 0 3>, <&cpu3 0 3>;
				contribution = <1024>;
			};

			map1 {
				trip = <&trip_emergency>;
				cooling-device = <&cpu1 0 1>, <&cpu2 0 1>;
				contribution = <1024>;
			};

		};

		trips {
			target: trip-point {
				temperature = <95000>;
				hysteresis = <1000>;
				type = "passive";
			};

			trip_emergency: emergency {
				temperature = <110000>;
				hysteresis = <1000>;
				type = "plug";
			};

			sensor_crit: sensor-crit {
				temperature = <120000>;
				hysteresis = <1000>;
				type = "critical";
			};
		};
	};
};
```

Dependencies    

 - Requires standard thermal framework components (CONFIG_THERMAL)  
 - Depends on CPU hotplug support (CONFIG_HOTPLUG_CPU)  
 - Assumes device tree contains appropriate thermal zone definitions

This series also depends on [1], specifically on patch 6/7,
"arm64: dts: renesas: r9a09g047: Add TSU node".


3- Notes for Reviewers

 - Focus areas: Thermal framework integration, CPU state management, and error handling  
 - Feedback on device tree binding requirements is particularly welcome  
 - Suggestions for interaction improvements with other governors are appreciated

I look forward to your feedback and guidance on this contribution.

[1] https://patchwork.kernel.org/project/linux-clk/cover/20250227122453.30480-1-john.madieu.xa@bp.renesas.com/

Regards,
John


John Madieu (3):
  thermal/cpuplug_cooling: Add CPU hotplug cooling driver
  tmon: Add support for THERMAL_TRIP_PLUG type
  arm64: dts: renesas: r9a09g047: Add thermal hotplug trip point

 arch/arm64/boot/dts/renesas/r9a09g047.dtsi |  13 +
 drivers/thermal/Kconfig                    |  12 +
 drivers/thermal/Makefile                   |   1 +
 drivers/thermal/cpuplug_cooling.c          | 363 +++++++++++++++++++++
 drivers/thermal/thermal_of.c               |   1 +
 drivers/thermal/thermal_trace.h            |   2 +
 drivers/thermal/thermal_trip.c             |   1 +
 include/uapi/linux/thermal.h               |   1 +
 tools/thermal/tmon/tmon.h                  |   1 +
 tools/thermal/tmon/tui.c                   |   3 +-
 10 files changed, 397 insertions(+), 1 deletion(-)
 create mode 100644 drivers/thermal/cpuplug_cooling.c

-- 
2.25.1

