[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <de3aae46-58bd-7041-df49-10cfb6be1603@redhat.com>
Date: Thu, 17 Nov 2022 15:10:25 +0100
From: Hans de Goede <hdegoede@...hat.com>
To: Mario Limonciello <mario.limonciello@....com>,
"Rafael J . Wysocki" <rafael@...nel.org>,
Stephen Boyd <sboyd@...nel.org>,
platform-driver-x86@...r.kernel.org, linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org
Cc: Sven van Ashbrook <svenva@...omium.org>,
Raul Rangel <rrangel@...omium.org>,
Pavel Machek <pavel@....cz>, Len Brown <len.brown@...el.com>,
John Stultz <jstultz@...gle.com>,
Thomas Gleixner <tglx@...utronix.de>,
Rajneesh Bhardwaj <irenic.rajneesh@...il.com>,
S-k Shyam-sundar <Shyam-sundar.S-k@....com>,
Rajat Jain <rajatja@...gle.com>,
David E Box <david.e.box@...el.com>
Subject: Re: [RFC v3 0/4] Make it easier to measure % in HW sleep state
Hi,
On 11/15/22 21:01, Mario Limonciello wrote:
> Sven van Ashbrook brought a patch to the kernel mailing list that
> attempted to change the reporting level of a s0ix entry issue to a
> different debugging level so that infastructure used by Google could
> better scan logs to catch problems.
>
> This approach was rejected, but during the conversation another
> suggestion was made by David E. Box to introduce some infrastructure
> into the kernel to report this information.
>
> As it's information that is reported by both AMD and Intel platforms
> over s2idle, this seems to make sense.
>
> RFC v1 and v2 introduced two new sysfs files to report the information, but
> Rafael pointed out that there was already a file that could be used on
> Intel platforms: `low_power_idle_system_residency_us`.
>
> RFC v3 creates this file for AMD platforms and also introduces another file
> that can be used to determine total sleep time:
> `/sys/power/suspend_stats/last_total`.
>
> With these two files a simple shell script can be run after suspend to
> calculate the percentage.
>
> ```
> #!/bin/sh
> total=$(cat /sys/power/suspend_stats/last_total)
> hw=$(cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us)
> percent=$(awk -v hw=$hw -v total=$total 'BEGIN { printf "%.2f%%", (hw/total*100) }')
> echo "Last ${total}us suspend cycle spent $percent of the time in a hardware sleep state."
> ```
>
> A sample run on an AMD platform that was just sleeping with this series on
> top of 6.1-rc5 shows the following:
> # ./compare.sh
> Last 15699838us suspend cycle spent 98.63% of the time in a hardware sleep state.
>
> Further discussion to be expected on this series:
>
> * What last_total will represent from the suspend cycle
>
> * Whether the semantics of all platforms will be the same for
> `low_power_idle_system_residency_us`
> - AMD platforms reset this counter before s2idle entry. Do Intel? Others?
>
> * Maybe the *kernel* should be responsible to do the calculation and export
> a `last_hw_sleep_percent` file instead. Platform differences can be
> abstracted then within individual drivers.
That (`last_hw_sleep_percent` file) is an interesting proposal,
I can see that being a better interface because as you say this allows
the kernel / platform-drivers to take care of any platform quirks /
weirdness, avoiding any userspace monitoring of this to possibly
give false positive warnings.
Regards,
Hans
>
> Mario Limonciello (4):
> PM: Add a sysfs file to represent the total sleep duration
> platform/x86/intel/pmc: core: Drop check_counters
> platform/x86/amd: pmc: Report duration of time in deepest hw state
> platform/x86/amd: pmc: Populate cpuidle sysfs file with hw sleep data
>
> Documentation/ABI/testing/sysfs-amd-pmc | 6 ++++++
> Documentation/ABI/testing/sysfs-power | 8 ++++++++
> drivers/platform/x86/amd/pmc.c | 27 ++++++++++++++++++++++---
> drivers/platform/x86/intel/pmc/core.c | 7 ++-----
> drivers/platform/x86/intel/pmc/core.h | 1 -
> include/linux/suspend.h | 2 ++
> kernel/power/main.c | 15 ++++++++++++++
> kernel/power/suspend.c | 2 ++
> kernel/time/timekeeping.c | 2 ++
> 9 files changed, 61 insertions(+), 9 deletions(-)
>
Powered by blists - more mailing lists