Message-ID: <56150451.4060505@redhat.com>
Date: Wed, 07 Oct 2015 07:38:57 -0400
From: Prarit Bhargava <prarit@...hat.com>
To: "Rafael J. Wysocki" <rjw@...ysocki.net>
CC: Kristen Carlson Accardi <kristen@...ux.intel.com>,
 linux-kernel@...r.kernel.org, Viresh Kumar <viresh.kumar@...aro.org>,
 linux-pm@...r.kernel.org
Subject: Re: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct
 on governor switch

On 10/06/2015 07:06 PM, Rafael J. Wysocki wrote:
> On Wednesday, October 07, 2015 12:43:55 AM Rafael J. Wysocki wrote:
>> On Tuesday, October 06, 2015 05:49:07 PM Prarit Bhargava wrote:
>>> Intel CPUs will not enter higher p-states after switching from the
>>> performance governor to the powersave governor, until
>>> /sys/devices/system/cpu/intel_pstate/min_perf_pct is set to a low value.
>>> This differs from previous behaviour in which a switch to the powersave
>>> governor would result in a low default value for min_perf_pct.
>>>
>>> The behavior of the powersave governor changed after commit a04759924e25
>>> ("[cpufreq] intel_pstate: honor user space min_perf_pct override on
>>> resume"). The commit introduced tracking of performance percentage
>>> changes via sysfs in order to restore userspace changes during
>>> suspend/resume. The problem occurs because the global values of the newly
>>> introduced max_sysfs_pct and min_sysfs_pct are not reset on a governor
>>> change and this causes the new governor to inherit the previous governor's
>>> settings.
>>>
>>> This patch sets max_sysfs_pct to 100 and min_sysfs_pct to 0 on a governor
>>> change which fixes the problem with governor switching. These changes
>>> also make the initial calculations for max_perf_pct and min_perf_pct
>>> slightly simpler.
>>>
>>> Before patch:
>>> [root@...el-skylake-y-01 power]# cpupower frequency-set -g performance
>>> [root@...el-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct
>>> 100
>>> [root@...el-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
>>> 100
>>> [root@...el-skylake-y-01 power]# cpupower frequency-set -g powersave
>>> [root@...el-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct
>>> 100
>>> [root@...el-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
>>> 100
>>>
>>> After patch:
>>> [root@...el-skylake-y-01 power]# cpupower frequency-set -g performance
>>> [root@...el-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct
>>> 100
>>> [root@...el-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
>>> 100
>>> [root@...el-skylake-y-01 power]# cpupower frequency-set -g powersave
>>> [root@...el-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct
>>> 14
>>> [root@...el-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
>>> 100
>>>
>>> Also note that I have tested suspend/resume (using CONFIG_PM_DEBUG):
>>> [root@...el-skylake-y-01 power]# echo 50 > /sys/devices/system/cpu/intel_pstate/min_perf_pct
>>> [root@...el-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/*_perf_pct
>>> 100
>>> 50
>>> [root@...el-skylake-y-01 power]# echo devices > /sys/power/pm_test
>>> [root@...el-skylake-y-01 power]# echo platform > /sys/power/disk
>>> [root@...el-skylake-y-01 power]# echo disk > /sys/power/state
>>> [root@...el-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/*_perf_pct
>>> 100
>>> 50
>>>
>>> Fixes: a04759924e25 ("[cpufreq] intel_pstate: honor user space min_perf_pct override on resume")
>>> Cc: Kristen Carlson Accardi <kristen@...ux.intel.com>
>>> Cc: "Rafael J. Wysocki" <rjw@...ysocki.net>
>>> Cc: Viresh Kumar <viresh.kumar@...aro.org>
>>> Cc: linux-pm@...r.kernel.org
>>> Signed-off-by: Prarit Bhargava <prarit@...hat.com>
>>> ---
>>>  drivers/cpufreq/intel_pstate.c | 7 +++++--
>>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
>>> index 3af9dd7..bb24458 100644
>>> --- a/drivers/cpufreq/intel_pstate.c
>>> +++ b/drivers/cpufreq/intel_pstate.c
>>> @@ -986,6 +986,9 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy)
>>>      if (!policy->cpuinfo.max_freq)
>>>          return -ENODEV;
>>>
>>> +    limits.min_sysfs_pct = 0;
>>> +    limits.max_sysfs_pct = 100;
>>> +
>>>      if (policy->policy == CPUFREQ_POLICY_PERFORMANCE &&
>>>          policy->max >= policy->cpuinfo.max_freq) {
>>>          limits.min_policy_pct = 100;
>>> @@ -1004,9 +1007,9 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy)
>>>      limits.max_policy_pct = clamp_t(int, limits.max_policy_pct, 0 , 100);
>>>
>>>      /* Normalize user input to [min_policy_pct, max_policy_pct] */
>>> -    limits.min_perf_pct = max(limits.min_policy_pct, limits.min_sysfs_pct);
>>> +    limits.min_perf_pct = limits.min_policy_pct;
>>>      limits.min_perf_pct = min(limits.max_policy_pct, limits.min_perf_pct);
>>> -    limits.max_perf_pct = min(limits.max_policy_pct, limits.max_sysfs_pct);
>>> +    limits.max_perf_pct = limits.max_sysfs_pct;
>
> On a second thought, isn't that always 100?  If so, doesn't it basically discard
> limits.max_policy_pct?

Looking at it, yes.  And that's definitely an unintended consequence of this
patch :).  I'll take a closer look.  I thought it should be permissible to set
a range of (min_perf_pct, max_perf_pct) while changing p-states and I thought
the purpose of max_perf_pct was to set the higher percentage limit.

P.
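
To make the point about max_policy_pct concrete, here is a rough userspace sketch of the two normalization variants. It models only the lines visible in the quoted hunk; `struct perf_limits`, `min_int`, `max_int`, and both function names are hypothetical stand-ins for illustration, not the driver's actual definitions, and any code following the quoted hunk is not represented.

```c
/*
 * Hypothetical stand-in for the driver's global limits. Only the fields
 * named in the quoted diff are modeled here.
 */
struct perf_limits {
    int min_policy_pct;
    int max_policy_pct;
    int min_sysfs_pct;
    int max_sysfs_pct;
    int min_perf_pct;
    int max_perf_pct;
};

/* Plain helpers standing in for the kernel's min()/max() macros. */
static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

/* Mirrors the removed ('-') lines: stale sysfs values feed the result. */
static void normalize_before_patch(struct perf_limits *l)
{
    l->min_perf_pct = max_int(l->min_policy_pct, l->min_sysfs_pct);
    l->min_perf_pct = min_int(l->max_policy_pct, l->min_perf_pct);
    l->max_perf_pct = min_int(l->max_policy_pct, l->max_sysfs_pct);
}

/*
 * Mirrors the added ('+') lines: the sysfs values are reset first, so
 * max_perf_pct ends up at 100 no matter what max_policy_pct holds.
 */
static void normalize_after_patch(struct perf_limits *l)
{
    l->min_sysfs_pct = 0;
    l->max_sysfs_pct = 100;

    l->min_perf_pct = l->min_policy_pct;
    l->min_perf_pct = min_int(l->max_policy_pct, l->min_perf_pct);
    l->max_perf_pct = l->max_sysfs_pct;
}
```

With a stale min_sysfs_pct of 100 and a policy minimum of 14, the first variant yields min_perf_pct = 100 and the second yields 14, which lines up with the before/after output in the changelog. With a policy maximum below 100 (60, say, an illustrative value), the second variant still reports max_perf_pct = 100, which is the discard Rafael points out.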