Message-ID: <20200626084903.GA27151@zn.tnic>
Date: Fri, 26 Jun 2020 10:49:03 +0200
From: Borislav Petkov <bp@...en8.de>
To: Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
Cc: rjw@...ysocki.net, viresh.kumar@...aro.org, lenb@...nel.org,
dsmythies@...us.net, tglx@...utronix.de, mingo@...hat.com,
hpa@...or.com, peterz@...radead.org, linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org, x86@...nel.org
Subject: Re: [UPDATE][PATCH v3 1/2] cpufreq: intel_pstate: Allow
enable/disable energy efficiency
On Thu, Jun 25, 2020 at 03:49:31PM -0700, Srinivas Pandruvada wrote:
> By default intel_pstate driver disables energy efficiency by setting
> MSR_IA32_POWER_CTL bit 19 for Kaby Lake desktop CPU model in HWP mode.
> This CPU model is also shared by Coffee Lake desktop CPUs. This allows
> these systems to reach maximum possible frequency. But this adds power
> penalty, which some customers don't want. They want some way to enable/
> disable dynamically.
>
> So, add an additional attribute "energy_efficiency_enable" under
> /sys/devices/system/cpu/intel_pstate/ for these CPU models. This allows
> to read and write bit 19 ("Disable Energy Efficiency Optimization") in
> the MSR IA32_POWER_CTL.
Yes, this is how functionality behind MSRs should be made available to
userspace - not poking at naked MSRs. Good.
> This attribute is present in both HWP and non-HWP mode as this has an
> effect in both modes. Refer to Intel Software Developer's manual for
> details. The scope of this bit is package wide. Also these systems
> support only one package. So read/write MSR on the current CPU is
> enough.
>
> Suggested-by: Len Brown <lenb@...nel.org>
> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
> ---
> v3 update
> Moved the MSR bit definition to msr-index.h from intel_pstate.c as Doug
> wanted. Offline checking with Borislav, for MSR defintion it is
> fine to move to msr-index.h even for single user of the definition. But
> here the MSR definition is already in msr-index.h, but adding the MSR bit
> definition also.
Yes.
Btw, no need for the "offline checking" - you can do this on the mailing
list just fine.
> Documentation/admin-guide/pm/intel_pstate.rst | 9 ++++
> arch/x86/include/asm/msr-index.h | 1 +
> drivers/cpufreq/intel_pstate.c | 47 ++++++++++++++++++-
> 3 files changed, 55 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/admin-guide/pm/intel_pstate.rst b/Documentation/admin-guide/pm/intel_pstate.rst
> index 39d80bc29ccd..1ca2684a94d7 100644
> --- a/Documentation/admin-guide/pm/intel_pstate.rst
> +++ b/Documentation/admin-guide/pm/intel_pstate.rst
> @@ -431,6 +431,15 @@ argument is passed to the kernel in the command line.
> supported in the current configuration, writes to this attribute will
> fail with an appropriate error.
>
> +``energy_efficiency_enable``
> + This attribute is only present on platforms, which has CPUs matching
s/which has/which have/
> + Kaby Lake or Coffee Lake desktop CPU model. By default
> + "energy_efficiency" is disabled on these CPU models in HWP mode by this
> + driver. Enabling energy efficiency may limit maximum operating
> + frequency in both HWP and non HWP mode. In non HWP mode, this attribute
> + has an effect in turbo range only. But in HWP mode, this attribute also
> + has an effect in non turbo range.
Those last two sentences could be simplified - they read strangely.
> +
> Interpretation of Policy Attributes
> -----------------------------------
>
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index e8370e64a155..fec86ad14f8d 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -254,6 +254,7 @@
> #define MSR_PEBS_FRONTEND 0x000003f7
>
> #define MSR_IA32_POWER_CTL 0x000001fc
> +#define MSR_IA32_POWER_CTL_BIT_EE 19
Sort that MSR in - I know, the rest is not sorted either but we can
start somewhere. So pls put it...
#define MSR_LBR_SELECT 0x000001c8
#define MSR_LBR_TOS 0x000001c9
<--- here.
#define MSR_LBR_NHM_FROM 0x00000680
#define MSR_LBR_NHM_TO 0x000006c0
> #define MSR_IA32_MC0_CTL 0x00000400
> #define MSR_IA32_MC0_STATUS 0x00000401
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index 8e23a698ce04..daa1d9c12098 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -1218,6 +1218,42 @@ static ssize_t store_hwp_dynamic_boost(struct kobject *a,
> return count;
> }
>
> +static ssize_t show_energy_efficiency_enable(struct kobject *kobj,
> + struct kobj_attribute *attr,
> + char *buf)
> +{
> + u64 power_ctl;
> + int enable;
> +
> + rdmsrl(MSR_IA32_POWER_CTL, power_ctl);
> + enable = (power_ctl & BIT(MSR_IA32_POWER_CTL_BIT_EE)) >> MSR_IA32_POWER_CTL_BIT_EE;
So you can simplify to:
enable = !!(power_ctl & BIT(MSR_IA32_POWER_CTL_BIT_EE));
methinks.
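To illustrate that the two forms agree, here is a minimal userspace sketch. BIT() is redefined locally as a stand-in for the kernel macro, and the two helper names are made up for the comparison; they are not part of the patch.

```c
#include <stdint.h>

/* Local stand-ins for the kernel's BIT() and the bit number from the patch. */
#define BIT(n)				(1ULL << (n))
#define MSR_IA32_POWER_CTL_BIT_EE	19

/* Form used in the patch: mask the bit, then shift it down to bit 0. */
static int ee_bit_shift(uint64_t power_ctl)
{
	return (power_ctl & BIT(MSR_IA32_POWER_CTL_BIT_EE)) >> MSR_IA32_POWER_CTL_BIT_EE;
}

/* Suggested form: double negation collapses any non-zero value to 1. */
static int ee_bit_bang(uint64_t power_ctl)
{
	return !!(power_ctl & BIT(MSR_IA32_POWER_CTL_BIT_EE));
}
```

Both return 0 or 1 only; the second one simply doesn't need the bit number repeated.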
> + return sprintf(buf, "%d\n", !enable);
If this bit is called
"Disable Energy Efficiency Optimization"
why do you call your function and sysfs file "enable"? This is making it
more confusing.
Why don't you call it simply: "energy_efficiency" and have it intuitive:
1 - enabled
0 - disabled
?
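Roughly what I mean, as a userspace sketch of the mapping only. The helper names ee_enabled() and ee_apply() are hypothetical, BIT() is a local stand-in for the kernel macro, and the MSR access itself is left out. The one place where the "disable" polarity of the hardware bit shows up is inside these two helpers:

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the kernel's BIT() and the bit number from the patch. */
#define BIT(n)				(1ULL << (n))
#define MSR_IA32_POWER_CTL_BIT_EE	19

/* Hypothetical helper: value to show in sysfs, 1 = enabled, 0 = disabled. */
static int ee_enabled(uint64_t power_ctl)
{
	/* The hardware bit means "disable", so invert it exactly once here. */
	return !(power_ctl & BIT(MSR_IA32_POWER_CTL_BIT_EE));
}

/* Hypothetical helper: compute the new MSR value for a requested state. */
static uint64_t ee_apply(uint64_t power_ctl, bool enable)
{
	if (enable)
		return power_ctl & ~BIT(MSR_IA32_POWER_CTL_BIT_EE);

	return power_ctl | BIT(MSR_IA32_POWER_CTL_BIT_EE);
}
```

All other bits of the MSR value pass through untouched.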
> +static ssize_t store_energy_efficiency_enable(struct kobject *a,
> + struct kobj_attribute *b,
> + const char *buf, size_t count)
> +{
> + u64 power_ctl;
> + u32 input;
> + int ret;
> +
> + ret = kstrtouint(buf, 10, &input);
> + if (ret)
> + return ret;
> +
> + mutex_lock(&intel_pstate_driver_lock);
> + rdmsrl(MSR_IA32_POWER_CTL, power_ctl);
> + if (input)
This is too lax - it will be enabled for any !0 value. Please accept
only 0 and 1.
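In the driver that is probably just one more range check on input after the kstrtouint() call. As a self-contained userspace sketch of the behaviour I have in mind (parse_ee_input() is a made-up name, and strtoul() stands in for the kernel parser, so the corner cases are not identical):

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the store path: accept "0" or "1" only. */
static int parse_ee_input(const char *buf, unsigned int *val)
{
	char *end;
	unsigned long v;

	errno = 0;
	v = strtoul(buf, &end, 10);
	if (errno || end == buf)
		return -EINVAL;

	/* Tolerate the single trailing newline that "echo" appends. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	/* The strict part: any value other than 0 or 1 is an error. */
	if (v > 1)
		return -EINVAL;

	*val = (unsigned int)v;
	return 0;
}
```

With that, writing 2 (or 42) fails with -EINVAL instead of silently being treated as 1.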
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette