[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8fc7d9236bfa968b5f202d458e99e2e729fed343.camel@intel.com>
Date: Wed, 6 Sep 2023 02:20:34 +0000
From: "Zhang, Rui" <rui.zhang@...el.com>
To: "srinivas.pandruvada@...ux.intel.com"
<srinivas.pandruvada@...ux.intel.com>,
"rafael@...nel.org" <rafael@...nel.org>
CC: "linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
"Pawnikar, Sumeet R" <sumeet.r.pawnikar@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [PATCH] powercap: intel_rapl: Fix setting of Power Limit 4 to 0
Hi, Srinivas,
Thanks for the patch.
On Tue, 2023-09-05 at 17:06 -0700, Srinivas Pandruvada wrote:
> System runs at minimum performance, once powercap RAPL package domain
> "enabled" flag is toggled.
>
> Setting RAPL package domain enabled flag to 0, results in setting of
> power limit 4 (PL4) MSR 0x601 to 0.
This is the bug. Setting enabled flag should only affect the Enable bit
but not the other bits.
> This implies disabling PL4 limit.
> The PL4 limit controls the peak power. This can significantly change
> the performance. Even worse, when the enabled flag is set to 1 again.
> This will set PL4 MSR value to 0x01, which means reduce peak power to
> 0.125W. This will force the system to run at the lowest possible
> performance.
>
> This is caused by a change which assumes that there is an enable bit
> in the PL4 MSR like other power limits.
MSR RAPL doesn't have PL4 enable bit, but TPMI RAPL does have per power
limit enable bit.
>
> In functions set_floor_freq_default() and rapl_remove_package(), call
> rapl_write_pl_data with PL_ENABLE and PL_CLAMP for only power limit 1
> and 2. Similarly don't read PL_ENABLE for PL4 to check the presence
> of
> power limit 4.
IMO, the rootcause is that we expose an non-exits PL4_ENABLE primitive
for MSR Interface, and even worse we expose it wrongly that the
primitive uses the power limit bits.
Commit 9050a9cd5e4c ("powercap: intel_rapl: Cleanup Power Limits
support") pokes the MSR interface PL4_ENABLE primitive and exposes this
bug.
Sumeet is testing the below fix right now, and I suppose he will give
some update soon.
thanks,
rui
diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c
index 8fac57b28f8a..d407f918876f 100644
--- a/drivers/powercap/intel_rapl_common.c
+++ b/drivers/powercap/intel_rapl_common.c
@@ -658,8 +658,6 @@ static struct rapl_primitive_info rpi_msr[NR_RAPL_PRIMITIVES] = {
RAPL_DOMAIN_REG_LIMIT, ARBITRARY_UNIT, 0),
[PL2_CLAMP] = PRIMITIVE_INFO_INIT(PL2_CLAMP, POWER_LIMIT2_CLAMP, 48,
RAPL_DOMAIN_REG_LIMIT, ARBITRARY_UNIT, 0),
- [PL4_ENABLE] = PRIMITIVE_INFO_INIT(PL4_ENABLE, POWER_LIMIT4_MASK, 0,
- RAPL_DOMAIN_REG_PL4, ARBITRARY_UNIT, 0),
[TIME_WINDOW1] = PRIMITIVE_INFO_INIT(TIME_WINDOW1, TIME_WINDOW1_MASK, 17,
RAPL_DOMAIN_REG_LIMIT, TIME_UNIT, 0),
[TIME_WINDOW2] = PRIMITIVE_INFO_INIT(TIME_WINDOW2, TIME_WINDOW2_MASK, 49,
Powered by blists - more mailing lists