lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <IA1PR11MB767796138882E4BCF15FEA20BBEFA@IA1PR11MB7677.namprd11.prod.outlook.com>
Date:   Wed, 6 Sep 2023 19:07:56 +0000
From:   "Pawnikar, Sumeet R" <sumeet.r.pawnikar@...el.com>
To:     srinivas pandruvada <srinivas.pandruvada@...ux.intel.com>,
        "Zhang, Rui" <rui.zhang@...el.com>,
        "rafael@...nel.org" <rafael@...nel.org>
CC:     "linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: RE: [PATCH] powercap: intel_rapl: Fix setting of Power Limit 4 to 0



> -----Original Message-----
> From: srinivas pandruvada <srinivas.pandruvada@...ux.intel.com>
> Sent: Wednesday, September 6, 2023 8:06 PM
> To: Zhang, Rui <rui.zhang@...el.com>; rafael@...nel.org
> Cc: linux-pm@...r.kernel.org; Pawnikar, Sumeet R
> <sumeet.r.pawnikar@...el.com>; linux-kernel@...r.kernel.org;
> stable@...r.kernel.org
> Subject: Re: [PATCH] powercap: intel_rapl: Fix setting of Power Limit 4 to 0
> 
> Hi Rui,
> 
> On Wed, 2023-09-06 at 02:20 +0000, Zhang, Rui wrote:
> > Hi, Srinivas,
> >
> > Thanks for the patch.
> >
> > On Tue, 2023-09-05 at 17:06 -0700, Srinivas Pandruvada wrote:
> > > System runs at minimum performance, once powercap RAPL package
> > > domain "enabled" flag is toggled.
> > >
> > > Setting RAPL package domain enabled flag to 0, results in setting of
> > > power limit 4 (PL4) MSR 0x601 to 0.
> >
> > This is the bug. Setting enabled flag should only affect the Enable
> > bit but not the other bits.
> >
> > >  This implies disabling PL4 limit.
> > > The PL4 limit controls the peak power. This can significantly change
> > > the performance. Even worse, when the enabled flag is set to 1
> > > again.
> > > This will set PL4 MSR value to 0x01, which means reduce peak power
> > > to 0.125W. This will force the system to run at the lowest possible
> > > performance.
> > >
> > > This is caused by a change which assumes that there is an enable bit
> > > in the PL4 MSR like other power limits.
> >
> > MSR RAPL doesn't have PL4 enable bit, but TPMI RAPL does have per
> > power limit enable bit.
> That is correct, but not sure that in practice used or not.
> 
> >
> > >
> > > In functions set_floor_freq_default() and rapl_remove_package(),
> > > call
> > > rapl_write_pl_data with PL_ENABLE and PL_CLAMP for only power limit
> > > 1
> > > and 2. Similarly don't read PL_ENABLE for PL4 to check the presence
> > > of
> > > power limit 4.
> >
> > IMO, the rootcause is that we expose an non-exits PL4_ENABLE
> > primitive
> > for MSR Interface, and even worse we expose it wrongly that the
> > primitive uses the power limit bits.
> >
> > Commit 9050a9cd5e4c ("powercap: intel_rapl: Cleanup Power Limits
> > support") pokes the MSR interface PL4_ENABLE primitive and exposes
> > this
> > bug.
> >
> > Sumeet is testing the below fix right now, and I suppose he will give
> > some update soon.
> > 

Testing still going on. 

> > thanks,
> > rui
> >
> > diff --git a/drivers/powercap/intel_rapl_common.c
> > b/drivers/powercap/intel_rapl_common.c
> > index 8fac57b28f8a..d407f918876f 100644
> > --- a/drivers/powercap/intel_rapl_common.c
> > +++ b/drivers/powercap/intel_rapl_common.c
> > @@ -658,8 +658,6 @@ static struct rapl_primitive_info
> > rpi_msr[NR_RAPL_PRIMITIVES] = {
> >                             RAPL_DOMAIN_REG_LIMIT, ARBITRARY_UNIT,
> > 0),
> >         [PL2_CLAMP] = PRIMITIVE_INFO_INIT(PL2_CLAMP,
> > POWER_LIMIT2_CLAMP, 48,
> >                             RAPL_DOMAIN_REG_LIMIT, ARBITRARY_UNIT,
> > 0),
> > -       [PL4_ENABLE] = PRIMITIVE_INFO_INIT(PL4_ENABLE,
> > POWER_LIMIT4_MASK, 0,
> > -                               RAPL_DOMAIN_REG_PL4, ARBITRARY_UNIT,
> > 0),
> >         [TIME_WINDOW1] = PRIMITIVE_INFO_INIT(TIME_WINDOW1,
> > TIME_WINDOW1_MASK, 17,
> >                             RAPL_DOMAIN_REG_LIMIT, TIME_UNIT, 0),
> >         [TIME_WINDOW2] = PRIMITIVE_INFO_INIT(TIME_WINDOW2,
> > TIME_WINDOW2_MASK, 49,
> >
> >
> Let me try this, but this is not enough. The enable/disable is also
> gets checked for presence of PL4.
> 

Yes, we need the check as well.

Thanks,
Sumeet. 

> Thanks,
> Srinivas
> 
> >

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ