Date:   Tue, 25 Aug 2020 08:06:09 -0700
From:   Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
To:     "Rafael J. Wysocki" <rafael@...nel.org>,
        Artem Bityutskiy <dedekind1@...il.com>
Cc:     "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Linux PM <linux-pm@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Doug Smythies <dsmythies@...us.net>
Subject: Re: [PATCH v2 2/5] cpufreq: intel_pstate: Always return last EPP
 value from sysfs

On Tue, 2020-08-25 at 16:51 +0200, Rafael J. Wysocki wrote:
> On Tue, Aug 25, 2020 at 8:20 AM Artem Bityutskiy <dedekind1@...il.com> wrote:
> > On Mon, 2020-08-24 at 19:42 +0200, Rafael J. Wysocki wrote:
> > > From: "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>
> > > 
> > > Make the energy_performance_preference policy attribute in sysfs
> > > always return the last EPP value written to it instead of the one
> > > currently in the HWP Request MSR to avoid possible confusion when
> > > the performance scaling algorithm is used in the active mode with
> > > HWP enabled (in which case the EPP is forced to 0 regardless of
> > > what value it has been set to via sysfs).
> > 
> > Why is this a good idea, I wonder. If there was a prior discussion,
> > please, point to it.
> > 
> > The general approach to changing settings via sysfs is often like
> > this:
> > 
> > 1. Write new value.
> > 2. Read it back and verify that it is the same. Because there is no
> > better way to verify that the kernel "accepted" the value.
> 
> If the write is successful (i.e. no errors returned and the value
> returned is equal to the number of written characters), the kernel
> *has* accepted the written value, but it may not have taken effect.
> These are two different things.
> 
> The written value may take effect immediately or it may take effect
> later, depending on the current configuration etc.  If you don't see
> the effect of it immediately, it doesn't mean that there was a
> failure of some sort.
> 
> > Let's say I write 'balanced' to energy_performance_preference. I read
> > it back, and it contains 'balanced', so I am happy, I trust the kernel
> > changed EPP to "balanced".
> > 
> > If the kernel, in fact, uses something else, I want to know about it
> > and have my script fail.
> 
> Why do you want it to fail then?
> 
> > Why is caching the value and making my script _think_ it succeeded
> > a good idea?
> 
> Because when you change the scaling algorithm or the driver's
> operation mode, the value you have written will take effect.
> 
> In this particular case it is explained in the driver documentation
> that the performance scaling algorithm in the active mode overrides
> the sysfs value and that's the only case when it can be overridden.
> So whatever you write to this attribute will not take effect
> immediately anyway, but it may take effect later.

In some cases this happens even without changing between active and
passive mode, when there was some error previously. For example:

# cat energy_performance_preference
127
[root@...pl-perf-test-skx-i9 cpufreq]# rdmsr -p 1 0x774
8000ff00
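
To make the comparison concrete, here is a minimal sketch that splits
the rdmsr value above into its fields. It assumes the IA32_HWP_REQUEST
(MSR 0x774) layout from the Intel SDM, with EPP in bits 31:24; the
helper name decode_hwp_request is made up for illustration and is not
part of any tool.

```python
# Low 32 bits of IA32_HWP_REQUEST (MSR 0x774), per the Intel SDM:
# field name -> (bit offset, width in bits).
HWP_REQUEST_FIELDS = {
    "min_perf": (0, 8),
    "max_perf": (8, 8),
    "desired_perf": (16, 8),
    "epp": (24, 8),
}


def decode_hwp_request(value: int) -> dict:
    """Split a raw HWP request value into its 8-bit fields (hypothetical helper)."""
    return {
        name: (value >> shift) & ((1 << width) - 1)
        for name, (shift, width) in HWP_REQUEST_FIELDS.items()
    }


if __name__ == "__main__":
    # Value printed by "rdmsr -p 1 0x774" in the example above.
    for name, field in decode_hwp_request(0x8000FF00).items():
        print(f"{name:>12}: {field:#04x} ({field})")
```

With this layout the EPP field of 8000ff00 decodes to 0x80 (128),
while the sysfs read above returned 127.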

I think we should show reality. A mode change can be treated as a
special case, where the stored value is used to restore EPP in the new
mode.
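
As a rough illustration of that suggestion, here is a toy model in
Python. It is not driver code: the class EppModel and its methods are
hypothetical, and it only assumes what the patch description says,
namely that the performance algorithm in active mode forces EPP to 0.

```python
class EppModel:
    """Toy model of the suggested behavior; not intel_pstate code."""

    def __init__(self, epp: int) -> None:
        self.stored_epp = epp        # last value written via sysfs
        self.hw_epp = epp            # value assumed to be in the HWP Request MSR
        self.performance_active = False

    def write(self, epp: int) -> None:
        # Always remember the request; apply it only if nothing overrides it.
        self.stored_epp = epp
        if not self.performance_active:
            self.hw_epp = epp

    def read(self) -> int:
        # "Show reality": report what the hardware is using.
        return self.hw_epp

    def set_performance_active(self, on: bool) -> None:
        # Mode change is the special case: force 0 on entry,
        # restore the stored value on exit.
        self.performance_active = on
        self.hw_epp = 0 if on else self.stored_epp


if __name__ == "__main__":
    m = EppModel(128)
    m.set_performance_active(True)
    m.write(192)
    print(m.read())   # hardware value while the override is in place
    m.set_performance_active(False)
    print(m.read())   # stored value restored after the mode change
```

In this model a read never returns a value that the hardware is not
using, and the written value is still not lost across a mode change.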

Thanks,
Srinivas

> > In other words, in my usage scenarios at least, I prefer the kernel
> > telling the true EPP value, not some "cached, but not used" value.
> 
> An alternative is to fail writes to energy_performance_preference if
> the driver works in the active mode and the scaling algorithm for the
> scaling CPU is performance and *then* to make reads from it return the
> value in the register.
> 
> Accepting a write and returning a different value in a subsequent read
> is confusing.
> 
> Thanks!
