linux-kernel - Re: [PATCH] cpufreq: intel_pstate: Optimize IO boost in non HWP mode

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87in3j9s07.fsf@riseup.net>
Date:   Wed, 05 Sep 2018 21:20:08 -0700
From:   Francisco Jerez <currojerez@...eup.net>
To:     Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
        Eero Tamminen <eero.t.tamminen@...el.com>, lenb@...nel.org,
        rjw@...ysocki.net, viresh.kumar@...aro.org
Cc:     mgorman@...hsingularity.net, ggherdovich@...e.cz,
        peterz@...radead.org, linux-pm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] cpufreq: intel_pstate: Optimize IO boost in non HWP mode

Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com> writes:

> [...]
>
>> > > 
>> > > This patch causes a number of statistically significant
>> > > regressions
>> > > (with significance of 1%) on the two systems I've tested it
>> > > on.  On
>> > > my
>> > 
>> > Sure. These patches are targeted to Atom clients where some of
>> > these
>> > server like workload may have some minor regression on few watts
>> > TDP
>> > parts.
>> 
>> Neither the 36% regression of fs-mark, the 21% regression of sqlite,
>> nor
>> the 10% regression of warsaw qualify as small.  And most of the test
>> cases on the list of regressions aren't exclusively server-like, if
>> at
>> all.  Warsaw, gtkperf, jxrendermark and lightsmark are all graphics
>> benchmarks -- Latency is as important if not more for interactive
>> workloads than it is for server workloads.  In the case of a conflict
>> like the one we're dealing with right now between optimizing for
>> throughput (e.g. for the maximum number of requests per second) and
>> optimizing for latency (e.g. for the minimum request duration), you
>> are
>> more likely to be concerned about the former than about the latter in
>> a
>> server setup.
>
> Eero,
> Please add your test results here.
>
> No matter which algorithm you do, there will be variations. So you have
> to look at the platforms which you are targeting. For this platform 
> number one item is use of less turbo and hope you know why?

Unfortunately the current controller uses turbo frequently on Atoms for
TDP-limited graphics workloads regardless of IOWAIT boosting.  IOWAIT
boosting simply exacerbated the pre-existing energy efficiency problem.

> On this platform GFX patch caused this issue as it was submitted after
> io boost patchset. So rather that should be reverted first before
> optimizing anything.
>
>
>
>> 
>> > But weighing against reduced TURBO usage (which is enough trigger)
>> > and
>> > improvement in tests done by Eero (which was primary complaint to
>> > us).
>> > 
>> > It is trivial patch. Yes, the patch is not perfect and doesn't
>> > close
>> > doors for any improvements.
>> > 
>> 
>> It's sort of self-contained because it's awfully incomplete.Don't
>> agtr
>
>> 
>> > I see your idea, but how to implement in acceptable way is a
>> > challenge.
>> 
>> Main challenge was getting the code to work without regressions in
>> latency-sensitive workloads, which I did, because you told me that it
>> wasn't acceptable for it to cause any regressions on the Phoronix
>> daily-system-tracker, whether latency-bound or not.
> Yes, because your intention was to have a global change not a low power
> Atom specific,
>

My intention was to target low-power Atoms only since the first day we
discussed this problem.  The cover letter of the series I sent and the
commit messages make this fairly clear.

>>   What is left in
>> order to address Peter's concerns is for the most part plumbing,
>> that's
>> guaranteed not to have any functional impact on the heuristic.  The
>> fact
>> that we don't expect it to change the performance of the system makes
>> it
>> substantially less time-consuming to validate than altering the
>> performance trade-offs of the heuristic dynamically in order to avoid
>> regressions (which is what has kept my systems busy most of the time
>> lately).  Seems like my series, even in its current state without the
>> changes requested by Peter is closer to being ready for production
>> than
>> this patch is.
>
> Sorry, Not at all. We call such patches as experimental series.

The numbers speak otherwise.

> You caused 100% regression to idle power. There is no version 2 after
> that, even if you fixed locally even to look.
>

I did post a link to a v2 that fixed the idle power issue on the v1
thread, but I didn't intend to send it for review yet.  I'll send it out
once I've fully taken into account Peter's feedback.

> Thanks,
> Srinivas


Download attachment "signature.asc" of type "application/pgp-signature" (228 bytes)