linux-kernel - Re: [PATCH v2 1/1] ACPI: CPPC: Disable FIE if registers in PCC regions


Message-ID: <64ba1dfb-a475-e667-b59d-57e5d1e5ff1f@arm.com>
Date:   Wed, 10 Aug 2022 15:32:26 +0100
From:   Lukasz Luba <lukasz.luba@....com>
To:     Jeremy Linton <jeremy.linton@....com>
Cc:     rafael@...nel.org, lenb@...nel.org, viresh.kumar@...aro.org,
        robert.moore@...el.com, devel@...ica.org,
        linux-acpi@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-pm@...r.kernel.org, vschneid@...hat.com,
        Ionela Voinescu <ionela.voinescu@....com>,
        Dietmar Eggemann <dietmar.eggemann@....com>
Subject: Re: [PATCH v2 1/1] ACPI: CPPC: Disable FIE if registers in PCC
 regions



On 8/10/22 15:08, Jeremy Linton wrote:
> Hi,
> 
> On 8/10/22 07:29, Lukasz Luba wrote:
>> Hi Jeremy,
>>
>> +CC Valentin since he might be interested in this finding
>> +CC Ionela, Dietmar
>>
>> I have a few comments for this patch.
>>
>>
>> On 7/28/22 23:10, Jeremy Linton wrote:
>>> PCC regions utilize a mailbox to set/retrieve register values used by
>>> the CPPC code. This is fine as long as the operations are
>>> infrequent. With the FIE code enabled though the overhead can range
>>> from 2-11% of system CPU overhead (ex: as measured by top) on Arm
>>> based machines.
>>>
>>> So, before enabling FIE, ensure none of the registers used by
>>> cppc_get_perf_ctrs() are in the PCC region. Furthermore, let's also
>>> add a module parameter which can disable it at boot or module
>>> reload.
>>>
>>> Signed-off-by: Jeremy Linton <jeremy.linton@....com>
>>> ---
>>>   drivers/acpi/cppc_acpi.c       | 41 ++++++++++++++++++++++++++++++++++
>>>   drivers/cpufreq/cppc_cpufreq.c | 19 ++++++++++++----
>>>   include/acpi/cppc_acpi.h       |  5 +++++
>>>   3 files changed, 61 insertions(+), 4 deletions(-)
>>
>>
>> 1. You assume that all platforms would have this big overhead when
>>     they have the PCC regions for this purpose.
>>     Do we know which versions of the HW mailbox have been implemented
>>     and used that show this 2-11% overhead in a platform?
>>     Do more recent MHUs also have such issues, so that we should block
>>     them by default (like in your code)?
> 
> Well, the mailbox nature of PCC pretty much assures it's "slow" relative
> to the alternative of providing an actual register.  If a platform provides
> direct access to, say, MHU registers, then of course they won't actually
> be in a PCC region and the FIE will remain on.
> 
> 
>>
>> 2. I would prefer to simply change the default Kconfig value to 'n' for
>>     ACPI_CPPC_CPUFREQ_FIE, instead of adding runtime check code
>>     which disables it.
>>     We have probably introduced this overhead for older platforms with
>>     this commit:
> 
> The problem here is that these ACPI kernels are being shipped as single
> images in distros which expect them to run on a wide range of platforms
> (including x86/amd in this case), and perform optimally on all of them.
>
> So the 'n' option basically is saying that the latest FIE code doesn't
> provide a benefit anywhere?

How do we define the 'benefit' here? It's more accurate task utilization.
How much better is it vs. the previous approach with the old-style FIE?

TBH, I haven't found any test results from the development of the patch
set. Maybe someone could point me to the test results which demonstrate
this benefit of better utilization.

In the RFC I could find this statement [1]:

"This is tested with some hacks, as I didn't have access to the right
hardware, on the ARM64 hikey board to check the overall functionality
and that works fine."

There should be a rule that such code is tested on a real server with
many CPUs under a stress test.

Ionela, do you have any test results where this new FIE feature
brings a meaningful accuracy improvement to task utilization?

With this overhead measured on a real server platform, I don't think
it's worth keeping it 'y' by default.

The design is heavy, as stated in the commit message:
"    On an invocation of cppc_scale_freq_tick(), we schedule an irq work
     (since we reach here from hard-irq context), which then schedules a
     normal work item and cppc_scale_freq_workfn() updates the per_cpu
     arch_freq_scale variable based on the counter updates since the last
     tick.
"
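
To make the per-tick work more concrete, here is a rough user-space
sketch of the kind of calculation that update performs. It is not the
driver code: the struct, the function name and CAPACITY_SHIFT are made
up for illustration, and I'm assuming the scale is a fixed-point value
where 1024 means "running at highest perf". The point is that every
tick needs a fresh read of both feedback counters, and with PCC each of
those reads goes through the mailbox.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed fixed-point scale: 1 << 10 == running at highest_perf. */
#define CAPACITY_SHIFT 10

/* Hypothetical snapshot of the two CPPC feedback counters. */
struct fb_ctrs {
	uint64_t delivered;
	uint64_t reference;
};

/*
 * Derive a frequency scale factor from the counter deltas between two
 * ticks. If the counters made no progress, keep the previous scale.
 */
static uint64_t freq_scale(struct fb_ctrs prev, struct fb_ctrs now,
			   uint64_t reference_perf, uint64_t highest_perf,
			   uint64_t prev_scale)
{
	uint64_t d_del = now.delivered - prev.delivered;
	uint64_t d_ref = now.reference - prev.reference;
	uint64_t perf, scale;

	assert(highest_perf != 0);

	if (d_del == 0 || d_ref == 0)
		return prev_scale;

	/* Delivered performance over the last tick period. */
	perf = reference_perf * d_del / d_ref;

	/* Normalise against highest_perf and clamp to the maximum. */
	scale = (perf << CAPACITY_SHIFT) / highest_perf;
	if (scale > (1ULL << CAPACITY_SHIFT))
		scale = 1ULL << CAPACITY_SHIFT;

	return scale;
}
```

The arithmetic itself is trivial; the cost is in obtaining the two
counter snapshots on every tick for every CPU.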

As you said, Jeremy, this mailbox will always carry overhead. IMO,
until we can be sure we have some powerful new HW mailbox, this
feature should be disabled.

[1] 
https://lore.kernel.org/lkml/cover.1594289009.git.viresh.kumar@linaro.org/