linux-kernel - Re: [PATCH v3 0/3] Add support for AArch64 AMUv1-based arch_freq_get_on

Message-ID: <5bdlm4kzni6x2bdy7kmmomf7cmyohjhr4nr7v2mb2pchuhkulj@moakmpptnbg5>
Date: Mon, 25 Mar 2024 09:10:26 -0700
From: Vanshidhar Konda <vanshikonda@...amperecomputing.com>
To: Beata Michalska <beata.michalska@....com>
Cc: linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org, 
	ionela.voinescu@....com, sudeep.holla@....com, will@...nel.org, catalin.marinas@....com, 
	vincent.guittot@...aro.org, sumitg@...dia.com, yang@...amperecomputing.com, 
	lihuisong@...wei.com
Subject: Re: [PATCH v3 0/3] Add support for AArch64 AMUv1-based
 arch_freq_get_on_cpu

On Tue, Mar 12, 2024 at 08:34:28AM +0000, Beata Michalska wrote:
>Introducing arm64 specific version of arch_freq_get_on_cpu, cashing on
>existing implementation for FIE and AMUv1 support: the frequency scale
>factor, updated on each sched tick, serves as a base for retrieving
>the frequency for a given CPU, representing an average frequency
>reported between the ticks - thus its accuracy is limited.
>
>The changes have been rather lightly (due to some limitations) tested on
>an FVP model.
>

I tested these changes on an Ampere system. The results from reading
scaling_cur_freq look reasonable in the majority of cases I tested. I
only saw unexpected behavior on cores that were configured as
nohz_full.

I observed the unexpected behavior when testing as follows:
1. Run stress on all cores
    stress-ng --cpu 186 --timeout 10m --metrics-brief
2. Observe scaling_cur_freq and cpuinfo_cur_freq for all cores
    scaling_cur_freq values were within a few % of cpuinfo_cur_freq
3. Kill stress test
4. Observe scaling_cur_freq and cpuinfo_cur_freq for all cores
    scaling_cur_freq values were within a few % of cpuinfo_cur_freq for
    most cores, except the ones configured as nohz_full.

nohz_full = 122-127 (frequencies in kHz)
core   scaling_cur_freq  cpuinfo_cur_freq
[122]: 2997070           1000000
[123]: 2997070           1000000
[124]: 3000038           1000000
[125]: 2997070           1000000
[126]: 2997070           1000000
[127]: 2997070           1000000
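
A comparison like the one in the table above could be scripted roughly as
follows. This is only a sketch: the helper names and the 5% tolerance are
made up for illustration (the message only says "within a few %"), and
cpuinfo_cur_freq is usually readable by root only.

```python
import os
from pathlib import Path


def read_khz(path):
    """Return the integer kHz value stored in a sysfs file, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def find_mismatches(cpus, tolerance=0.05, root="/sys/devices/system/cpu"):
    """List (cpu, scaling_cur_freq, cpuinfo_cur_freq) where the two values differ
    by more than `tolerance` relative to cpuinfo_cur_freq.

    `tolerance` is a hypothetical threshold, not a value from the original test.
    """
    mismatches = []
    for cpu in cpus:
        base = Path(root) / f"cpu{cpu}" / "cpufreq"
        scaling = read_khz(base / "scaling_cur_freq")
        cpuinfo = read_khz(base / "cpuinfo_cur_freq")
        # Skip CPUs whose files are missing or unreadable (e.g. not running as root).
        if scaling is None or cpuinfo is None or cpuinfo == 0:
            continue
        if abs(scaling - cpuinfo) / cpuinfo > tolerance:
            mismatches.append((cpu, scaling, cpuinfo))
    return mismatches


if __name__ == "__main__":
    for cpu, scaling, cpuinfo in find_mismatches(range(os.cpu_count() or 0)):
        print(f"[{cpu}]: {scaling:<17} {cpuinfo}")
```

Running it once under load and again a few seconds after killing the stress
test would show whether only the nohz_full cores keep reporting the old value.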

These values persisted for multiple seconds. I suspect the cores
entered WFI and the frequency scale factor was not updated while those
cores were idle.
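
The suspected mechanism can be illustrated with a toy model. This is not the
kernel code: the class and method names are hypothetical, and the 3000000 kHz
maximum frequency is an assumption that the message does not state. The only
point is that a value derived from a tick-updated scale factor stays frozen
while no tick runs.

```python
SCHED_CAPACITY_SHIFT = 10  # fixed-point shift; a scale of 1024 means "at max frequency"


class TickScaledCpu:
    """Toy model of a CPU whose reported frequency derives from a scale factor
    that is refreshed only on the scheduler tick (hypothetical names)."""

    def __init__(self, max_freq_khz):
        self.max_freq_khz = max_freq_khz
        self.scale = 0  # no tick has run yet

    def tick(self, actual_freq_khz):
        # Refresh the scale factor from the frequency seen over the last tick
        # period, capped at the maximum scale.
        scale = (actual_freq_khz << SCHED_CAPACITY_SHIFT) // self.max_freq_khz
        self.scale = min(scale, 1 << SCHED_CAPACITY_SHIFT)

    def reported_freq_khz(self):
        # Value derived from the last scale factor, however old it is.
        return (self.scale * self.max_freq_khz) >> SCHED_CAPACITY_SHIFT


cpu = TickScaledCpu(max_freq_khz=3_000_000)  # assumed max frequency
cpu.tick(3_000_000)                  # under load: tick runs, scale is refreshed
busy = cpu.reported_freq_khz()
# Load stops and the core idles with the tick stopped: tick() is never called,
# so the report stays at the last busy value even if the hardware
# frequency has dropped.
stale = cpu.reported_freq_khz()
cpu.tick(1_000_000)                  # the next tick finally refreshes the scale
fresh = cpu.reported_freq_khz()
print(busy, stale, fresh)
```

In this model the stale reading only corrects itself once a tick runs again on
that core, which would be consistent with the values persisting for several
seconds on idle nohz_full cores.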

Thanks,
Vanshi