Message-ID: <dd301065-a9ec-0918-daa4-596245baae00@linux.intel.com>
Date:   Tue, 27 Jun 2023 17:57:36 -0700
From:   Yang Jie <yang.jie@...ux.intel.com>
To:     Thomas Gleixner <tglx@...utronix.de>, x86@...nel.org,
        linux-kernel@...r.kernel.org
Cc:     Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "H . Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Yair Podemsky <ypodemsk@...hat.com>, linux-pm@...r.kernel.org,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Doug Smythies <dsmythies@...us.net>
Subject: Re: [PATCH] x86/aperfmperf: Fix the fallback condition in
 arch_freq_get_on_cpu()


Sorry for top-posting; I should have sent this to linux-pm and the maintainers.

Doug Smythies had a good discussion with me about the related history 
and issues in the bugzilla here: 
https://bugzilla.kernel.org/show_bug.cgi?id=217597.

Basically, there are 2 issues here per my observation:
1. Is cpu_khz shared by all CPU cores? On Intel's recent hybrid CPUs, 
what does the cpu_khz read from CPUID really mean? I am seeing 
cpu_khz=3.6GHz for E-cores with a max frequency of 3GHz. We should fix that, no?
2. We don't want to wake up cores just because of sysfs queries, so 
we introduced a fallback mechanism here. What is our clear design for that?

So, before discussing those issues, we should get alignment on these first:
1. What the fallback is and when we should fall back. From the comment, it 
looks like we wanted to use cpu_khz for cores that haven't executed any task 
during the last 20ms. This sounds reasonable, and my patch here is to address 
this issue.
2. What frequency we should show in the fallback case. This could be 
controversial: 0? min_freq? base_freq? Or the last calculated one? Doug has a 
suggestion here, but this is not touched in my patch.
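
To make the two questions above concrete, here is a minimal user-space sketch, not the kernel code. It assumes a hypothetical tick rate (TICKS_PER_SEC), a hypothetical struct sample holding the APERF/MPERF deltas, and a caller-supplied fallback_khz so that the "which value to show" policy stays open. The age threshold is converted from milliseconds to ticks before comparing, so both sides of the comparison use the same unit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not the kernel's definitions. */
enum { TICKS_PER_SEC = 250, MAX_AGE_MS = 20 };

struct sample {
	uint64_t acnt;      /* APERF delta of the last sample */
	uint64_t mcnt;      /* MPERF delta of the last sample */
	uint64_t last_tick; /* tick count when the sample was taken */
};

/* Convert milliseconds to ticks so ages are compared in a single unit. */
static uint64_t ms_to_ticks(uint64_t ms)
{
	return ms * TICKS_PER_SEC / 1000;
}

/*
 * Return the estimated frequency in kHz, or fallback_khz when the sample
 * is invalid (mcnt == 0) or older than MAX_AGE_MS. What fallback_khz
 * should be (0, min_freq, base_freq, last value) is left to the caller.
 */
static unsigned int freq_khz(const struct sample *s, uint64_t now_tick,
			     unsigned int cpu_khz, unsigned int fallback_khz)
{
	assert(s != NULL);

	if (s->mcnt == 0 || now_tick - s->last_tick > ms_to_ticks(MAX_AGE_MS))
		return fallback_khz;

	return (unsigned int)((uint64_t)cpu_khz * s->acnt / s->mcnt);
}
```

With these assumed numbers, 20ms is 5 ticks. A fresh sample with acnt/mcnt = 5/6 and cpu_khz = 3600000 gives 3000000 kHz, while a sample more than 5 ticks old returns whatever fallback value the caller chose.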

Thanks,
~Keyon

On 6/26/23 12:36, Keyon Jie wrote:
> Since commit f3eca381bd49, the fallback condition for 'the last update
> was too long ago' has been comparing ticks and milliseconds by mistake,
> so the condition is met and the fallback method is used frequently.
> 
> Comparing ticks here corrects that and fixes the related issues that
> have been seen on x86 platforms since the 5.18 kernel.
> 
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217597
> Fixes: f3eca381bd49 ("x86/aperfmperf: Replace arch_freq_get_on_cpu()")
> CC: Thomas Gleixner <tglx@...utronix.de>
> Signed-off-by: Keyon Jie <yang.jie@...ux.intel.com>
> ---
>   arch/x86/kernel/cpu/aperfmperf.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/aperfmperf.c b/arch/x86/kernel/cpu/aperfmperf.c
> index fdbb5f07448f..24e24e137226 100644
> --- a/arch/x86/kernel/cpu/aperfmperf.c
> +++ b/arch/x86/kernel/cpu/aperfmperf.c
> @@ -432,7 +432,7 @@ unsigned int arch_freq_get_on_cpu(int cpu)
>   	 * Bail on invalid count and when the last update was too long ago,
>   	 * which covers idle and NOHZ full CPUs.
>   	 */
> -	if (!mcnt || (jiffies - last) > MAX_SAMPLE_AGE)
> +	if (!mcnt || (jiffies - last) > MAX_SAMPLE_AGE * cpu_khz)
>   		goto fallback;
>   
>   	return div64_u64((cpu_khz * acnt), mcnt);
