linux-kernel - Re: [PATCH] x86/aperfmperf: Fix arch_scale_freq

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <5c6491f72aa134c0470fea2c742ed14b198f051e.camel@redhat.com>
Date:   Tue, 06 Sep 2022 15:16:50 +0300
From:   ypodemsk@...hat.com
To:     x86@...nel.org, tglx@...utronix.de, mingo@...hat.com,
        peterz@...radead.org, rafael.j.wysocki@...el.com, pauld@...hat.com,
        frederic@...nel.org, ggherdovich@...e.cz,
        linux-kernel@...r.kernel.org, lenb@...nel.org, vschneid@...hat.com,
        jlelli@...hat.com, mtosatti@...hat.com, ppandit@...hat.com,
        alougovs@...hat.com, lcapitul@...hat.com, nsaenz@...nel.org
Subject: Re: [PATCH] x86/aperfmperf: Fix arch_scale_freq_tick() on tickless
 systems

Friendly ping?

On Thu, 2022-08-04 at 16:17 +0300, Yair Podemsky wrote:
> In order for the scheduler to be frequency invariant we measure the
> ratio between the maximum cpu frequency and the actual cpu frequency.
> During long tickless periods of time the calculations that keep track
> of that might overflow, in the function scale_freq_tick():
> 
> if (check_shl_overflow(acnt, 2*SCHED_CAPACITY_SHIFT, &acnt))
> »       goto error;
> 
> eventually forcing the kernel to disable the feature with the
> message "Scheduler frequency invariance went wobbly, disabling!".
> Let's avoid that by detecting long tickless periods and bypassing
> the calculation for that tick.
> 
> This calculation updates the value of arch_freq_scale, used by the
> capacity-aware scheduler to correct cpu duty cycles:
> task_util_freq_inv(p) = duty_cycle(p) * (curr_frequency(cpu) /
> max_frequency(cpu))
> 
> However Consider a long tickless period, It takes should take 60
> minutes
> for a tickless CPU running at 5GHz to trigger the acnt overflow,
> pick 10 minutes as a staleness threshold to be on the safe side,
> In our testing it took over 30 minutes for the overflow to happen,
> but since it's frequency/platform dependent we choose a smaller value
> to be on the safe side.
> 
> Fixes: e2b0d619b400 ("x86, sched: check for counters overflow in
> frequency invariant accounting")
> Signed-off-by: Yair Podemsky <ypodemsk@...hat.com>
> ---
>  arch/x86/kernel/cpu/aperfmperf.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/aperfmperf.c
> b/arch/x86/kernel/cpu/aperfmperf.c
> index 1f60a2b27936..dfe356034a60 100644
> --- a/arch/x86/kernel/cpu/aperfmperf.c
> +++ b/arch/x86/kernel/cpu/aperfmperf.c
> @@ -23,6 +23,13 @@
>  
>  #include "cpu.h"
>  
> +/*
> + * Samples older then 10 minutes should not be proccessed,
> + * This time is long enough to prevent unneeded drops of data
> + * But short enough to prevent overflows
> + */
> +#define MAX_SAMPLE_AGE_NOHZ	((unsigned long)HZ * 600)
> +
>  struct aperfmperf {
>  	seqcount_t	seq;
>  	unsigned long	last_update;
> @@ -373,6 +380,7 @@ static inline void scale_freq_tick(u64 acnt, u64
> mcnt) { }
>  void arch_scale_freq_tick(void)
>  {
>  	struct aperfmperf *s = this_cpu_ptr(&cpu_samples);
> +	unsigned long last  = s->last_update;
>  	u64 acnt, mcnt, aperf, mperf;
>  
>  	if (!cpu_feature_enabled(X86_FEATURE_APERFMPERF))
> @@ -392,7 +400,12 @@ void arch_scale_freq_tick(void)
>  	s->mcnt = mcnt;
>  	raw_write_seqcount_end(&s->seq);
>  
> -	scale_freq_tick(acnt, mcnt);
> +	/*
> +	 * Avoid calling scale_freq_tick() when the last update was too
> long ago,
> +	 * as it might overflow during calulation.
> +	 */
> +	if ((jiffies - last) <= MAX_SAMPLE_AGE_NOHZ)
> +		scale_freq_tick(acnt, mcnt);
>  }
>  
>  /*