Message-ID: <1587575853.9537.117.camel@suse.cz>
Date:   Wed, 22 Apr 2020 19:17:33 +0200
From:   Giovanni Gherdovich <ggherdovich@...e.cz>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Ingo Molnar <mingo@...hat.com>,
        "Rafael J . Wysocki" <rjw@...ysocki.net>, x86@...nel.org,
        linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Borislav Petkov <bp@...e.de>, Len Brown <lenb@...nel.org>
Subject: Re: [PATCH] x86, sched: Prevent divisions by zero in frequency
 invariant accounting

On Wed, 2020-04-22 at 16:53 +0200, Peter Zijlstra wrote:
> On Wed, Apr 22, 2020 at 04:40:55PM +0200, Giovanni Gherdovich wrote:
> > The product mcnt * arch_max_freq_ratio could be zero if it overflows u64.
> > 
> > For context, a large value for arch_max_freq_ratio would be 5000,
> > corresponding to a turbo_freq/base_freq ratio of 5 (normally
> > arch_max_freq_ratio is more like 1500-2000, i.e. a ratio of roughly 1.5-2).
> > A high increment rate for the MPERF counter would be 5 GHz (the base clock
> > of every CPU on the market today is below that). With these figures, a CPU
> > would need to go without a scheduler tick for around 8 days for the u64
> > overflow to happen. That is unlikely, but the check is warranted.
> > 
> > Signed-off-by: Giovanni Gherdovich <ggherdovich@...e.cz>
> > Fixes: 1567c3e3467c ("x86, sched: Add support for frequency invariance")
> > ---
> >  arch/x86/kernel/smpboot.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> > index 8c89e4d9ad28..fb71395cbcad 100644
> > --- a/arch/x86/kernel/smpboot.c
> > +++ b/arch/x86/kernel/smpboot.c
> > @@ -2055,14 +2055,14 @@ void arch_scale_freq_tick(void)
> >  
> >  	acnt = aperf - this_cpu_read(arch_prev_aperf);
> >  	mcnt = mperf - this_cpu_read(arch_prev_mperf);
> > -	if (!mcnt)
> > -		return;
> >  
> >  	this_cpu_write(arch_prev_aperf, aperf);
> >  	this_cpu_write(arch_prev_mperf, mperf);
> >  
> >  	acnt <<= 2*SCHED_CAPACITY_SHIFT;
> >  	mcnt *= arch_max_freq_ratio;
> > +	if (!mcnt)
> > +		return;
> 
> Should we not pr_warn() and disable the whole thing when this happens?

Ok, I will resend this patch disabling freq invariant accounting when this
overflow happens.
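
A rough sketch of what that could look like (hedged: disable_freq_invariance()
is a hypothetical helper named only to illustrate the idea, not an existing
function):

	acnt <<= 2*SCHED_CAPACITY_SHIFT;
	mcnt *= arch_max_freq_ratio;
	if (!mcnt) {
		/* Overflowed (or the window was empty): the freq_scale
		 * ratio computed below would divide by zero, so warn
		 * once and bail out. */
		pr_warn_once("Freq invariance: mcnt * arch_max_freq_ratio overflowed, disabling\n");
		disable_freq_invariance();	/* hypothetical: would clear
						 * the static key behind
						 * arch_scale_freq_invariant() */
		return;
	}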

To elaborate further, your comment touches on an area where x86 freq
invariance is very weak at the moment: what happens if the tick doesn't run on
a CPU for a long time (answer: the estimation of freq_scale is garbage).
And by "a long time" I mean a few seconds; the patch I'm about to resend only
covers a minuscule fraction of those cases. That is, not only has the tick
been missing for days (?!), but we only noticed because the product
mcnt * arch_max_freq_ratio wrapped to exactly 0 (an exact multiple of 2^64).
Had it waited one more millisecond, we wouldn't have seen the issue.
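
For reference, the "around 8 days" figure from the patch description checks
out with some quick arithmetic (a standalone userspace sketch, not kernel
code):

	#include <stdio.h>
	#include <stdint.h>

	int main(void)
	{
		uint64_t ratio = 5000;		/* worst-case arch_max_freq_ratio */
		double mperf_hz = 5e9;		/* MPERF incrementing at 5 GHz */
		/* seconds until mcnt * ratio first wraps past 2^64 */
		double secs = 18446744073709551616.0 / ((double)ratio * mperf_hz);
		printf("~%.1f days\n", secs / 86400.0);	/* prints "~8.5 days" */
		return 0;
	}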

Anyway, I agree on the principle: even if we can't address all the problems
now, let's at least cover those where the solution is easy.


Giovanni
