lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1587629164.28094.11.camel@suse.cz>
Date:   Thu, 23 Apr 2020 10:06:04 +0200
From:   Giovanni Gherdovich <ggherdovich@...e.cz>
To:     Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
Cc:     Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Borislav Petkov <bp@...e.de>, Len Brown <lenb@...nel.org>,
        "Rafael J . Wysocki" <rjw@...ysocki.net>, x86@...nel.org,
        linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org,
        Mel Gorman <mgorman@...hsingularity.net>,
        Doug Smythies <dsmythies@...us.net>,
        Like Xu <like.xu@...ux.intel.com>,
        Neil Rickert <nwr10cst-oslnx@...oo.com>,
        Chris Wilson <chris@...is-wilson.co.uk>
Subject: Re: [PATCH 1/4] x86, sched: Bail out of frequency invariance if
 base frequency is unknown

On Wed, 2020-04-22 at 10:15 -0700, Ricardo Neri wrote:
> On Thu, Apr 16, 2020 at 07:47:42AM +0200, Giovanni Gherdovich wrote:
> > Some hypervisors such as VMWare ESXi 5.5 advertise support for
> > X86_FEATURE_APERFMPERF but then fill all MSR's with zeroes. In particular,
> > MSR_PLATFORM_INFO set to zero tricks the code that wants to know the base
> > clock frequency of the CPU (highest non-turbo frequency), producing a
> > division by zero when computing the ratio turbo_freq/base_freq necessary
> > for frequency invariant accounting.
> > 
> > It is to be noted that even if MSR_PLATFORM_INFO contained the appropriate
> > data, APERF and MPERF are constantly zero on ESXi 5.5, thus freq-invariance
> > couldn't be done in principle (not that it would make a lot of sense in a
> > VM anyway). The real problem is advertising X86_FEATURE_APERFMPERF. This
> > appears to be fixed in more recent versions: ESXi 6.7 doesn't advertise
> > that feature.
> > 
> > Signed-off-by: Giovanni Gherdovich <ggherdovich@...e.cz>
> > Fixes: 1567c3e3467c ("x86, sched: Add support for frequency invariance")
> > ---
> >  arch/x86/kernel/smpboot.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> > index fe3ab9632f3b..3a318ec9bc17 100644
> > --- a/arch/x86/kernel/smpboot.c
> > +++ b/arch/x86/kernel/smpboot.c
> > @@ -1985,6 +1985,15 @@ static bool intel_set_max_freq_ratio(void)
> >  	return false;
> >  
> >  out:
> > +	/*
> > +	 * Some hypervisors advertise X86_FEATURE_APERFMPERF
> > +	 * but then fill all MSR's with zeroes.
> > +	 */
> > +	if (!base_freq) {
> > +		pr_debug("Couldn't determine cpu base frequency, necessary for scale-invariant accounting.\n");
> > +		return false;
> > +	}
> 
> It may be possible that MSR_TURBO_RATIO_LIMIT is also all-zeros. In
> such case, turbo_freq will be also zero. If that is the case,
> arch_max_freq_ratio will be zero and we will see a division by zero
> exception in arch_scale_freq_tick() because mcnt is multiplied by
> arch_max_freq_ratio().

Thanks Ricardo for clarifying this.

Follow-up question: when I see an all-zeros MSR_TURBO_RATIO_LIMIT, can I
assume the CPU doesn't support turbo boost? Or is it possible that such a CPU
has turbo boost, just the turbo ratios aren't declared in the MSR?

Some context: this feature (called "frequency invariance") wants to know
what's the max clock freq a CPU can have at any time (it needs it for some
scheduler calculations). This is hard to know precisely, because turbo can
kick in at any time and depends on many factors.  So it settles for an
"average maximum frequency", which I decided the 4 cores turbo is a good
estimate for. Now, if an all-zeros MSR_TURBO_RATIO_LIMIT means "turbo boost
unsupported", this is actually the easy case because then I know exactly what
the max freq is (base frequency). If, on the other hand, an all-zeros MSR
means "there may or may not be turbo, and you don't know how much" then I must
disable frequency invariance.


Thanks,
Giovanni

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ