Date:   Tue, 25 Oct 2022 15:35:04 +0800
From:   Feng Tang <feng.tang@...el.com>
To:     Dave Hansen <dave.hansen@...el.com>
CC:     Zhang Rui <rui.zhang@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H . Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>, <x86@...nel.org>,
        <linux-kernel@...r.kernel.org>, <tim.c.chen@...el.com>,
        Xiongfeng Wang <wangxiongfeng2@...wei.com>,
        <liaoyu15@...wei.com>
Subject: Re: [PATCH v1 1/2] x86/tsc: use logical_package as a better
 estimation of socket numbers

On Mon, Oct 24, 2022 at 08:42:30AM -0700, Dave Hansen wrote:
> On 10/22/22 09:12, Zhang Rui wrote:
> >>> I'm not sure if we have a perfect solution here.
> >> Are the implementations fixable?
> > currently, I don't have any idea.
> > 
> >>   Or, at least tolerable?
> 
> That would be great to figure out before we start throwing more patches
> around.

Yes, agreed!

> >> For instance, I can live with the implementation being a bit goofy
> >> when
> >> kernel commandlines are in play.  We can pr_info() about those cases.
> > My understanding is that the cpus in the last package may still have
> > small cpu id value. This means that the 'logical_packages' is hard to
> > break unless we boot with very small CPU count and happened to disable
> > all cpus in one/more packages. Feng is experimenting with this and may
> > have some update later.
> > 
> > If this is the case, is this a valid case that we need to take care of?
> 
> Well, let's talk through it a bit.
> 
> What is the triggering event and what's the fallout?

In the worst case (2 sockets), if maxcpus falls to '<= total_cpus/2',
'logical_packages' will be less than the real number of packages.
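
To make the arithmetic concrete, here is a rough userspace sketch (not
kernel code). The function name and the assumption that CPU ids are
assigned package by package, package 0 first, are mine for illustration;
as Rui noted, CPUs in the last package may get small ids on real systems,
in which case the estimate is much harder to break.

```c
#include <assert.h>

/*
 * Hypothetical model: how many packages have at least one CPU brought up
 * when only the first 'maxcpus' CPUs are booted.
 * Assumption: CPU ids are contiguous within a package, and every package
 * has the same number of CPUs.
 */
static unsigned int seen_packages(unsigned int total_cpus,
                                  unsigned int sockets,
                                  unsigned int maxcpus)
{
    assert(sockets > 0 && total_cpus >= sockets && total_cpus % sockets == 0);

    unsigned int per_pkg = total_cpus / sockets;
    unsigned int booted = maxcpus < total_cpus ? maxcpus : total_cpus;

    /* The boot CPU is always up, even with a tiny maxcpus value. */
    if (booted == 0)
        booted = 1;

    /* Round up: a partially populated package still counts as seen. */
    return (booted + per_pkg - 1) / per_pkg;
}
```

With 64 CPUs on 2 sockets, seen_packages(64, 2, 32) gives 1 while
seen_packages(64, 2, 33) gives 2, which matches the '<= total_cpus/2'
boundary above under this enumeration assumption.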

> Is the user on a truly TSC stable system or not?
> 
> What kind of maxcpus= argument do they need to specify?  Is it something
> that's likely to get used in production or is it most likely just for
> debugging?

IIUC, on the server side it's most likely for debug use. And for
clients, the socket number is not an issue.

> What is the maxcpus= fallout?  Does it over estimate or under estimate
> the number of logical packages?
 
It can only underestimate.

> How many cases outside of maxcpus= do we know of that lead to an
> imprecise "logical packages" calculation?
 
Thanks to the info from you, Peter and Rui, we have listed a bunch of
use cases other than 'maxcpus', and they won't lead to an imprecise
'logical_packages'. I'm not sure whether there are other cases which
haven't popped up yet.

> Does this lead to the TSC being mistakenly marked stable when it is not,
> or *not* being marked stable when it is?

Only the former case, 'mistakenly marked stable', is possible, say when
we use 'maxcpus=8' on a 192-core, 8-socket machine.
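
A small userspace sketch of that scenario, for illustration only. The
helper names are hypothetical, 192 is treated as the CPU count, CPU ids
are assumed to be contiguous per package, and the socket limit passed to
the check (2 here) is an assumption of mine rather than something stated
in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model: packages that have at least one booted CPU.
 * Assumption: ids are assigned package by package, equal CPUs per package.
 */
static unsigned int estimated_packages(unsigned int total_cpus,
                                       unsigned int sockets,
                                       unsigned int maxcpus)
{
    assert(sockets > 0 && total_cpus >= sockets && total_cpus % sockets == 0);

    unsigned int per_pkg = total_cpus / sockets;
    unsigned int booted = maxcpus < total_cpus ? maxcpus : total_cpus;

    if (booted == 0)
        booted = 1;                 /* the boot CPU is always online */

    return (booted + per_pkg - 1) / per_pkg;
}

/*
 * Hypothetical stand-in for the "few enough sockets, trust the TSC"
 * decision; 'max_trusted' is an assumed threshold.
 */
static bool tsc_treated_as_stable(unsigned int packages,
                                  unsigned int max_trusted)
{
    return packages <= max_trusted;
}
```

Under these assumptions, estimated_packages(192, 8, 8) is 1, so
tsc_treated_as_stable(1, 2) is true, while the real count of 8 sockets
would give false. The error can only go in the 'mistakenly stable'
direction, since the estimate never exceeds the real socket count.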

> Let's get all of that info in one place and make sure we are all agreed
> on the *problem* before we go to the solution space.

OK.

Thanks,
Feng


