Message-ID: <Y1eRqOZIRYtC7ZAE@feng-clx>
Date: Tue, 25 Oct 2022 15:35:04 +0800
From: Feng Tang <feng.tang@...el.com>
To: Dave Hansen <dave.hansen@...el.com>
CC: Zhang Rui <rui.zhang@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H . Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>, <x86@...nel.org>,
<linux-kernel@...r.kernel.org>, <tim.c.chen@...el.com>,
Xiongfeng Wang <wangxiongfeng2@...wei.com>,
<liaoyu15@...wei.com>
Subject: Re: [PATCH v1 1/2] x86/tsc: use logical_package as a better
estimation of socket numbers

On Mon, Oct 24, 2022 at 08:42:30AM -0700, Dave Hansen wrote:
> On 10/22/22 09:12, Zhang Rui wrote:
> >>> I'm not sure if we have a perfect solution here.
> >> Are the implementations fixable?
> > currently, I don't have any idea.
> >
> >> Or, at least tolerable?
>
> That would be great to figure out before we start throwing more patches
> around.
Yes, agreed!
> >> For instance, I can live with the implementation being a bit goofy
> >> when kernel commandlines are in play. We can pr_info() about those cases.
> > My understanding is that the cpus in the last package may still have
> > small cpu id value. This means that the 'logical_packages' is hard to
> > break unless we boot with very small CPU count and happened to disable
> > all cpus in one/more packages. Feng is experiencing with this and may
> > have some update later.
> >
> > If this is the case, is this a valid case that we need to take care of?
>
> Well, let's talk through it a bit.
>
> What is the triggering event and what's the fallout?
In the worst case (2 sockets), if maxcpus is '<= total_cpus/2',
'logical_packages' will be smaller than the real socket count.
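
To make the worst case concrete, here is a toy model (not kernel code) that
counts how many packages are visible among the first 'maxcpus' CPU ids. The
function name, the 96-CPU machine and both enumeration layouts are
assumptions made for illustration only; real CPU numbering depends on
firmware and may match neither layout exactly.

```python
def packages_seen(total_cpus, sockets, maxcpus, layout="contiguous"):
    """Count distinct packages among the first `maxcpus` CPU ids.

    Toy model only. `layout` is a hypothetical knob:
      "contiguous":  package 0 owns the lowest CPU ids, then package 1, ...
      "interleaved": CPU ids are dealt out round-robin across packages.
    """
    cpus_per_package = total_cpus // sockets
    booted = range(min(maxcpus, total_cpus))
    if layout == "contiguous":
        packages = {cpu // cpus_per_package for cpu in booted}
    elif layout == "interleaved":
        packages = {cpu % sockets for cpu in booted}
    else:
        raise ValueError(f"unknown layout: {layout}")
    return len(packages)


if __name__ == "__main__":
    total_cpus, sockets = 96, 2  # hypothetical 2-socket machine
    for maxcpus in (24, 48, 49, 96):
        print(maxcpus,
              packages_seen(total_cpus, sockets, maxcpus),
              packages_seen(total_cpus, sockets, maxcpus, "interleaved"))
```

Under the contiguous layout, maxcpus=48 (exactly total_cpus/2) sees only one
of the two packages, while maxcpus=49 already sees both. Under the
interleaved layout the count only drops once maxcpus is smaller than the
socket count, which is why the estimate is hard to break when CPUs of the
last package also have small CPU ids.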
> Is the user on a truly TSC stable system or not?
>
> What kind of maxcpus= argument do they need to specify? Is it something
> that's likely to get used in production or is it most likely just for
> debugging?
IIUC, on servers it's most likely used for debugging. And on clients,
the socket count is not an issue.
> What is the maxcpus= fallout? Does it over estimate or under estimate
> the number of logical packages?
It only underestimates.
> How many cases outside of maxcpus= do we know of that lead to an
> imprecise "logical packages" calculation?
Thanks to the info from you, Peter and Rui, we have listed a bunch of
use cases other than 'maxcpus', and they won't lead to an imprecise
'logical_packages'. I'm not sure whether there are other cases that
haven't popped up yet.
> Does this lead to the TSC being mistakenly marked stable when it is not,
> or *not* being marked stable when it is?
Only the former case, 'mistakenly marked stable', is possible, e.g. if
we use 'maxcpus=8' on a 192-core, 8-socket machine.
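
A rough sketch of that example, as a model rather than the kernel's actual
check. It assumes one CPU per core (SMT ignored), contiguous CPU numbering
per package, maxcpus >= 1, and a hypothetical rule that the TSC is trusted
when the package count is at most PACKAGE_LIMIT. The limit value of 2 and
all names below are assumptions for illustration and are not stated in this
mail.

```python
import math

PACKAGE_LIMIT = 2  # assumption: illustrative threshold, not from this mail


def estimated_packages(total_cpus, sockets, maxcpus):
    """Packages visible at boot, assuming contiguous CPU ids per package."""
    cpus_per_package = total_cpus // sockets
    booted = min(maxcpus, total_cpus)
    return math.ceil(booted / cpus_per_package)


def would_trust_tsc(packages, limit=PACKAGE_LIMIT):
    """Hypothetical rule: trust the TSC only on small package counts."""
    return packages <= limit


if __name__ == "__main__":
    real = 8  # real socket count of the example machine
    seen = estimated_packages(total_cpus=192, sockets=8, maxcpus=8)
    print("estimated packages:", seen, "trusted:", would_trust_tsc(seen))
    print("real packages:", real, "trusted:", would_trust_tsc(real))
```

With 24 CPUs per package, the first 8 CPUs all sit in one package, so the
estimate is 1 instead of 8 and the model trusts the TSC where the real
socket count would not. The error can only go in this direction, since
booting fewer CPUs can never reveal more packages than exist.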
> Let's get all of that info in one place and make sure we are all agreed
> on the *problem* before we got to the solution space.
OK.
Thanks,
Feng