Message-ID: <dfd2fb43-2a19-545a-fea8-f793a685ef30@intel.com>
Date: Mon, 24 Oct 2022 08:42:30 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: Zhang Rui <rui.zhang@...el.com>, Feng Tang <feng.tang@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H . Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>, x86@...nel.org,
linux-kernel@...r.kernel.org
Cc: tim.c.chen@...el.com, Xiongfeng Wang <wangxiongfeng2@...wei.com>,
liaoyu15@...wei.com
Subject: Re: [PATCH v1 1/2] x86/tsc: use logical_package as a better
estimation of socket numbers
On 10/22/22 09:12, Zhang Rui wrote:
>>> I'm not sure if we have a perfect solution here.
>> Are the implementations fixable?
> currently, I don't have any idea.
>
>> Or, at least tolerable?
That would be great to figure out before we start throwing more patches
around.
>> For instance, I can live with the implementation being a bit goofy
>> when
>> kernel commandlines are in play. We can pr_info() about those cases.
> My understanding is that the cpus in the last package may still have
> small cpu id values. This means that 'logical_packages' is hard to
> break unless we boot with a very small CPU count and happen to disable
> all cpus in one or more packages. Feng is experimenting with this and may
> have some update later.
>
> If this is the case, is this a valid case that we need to take care of?
Well, let's talk through it a bit.
What is the triggering event and what's the fallout?
Is the user on a truly TSC stable system or not?
What kind of maxcpus= argument do they need to specify? Is it something
that's likely to get used in production or is it most likely just for
debugging?
What is the maxcpus= fallout? Does it overestimate or underestimate
the number of logical packages?
How many cases outside of maxcpus= do we know of that lead to an
imprecise "logical packages" calculation?
Does this lead to the TSC being mistakenly marked stable when it is not,
or *not* being marked stable when it is?
Let's get all of that info in one place and make sure we are all agreed
on the *problem* before we get to the solution space.
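
To make the maxcpus= question concrete, here is a toy model in Python. It is
not the kernel's code. It assumes that maxcpus=N brings up only the N
lowest-numbered CPUs and that the logical package count is simply the number
of distinct packages seen among the CPUs that came up. The function name, the
two CPU-to-package layouts and the 2-package, 8-CPU machine are all made up
for illustration.

```python
def count_logical_packages(cpu_to_package, maxcpus=None):
    """Return how many distinct packages the booted CPUs belong to.

    cpu_to_package[i] is the (hypothetical) package id of CPU i.
    Assumption: maxcpus=N boots only CPUs 0..N-1.
    """
    booted = cpu_to_package if maxcpus is None else cpu_to_package[:maxcpus]
    return len(set(booted))


# Hypothetical 2-package machine with 8 CPUs, enumerated two ways.
BLOCK = [0, 0, 0, 0, 1, 1, 1, 1]        # package 1 only has high CPU ids
INTERLEAVED = [0, 1, 0, 1, 0, 1, 0, 1]  # package 1 also has small CPU ids

if __name__ == "__main__":
    for name, layout in (("block", BLOCK), ("interleaved", INTERLEAVED)):
        for maxcpus in (None, 4, 2, 1):
            print(name, "maxcpus =", maxcpus, "->",
                  count_logical_packages(layout, maxcpus), "package(s)")
```

In this model the count can only underestimate, never overestimate. The
block layout drops to one package at maxcpus=4, while the interleaved layout
stays at two until maxcpus=1. Which layout real systems resemble is part of
the data that still needs to be collected.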