[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CABgObfa_7ZAq1Kb9G=ehkzHfc5if3wnFi-kj3MZLE3oYLrArdQ@mail.gmail.com>
Date: Fri, 2 Feb 2024 00:08:36 +0100
From: Paolo Bonzini <pbonzini@...hat.com>
To: Dave Hansen <dave.hansen@...el.com>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
Zixi Chen <zixchen@...hat.com>, Adam Dunlap <acdunlap@...gle.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>, Xiaoyao Li <xiaoyao.li@...el.com>,
Kai Huang <kai.huang@...el.com>, Dave Hansen <dave.hansen@...ux.intel.com>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...nel.org>, x86@...nel.org,
stable@...r.kernel.org
Subject: Re: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME
On Thu, Feb 1, 2024 at 7:29 PM Dave Hansen <dave.hansen@...el.com> wrote:
> I really wanted get_cpu_address_sizes() to be the one and only spot
> where c->x86_phys_bits is established. That way, we don't get a bunch
> of code all of the place tweaking it and fighting for who "wins".
> We're not there yet, but the approach in this patch moves it back in the
> wrong direction because it permits the random tweaking of c->x86_phys_bits.
I see your point; one of my earlier attempts added a
->c_detect_mem_encrypt() callback that basically amounted to either
amd_detect_mem_encrypt() or detect_tme(). I bailed out of it mostly
because these functions do more than adjusting phys_bits, and it
seemed arbitrary to call them from get_cpu_address_sizes(). The two
approaches share the idea of centralizing the computation of
x86_phys_bits in get_cpu_address_sizes().
There is unfortunately an important hurdle for your patch, in that
currently the BSP and AP flows are completely different. For the BSP
the flow is ->c_early_init(), then get_cpu_address_sizes(), then again
->c_early_init() called by ->c_init(), then ->c_init(). For APs it is
get_cpu_address_sizes(), then ->c_early_init() called by ->c_init(),
then the rest of ->c_init(). And let's not even look at
->c_identify().
This difference is bad for your patch, because get_cpu_address_sizes()
is called too early to see enc_phys_bits on APs. But it was also
something that fbf6449f84bf didn't take into account, because it left
behind the tentative initialization of x86_*_bits in identify_cpu(),
while removing it from early_identify_cpu(). And
TBH my first reaction after Kirill pointed me to fbf6449f84bf was to
revert it. It's not like the code before was much less of a dumpster
fire, but that commit made the BSP vs. AP mess even worse. But it
fixed a real-world bug and it did remove most of the "first set then
adjust" logic, at least for the BSP, so a revert wasn't on the table
and patch 1 was what came out of it.
I know that in general in Linux we prefer to fix things for good.
Dancing one step behind and two ahead comes with the the risk that you
only do the step behind. But in the end something like this patch 1
would have to be posted for stable branches (and Greg doesn't like
one-off patches), and I am not even sure it's a step behind because it
removes _some_ of the BSP vs. AP differences introduced by
fbf6449f84bf.
In a nutshell: I don't dislike the idea behind your patch, but the
code is just not ready for it.
I can look into cleaning up cpu/common.c though. I'm a natural
procrastinator, it's my kind of thing to embark on a series of 30
patches to clean up 20 years old code...
Paolo
> Could we instead do something more like the (completely untested)
> attached patch?
>
> BTW, I'm pretty sure the WARN_ON() in the patch won't actually work, but
> it'd be nice to work toward a point when it does.
>
Powered by blists - more mailing lists