Message-ID: <65d26d679843e26fd5e6252a08391f87243a49c9.camel@intel.com>
Date: Tue, 3 Oct 2023 02:06:52 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>
CC: "mingo@...nel.org" <mingo@...nel.org>,
"Li, Xin3" <xin3.li@...el.com>,
"Compostella, Jeremy" <jeremy.compostella@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>, "bp@...en8.de" <bp@...en8.de>
Subject: Re: [PATCH v3 1/2] x86/cpu/intel: Fix MTRR verification for TME enabled platforms

On Tue, 2023-10-03 at 01:47 +0300, kirill.shutemov@...ux.intel.com wrote:
> On Fri, Sep 29, 2023 at 09:14:00AM +0000, Huang, Kai wrote:
> > On Thu, 2023-09-28 at 15:30 -0700, Compostella, Jeremy wrote:
> > > On TME enabled platform, BIOS publishes MTRR taking into account Total
> > > Memory Encryption (TME) reserved bits.
> > >
> > > generic_get_mtrr() performs a sanity check of the MTRRs relying on the
> > > `phys_hi_rsvd' variable which is set using the cpuinfo_x86 structure
> > > `x86_phys_bits' field. But at the time the generic_get_mtrr()
> > > function is run, `x86_phys_bits' has not yet been updated by
> > > detect_tme() when TME is enabled.
> > >
> > > Since x86_phys_bits does not yet reflect the real maximal physical
> > > address size, generic_get_mtrr() complains by logging the following
> > > messages.
> > >
> > > mtrr: your BIOS has configured an incorrect mask, fixing it.
> > > mtrr: your BIOS has configured an incorrect mask, fixing it.
> > > [...]
> > >
> > > In such a situation, generic_get_mtrr() returns an incorrect size but
> > > no side effects were observed during our testing.
> > >
> > > For `x86_phys_bits' to be updated before generic_get_mtrr() runs,
> > > move the detect_tme() call from init_intel() to early_init_intel().
> >
> > Hi,
> >
> > This move looks good to me, but +Kirill who is the author of detect_tme() for
> > further comments.
> >
> > Also I am not sure whether it's worth considering moving this to
> > get_cpu_address_sizes(), which calculates the virtual/physical address sizes.
> > It seems anything that can impact the physical address size could be put there.
>
> Actually, I am not sure how this patch works. AFAICS after the patch we
> have the following callchain:
>
>     early_identify_cpu()
>       this_cpu->c_early_init() (which is early_init_intel())
>         detect_tme()
>           c->x86_phys_bits -= keyid_bits;
>       get_cpu_address_sizes(c);
>         c->x86_phys_bits = eax & 0xff;
>
> Looks like get_cpu_address_sizes() would override what detect_tme() does.

After this patch, early_identify_cpu() calls get_cpu_address_sizes() first and
then calls c_early_init(), which calls detect_tme().

So it looks like there is no override, no?
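
To make the ordering point concrete, here is a toy user-space model, not the
kernel code. `struct cpu`, FAKE_CPUID_PHYS_BITS (46) and FAKE_KEYID_BITS (6)
are made-up stand-ins chosen only for illustration; the two function bodies
just mirror the assignments quoted above.

```c
#include <assert.h>

/* Toy stand-in for struct cpuinfo_x86; only the field discussed here. */
struct cpu {
	unsigned int x86_phys_bits;
};

/* Assumed values for illustration only, not read from real hardware. */
#define FAKE_CPUID_PHYS_BITS	46u	/* pretend CPUID-reported width */
#define FAKE_KEYID_BITS		6u	/* pretend TME KeyID bits */

/* Models get_cpu_address_sizes(): unconditionally (re)sets the field. */
static void get_cpu_address_sizes(struct cpu *c)
{
	unsigned int eax = FAKE_CPUID_PHYS_BITS;

	c->x86_phys_bits = eax & 0xff;
}

/* Models detect_tme(): subtracts the KeyID bits from the current value. */
static void detect_tme(struct cpu *c)
{
	assert(c->x86_phys_bits >= FAKE_KEYID_BITS);	/* guard against underflow */
	c->x86_phys_bits -= FAKE_KEYID_BITS;
}

/* detect_tme() first: the later assignment throws the reduction away. */
static unsigned int tme_then_sizes(void)
{
	struct cpu c = { .x86_phys_bits = FAKE_CPUID_PHYS_BITS };

	detect_tme(&c);
	get_cpu_address_sizes(&c);
	return c.x86_phys_bits;
}

/* get_cpu_address_sizes() first: the reduction survives. */
static unsigned int sizes_then_tme(void)
{
	struct cpu c = { 0 };

	get_cpu_address_sizes(&c);
	detect_tme(&c);
	return c.x86_phys_bits;
}
```

With these assumed numbers, tme_then_sizes() ends up at the unreduced width
while sizes_then_tme() keeps the KeyID bits subtracted, which is the order I
believe the patch gives us.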
>
> I guess we reach the same detect_tme() again via c->c_init() (aka
> init_intel()) codepath and get the value right again.
>
> But it seems accidental.
>
After this patch:

    identify_cpu() ->
        generic_identify() ->
            get_cpu_address_sizes()     <----- (1)
        this_cpu->c_init() ->
            early_init_intel() ->
                detect_tme()            <----- (2)

(1) resets x86_phys_bits [*], but (2) is called soon afterwards, so nothing
goes wrong between them for now.

But there is a window between (1) and (2) (and similarly in
early_identify_cpu()). Things can go wrong if people are careless, which is
why I said _perhaps_ it's worth considering moving detect_tme() into
get_cpu_address_sizes(). But I don't know.

[*]: On the other hand, (1) is necessary to make (2) right.
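
A rough sketch of what I mean by the window, again as a toy user-space model
rather than kernel code. All names here (`struct cpu`, cpuid_phys_bits(),
tme_keyid_bits(), get_sizes_combined()) and the values 46 and 6 are
hypothetical; the combined helper only illustrates the idea of doing both
steps in one place and is not a proposed implementation.

```c
#include <assert.h>

/* Toy stand-in for struct cpuinfo_x86; only the field discussed here. */
struct cpu {
	unsigned int x86_phys_bits;
};

/* Hypothetical helpers returning made-up values for illustration. */
static unsigned int cpuid_phys_bits(void) { return 46; }
static unsigned int tme_keyid_bits(void)  { return 6; }

/* Two-step flow: returns what a reader sees between (1) and (2). */
static unsigned int observed_in_window(void)
{
	struct cpu c = { 0 };
	unsigned int seen;

	c.x86_phys_bits = cpuid_phys_bits() & 0xff;	/* (1) */
	seen = c.x86_phys_bits;		/* careless code running in the window */
	c.x86_phys_bits -= tme_keyid_bits();		/* (2) */

	return seen;
}

/* Combined flow: the field is never visible without the KeyID reduction. */
static void get_sizes_combined(struct cpu *c)
{
	unsigned int bits = cpuid_phys_bits() & 0xff;
	unsigned int keyid = tme_keyid_bits();

	assert(bits >= keyid);	/* guard against underflow */
	c->x86_phys_bits = bits - keyid;
}

/* Convenience wrapper so the result can be checked directly. */
static unsigned int combined_result(void)
{
	struct cpu c = { 0 };

	get_sizes_combined(&c);
	return c.x86_phys_bits;
}
```

In the two-step flow anything that reads x86_phys_bits inside the window gets
the unreduced value; in the combined flow there is no such intermediate state.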