Message-ID: <ZmsbZCF9rFzuB3rO@swahl-home.5wahls.com>
Date: Thu, 13 Jun 2024 11:16:36 -0500
From: Steve Wahl <steve.wahl@....com>
To: Borislav Petkov <bp@...en8.de>
Cc: Steve Wahl <steve.wahl@....com>, Ashish Kalra <ashish.kalra@....com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
x86@...nel.org, "H. Peter Anvin" <hpa@...or.com>,
linux-kernel@...r.kernel.org, Pavin Joseph <me@...injoseph.com>,
Eric Hagberg <ehagberg@...il.com>, Simon Horman <horms@...ge.net.au>,
Eric Biederman <ebiederm@...ssion.com>, Dave Young <dyoung@...hat.com>,
Sarah Brofeldt <srhb@....dk>, Russ Anderson <rja@....com>,
Dimitri Sivanich <sivanich@....com>,
Hou Wenlong <houwenlong.hwl@...group.com>,
Andrew Morton <akpm@...ux-foundation.org>, Baoquan He <bhe@...hat.com>,
Yuntao Wang <ytcoode@...il.com>, Bjorn Helgaas <bhelgaas@...gle.com>,
Joerg Roedel <jroedel@...e.de>, Michael Roth <michael.roth@....com>
Subject: Re: [PATCH 0/3] Resolve problems with kexec identity mapping

On Thu, Jun 13, 2024 at 05:28:56PM +0200, Borislav Petkov wrote:

Thank you for at least saying something on this!

> On Mon, May 20, 2024 at 01:36:30PM -0500, Steve Wahl wrote:
> > Although there was a previous fix to avoid early kernel access to the
> > EFI config table on Intel systems, the problem can still exist on AMD
> > systems that support SEV (Secure Encrypted Virtualization). The
> > command line option "nogbpages" brings this bug to the surface. And
> > this is what caused the regression with my earlier patch that
> > attempted to reduce the use of gbpages. This patch series fixes that
> > problem and restores my earlier patch.
> >
> > The following 2 commits caused the EFI config table, and the CC_BLOB
> > entry in that table, to be accessed when enabling SEV at kernel
> > startup.
> >
> > commit ec1c66af3a30 ("x86/compressed/64: Detect/setup SEV/SME features
> > earlier during boot")
> > commit c01fce9cef84 ("x86/compressed: Add SEV-SNP feature
> > detection/setup")
> >
> > These accesses happen before the new kernel establishes its own
> > identity map, and before establishing a routine to handle page faults.
> > But the areas referenced are not explicitly added to the kexec
> > identity map.
> >
> > This goes unnoticed when these areas happen to be placed close enough
> > to other areas that are explicitly added to the identity map, but
> > that is not always the case.
> >
> > Under certain conditions, for example Intel Atom processors that don't
> > support 1GB pages, it was found that these areas don't end up mapped,
> > and the SEV initialization code causes an unrecoverable page fault,
> > and the kexec fails.
>
> What does Intel Atom have to do with SEV?!

Atom was the prominent example of a platform broken by the code
introduced for SEV. Unfortunately, the fix that was merged still
leaves actual AMD SEV-capable processors broken when nogbpages is
used, and that is what caused the apparent regression when my
reduce-use-of-gbpages patch was accepted (and later reverted).

Tao Liu's original patch fixed this problem, but was not accepted.
The patch that was accepted does not fix this.

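As a rough illustration of why gbpages can hide the missing mapping,
here is a simplified user-space model. It is not the kernel's
ident_map.c code: struct range, addr_is_mapped(), the SZ_* macros as
defined here, and the sample addresses are all made up for this sketch.
It assumes the mapping code always rounds each requested range out to
the page size in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_2M (UINT64_C(1) << 21)
#define SZ_1G (UINT64_C(1) << 30)

/* One range that was explicitly requested to be identity mapped. */
struct range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
};

/*
 * Return true if 'addr' ends up mapped when every requested range is
 * rounded out to 'page_size' boundaries, i.e. when the mapping code
 * only ever uses pages of that size (hypothetical model).
 */
static bool addr_is_mapped(const struct range *req, size_t n,
			   uint64_t page_size, uint64_t addr)
{
	/* page_size must be a non-zero power of two. */
	assert(page_size && !(page_size & (page_size - 1)));

	for (size_t i = 0; i < n; i++) {
		uint64_t lo = req[i].start & ~(page_size - 1);
		uint64_t hi = (req[i].end + page_size - 1) & ~(page_size - 1);

		if (addr >= lo && addr < hi)
			return true;
	}
	return false;
}
```

With a single requested range of 4 MiB starting at 4 GiB, an address
768 MiB further on (0x130000000) counts as mapped with SZ_1G but not
with SZ_2M. That mirrors how an area that was never explicitly
requested can be covered by accident when gbpages are in use, and
fault once they are not.
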
> > Tao Liu had offered a patch to put the config table into the kexec
> > identity map to avoid this problem:
> >
> > https://lore.kernel.org/all/20230601072043.24439-1-ltao@redhat.com/
> >
> > But the community chose instead to avoid referencing this memory on
> > non-AMD systems where the problem was reported.
> >
> > commit bee6cf1a80b5 ("x86/sev: Do not try to parse for the CC blob
> > on non-AMD hardware")
> >
> > I later wanted to make a different change to kexec identity map
> > creation, and had this patch accepted:
> >
> > commit d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.")
> >
> > but it quickly needed to be reverted because of problems on AMD systems.
> >
> > The reported regression problems on AMD systems were due to the above
> > mentioned references to the EFI config table. In fact, on the same
> > systems, the "nogbpages" command line option breaks kexec as well.
> >
> > So I resubmit Tao Liu's original patch that maps the EFI config
> > table, add an additional patch by me that ensures that the CC blob is
> > also mapped (if present), and also resubmit my earlier patch to use
> > gbpages only when a full GB of space is requested to be mapped.
> >
> > I do not advocate for removing the earlier, non-AMD fix. With kexec,
> > two different kernel versions can be in play, and the earlier fix
> > still covers non-AMD systems when the kexec'd-from kernel doesn't have
> > these patches applied.
> >
> > All three of the people who reported regression with my earlier patch
> > have retested with this patch series and found it to work where my
> > single patch previously did not. With current kernels, all fail to
> > kexec when "nogbpages" is on the command line, but all succeed with
> > "nogbpages" after the series is applied.
> >
> > Tao Liu (1):
> > x86/kexec: Add EFI config table identity mapping for kexec kernel
> >
> > Steve Wahl (2):
> > x86/kexec: Add EFI Confidential Computing blob to kexec identity
> > mapping.
> > x86/mm/ident_map: Use gbpages only where full GB page should be
> > mapped.
> >
> > arch/x86/kernel/machine_kexec_64.c | 82 ++++++++++++++++++++++++++++--
> > arch/x86/mm/ident_map.c | 23 +++++++--
> > 2 files changed, 95 insertions(+), 10 deletions(-)
>
> Anyway, + Ashish who's been dealing with SNP kexec. We have identified one EFI
> issue so far:
>
> https://lore.kernel.org/r/20240612135638.298882-2-ardb%2Bgit@google.com
>
> You could give it a try and report back.

I will look at it, but a cursory inspection doesn't show anything
that affects what I'm talking about here.

Thanks!

--> Steve
--
Steve Wahl, Hewlett Packard Enterprise