Message-ID: <YKa4I0cs/8lyy0fN@google.com>
Date: Thu, 20 May 2021 19:27:31 +0000
From: Sean Christopherson <seanjc@...gle.com>
To: Tom Lendacky <thomas.lendacky@....com>
Cc: Peter Gonda <pgonda@...gle.com>, kvm list <kvm@...r.kernel.org>,
linux-kernel@...r.kernel.org, x86@...nel.org,
Paolo Bonzini <pbonzini@...hat.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Borislav Petkov <bp@...en8.de>, Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Brijesh Singh <brijesh.singh@....com>
Subject: Re: [PATCH] KVM: SVM: Do not terminate SEV-ES guests on GHCB
validation failure

On Thu, May 20, 2021, Sean Christopherson wrote:
> On Mon, May 17, 2021, Tom Lendacky wrote:
> > On 5/14/21 6:06 PM, Peter Gonda wrote:
> > > On Fri, May 14, 2021 at 1:22 PM Tom Lendacky <thomas.lendacky@....com> wrote:
> > >>
> > >> Currently, an SEV-ES guest is terminated if the validation of the VMGEXIT
> > >> exit code and parameters fail. Since the VMGEXIT instruction can be issued
> > >> from userspace, even though userspace (likely) can't update the GHCB,
> > >> don't allow userspace to be able to kill the guest.
> > >>
> > >> Return a #GP request through the GHCB when validation fails, rather than
> > >> terminating the guest.
> > >
> > > Is this a gap in the spec? I don't see anything that details what
> > > should happen if the correct fields for NAE are not set in the first
> > > couple paragraphs of section 4 'GHCB Protocol'.
> >
> > No, I don't think the spec needs to spell out everything like this. The
> > hypervisor is free to determine its course of action in this case.
>
> The hypervisor can decide whether to inject/return an error or kill the guest,
> but what errors can be returned and how they're returned absolutely needs to be
> ABI between guest and host, and to make the ABI vendor agnostic the GHCB spec
> is the logical place to define said ABI.
>
> For example, "injecting" #GP if the guest botched the GHCB on #VMGEXIT(CPUID) is
> completely nonsensical. As is, a Linux guest appears to blindly forward the #GP,
> which means if something does go awry KVM has just made debugging the guest that
> much harder, e.g. imagine the confusion that will ensue if the end result is a
> SIGBUS to userspace on CPUID.
>
> There needs to be an explicit error code for "you gave me bad data", otherwise
> we're signing ourselves up for future pain.

More concretely, I think the best course of action is to define a new return code
in SW_EXITINFO1[31:0], e.g. '2', with additional information in SW_EXITINFO2.

In theory, an old-but-sane guest will interpret the unexpected return code as
fatal to whatever triggered the #VMGEXIT, e.g. SIGBUS to userspace. Unfortunately,
Linux isn't sane, because sev_es_ghcb_hv_call() assumes any non-'1' result means
success. That's trivial to fix, though, and IMO it should be fixed irrespective of
where this goes.
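
To illustrate the guest-side check, here is a minimal sketch of strict decoding of SW_EXITINFO1[31:0]. It is not the actual sev_es_ghcb_hv_call() code. The macro names, the enum, and classify_hv_response() are all hypothetical, and the value '2' is only the return code proposed above, not something the GHCB spec defines today. The point is that anything the guest doesn't recognize is treated as a failure rather than silently as success.

```c
#include <stdint.h>

/*
 * Return codes carried in SW_EXITINFO1[31:0]. The names are hypothetical.
 * 0 and 1 mirror the existing behaviour discussed in this thread; 2 is the
 * *proposed* "you gave me bad data" code and is an assumption.
 */
#define HV_RESP_NO_ACTION   0u  /* request handled, nothing to do       */
#define HV_RESP_EXCEPTION   1u  /* exception info is in SW_EXITINFO2    */
#define HV_RESP_BAD_INPUT   2u  /* proposed: GHCB failed validation     */

enum hv_result {
	HV_RES_OK,         /* continue normally                            */
	HV_RES_EXCEPTION,  /* forward the exception from SW_EXITINFO2      */
	HV_RES_BAD_INPUT,  /* guest botched the GHCB; details in EXITINFO2 */
	HV_RES_UNKNOWN,    /* unrecognized code: treat as fatal            */
};

/* Classify the hypervisor's response; only the low 32 bits are examined. */
enum hv_result classify_hv_response(uint64_t sw_exit_info_1)
{
	switch ((uint32_t)(sw_exit_info_1 & 0xffffffffu)) {
	case HV_RESP_NO_ACTION:
		return HV_RES_OK;
	case HV_RESP_EXCEPTION:
		return HV_RES_EXCEPTION;
	case HV_RESP_BAD_INPUT:
		return HV_RES_BAD_INPUT;
	default:
		/* Never assume that "not an exception" means success. */
		return HV_RES_UNKNOWN;
	}
}
```

With a check along these lines, a guest that predates a new return code would land in the default case and fail whatever triggered the #VMGEXIT, instead of consuming stale or garbage GHCB output.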