linux-kernel - Re: [PATCH v6 08/42] x86/sev-es: initialize sev

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211020161023.hzbj53ehmzjrt4xd@amd.com>
Date:   Wed, 20 Oct 2021 11:10:23 -0500
From:   Michael Roth <michael.roth@....com>
To:     Borislav Petkov <bp@...en8.de>
CC:     Brijesh Singh <brijesh.singh@....com>, <x86@...nel.org>,
        <linux-kernel@...r.kernel.org>, <kvm@...r.kernel.org>,
        <linux-efi@...r.kernel.org>, <platform-driver-x86@...r.kernel.org>,
        <linux-coco@...ts.linux.dev>, <linux-mm@...ck.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Joerg Roedel <jroedel@...e.de>,
        Tom Lendacky <thomas.lendacky@....com>,
        "H. Peter Anvin" <hpa@...or.com>, Ard Biesheuvel <ardb@...nel.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Sean Christopherson <seanjc@...gle.com>,
        "Vitaly Kuznetsov" <vkuznets@...hat.com>,
        Jim Mattson <jmattson@...gle.com>,
        "Andy Lutomirski" <luto@...nel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Sergio Lopez <slp@...hat.com>, Peter Gonda <pgonda@...gle.com>,
        "Peter Zijlstra" <peterz@...radead.org>,
        Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
        David Rientjes <rientjes@...gle.com>,
        Dov Murik <dovmurik@...ux.ibm.com>,
        Tobin Feldman-Fitzthum <tobin@....com>,
        Vlastimil Babka <vbabka@...e.cz>,
        "Kirill A . Shutemov" <kirill@...temov.name>,
        Andi Kleen <ak@...ux.intel.com>,
        "Dr . David Alan Gilbert" <dgilbert@...hat.com>,
        <tony.luck@...el.com>, <marcorr@...gle.com>,
        <sathyanarayanan.kuppuswamy@...ux.intel.com>
Subject: Re: [PATCH v6 08/42] x86/sev-es: initialize sev_status/features
 within #VC handler

On Mon, Oct 18, 2021 at 09:18:13PM +0200, Borislav Petkov wrote:
> On Mon, Oct 18, 2021 at 01:40:03PM -0500, Michael Roth wrote:
> > If CPUID has lied, that would result in a #GP, rather than a controlled
> > termination in the various checkers/callers. The latter is easier to
> > debug.
> > 
> > Additionally, #VC is arguably a better indicator of SEV MSR availability
> > for SEV-ES/SEV-SNP guests, since it is only generated by ES/SNP hardware
> > and doesn't rely directly on hypervisor/EFI-provided CPUID values. It
> > doesn't work for SEV guests, but I don't think it's a bad idea to allow
> > SEV-ES/SEV-SNP guests to initialize sev_status in #VC handler to make
> > use of the added assurance.
> 

[Sorry for the wall of text, just trying to work through everything.]

> Ok, let's take a step back and analyze what we're trying to solve first.
> So I'm looking at sme_enable():

I'm not sure if this is pertaining to using the CPUID table prior to
sme_enable(), or just the #VC-based SEV MSR read. The following comments
assume the former. If that assumption is wrong you can basically ignore
the rest of this email :)

[The #VC-based SEV MSR read is not necessary for anything in sme_enable(),
it's simply a way to determine whether the guest is an SNP guest, without
any reliance on CPUID, which seemed useful in the context of doing some
additional sanity checks against the SNP CPUID table and determining that
it's appropriate to use it early on (rather than just trust that this is an
SNP guest by virtue of the CC blob being present, and then failing later
once sme_enable() checks for the SNP feature bits through the normal
mechanism, as was done in v5).]

> 
> 1. Code checks SME/SEV support leaf. HV lies and says there's none. So
> guest doesn't boot encrypted. Oh well, not a big deal, the cloud vendor
> won't be able to give confidentiality to its users => users go away or
> do unencrypted like now.
> 
> Problem is solved by political and economical pressure.
> 
> 2. Check SEV and SME bit. HV lies here. Oh well, same as the above.

I'd be worried about the possibility that, through some additional exploits
or failures in the attestation flow, a guest owner was tricked into booting
unencrypted on a compromised host and exposing their secrets. Their
attestation process might even do some additional CPUID sanity checks, which
would at the point be via the SNP CPUID table and look legitimate, unaware
that the kernel didn't actually use the SNP CPUID table until after
0x8000001F was parsed (if we were to only initialize it after/as-part-of
sme_enable()).

Fortunately in this scenario I think the guest kernel actually would fail to
boot due to the SNP hardware unconditionally treating code/page tables as
encrypted pages. I tested some of these scenarios just to check, but not
all, and I still don't feel confident enough about it to say that there's
not some way to exploit this by someone who is more clever/persistant than
me.

> 
> 3. HV lies about 1. and 2. but says that SME/SEV is supported.
> 
> Guest attempts to read the MSR Guest explodes due to the #GP. The same
> political/economical pressure thing happens.

That's seems likely, but maybe some future hardware bug, or some other
exploit, makes it possible to intercept that MSR read? I don't know, but
if that particular branch of execution can be made less likely by utilizing
SNP CPUID validation I think it makes sense to make use of it.

> 
> If the MSR is really there, we've landed at the place where we read the
> SEV MSR. Moment of truth - SEV/SNP guests have a communication protocol
> which is independent from the HV and all good.

At which point we then switch to using the CPUID table? But at that
point all the previous CPUID checks, both SEV-related/non-SEV-related,
are now possibly not consistent with what's in the CPUID table. Do we
then revalidate? Even a non-malicious hypervisor might provide
inconsistent values between the two sources due to bugs, or SNP
validation suppressing certain feature bits that hypervisor otherwise
exposes, etc. Now all the code after sme_enable() can potentially take
unexpected execution paths, where post-sme_enable() code makes
assumptions about pre-sme_enable() checks that may no longer hold true.

Also, it would be useful from an attestation perspective that the CPUID
bits visible to userspace correspond to what the kernel used during boot,
which wouldn't necessarily be the case if hypervisor-provided values were
used during early boot and potentially put the kernel into some unexpected
state that could persist beyond the point of attestation.

Code-wise, thanks in large part to your suggestions, it really isn't all
that much more complicated to hook in the CPUID table lookup in the #VC
handlers (which are already needed anyway for SEV-ES) early on so all
these checks are against the same trusted (or more-trusted at least)
CPUID source.

> 
> Now, which case am I missing here which justifies the need to do those
> acrobatics of causing #VCs just to detect the SEV MSR?

There are a few more places where cpuid is utilized prior to
sme_enable():

# In boot/compressed
paging_prepare():
  ecx = cpuid(7, 0) # SNP-verified against host values
  # check ecx for LA57

# In boot/compressed and kernel proper
verify_cpu():
  eax, ebx, ecx, edx = cpuid(0, 0) # SNP-verified against host values
  # check eax for range > 0
  # check ebx, ecx, edx for "AuthenticAMD" or "GenuineIntel"

if_amd:
  edx = cpuid(1, 0) # SNP-verified against host values
  # check edx feature bits against REQUIRED_MASK0 (PAE|FPU|PSE|etc.)

  eax = cpuid(0x80000001, 0) # SNP-verified against host values
  # check eax against REQUIRED_MASK1 (LM|3DNOW)

  edx = cpuid(1, 0) # SNP-verified against host values
  # check eax against SSE_MASK
  # if not set, try to force it on via MSR_K7_HWCR if this is an AMD CPU
  # if forcing fails, report no_longmode available

if_intel:
  # completely different stuff

It's possible that various lies about the values checked for in
REQUIRED_MASK0/REQUIRED_MASK1, LA57 enablement, etc., can be audited in
similar fashion as you've done above to find nothing concerning, but
what about 5 years from now? And these are all checks/configuration that
can put the kernel in unexpected states that persist beyond the point of
attestation, where we really need to care about the possible effects. If
SNP CPUID validation isn't utilized until after-the-fact, we'd end up
not utilizing it for some of the more 'interesting' CPUID bits.

It's also worth noting that TDX guards against most of this through
CPUID virtualization, where hardware/microcode provides similar
validation for these sorts of CPUID bits in early boot. It's only because
the SEV-SNP CPUID 'virtualization' lives in the guest code that we have to
deal with the additional complexity of initializing the CPUID table early
on. But if both platforms are capable of providing similar assurances then
it seems worthwhile to pursue that.

> acrobatics of causing #VCs just to detect the SEV MSR?

The CPUID calls in snp_cpuid_init() weren't added specifically to induce
the #VC-based SEV MSR read, they were added only because I thought the
gist of your earlier suggestions were to do more validation against the
CPUID table advertised by EFI rather than the v5 approach of deferring
them till later after sev_status gets set by sme_enable(), and then
failing later if turned out that SNP CPUID feature bit wasn't set by
sme_enable(). I thought this was motivated by a desire to be more
paranoid about what EFI provides, so once I had the cpuid checks added
in snp_cpuid_init() it seemed like a logical step to further
sanity-check the SNP CPUID bit using a mechanism that was completely
independent of the CPUID table.

I'm not dead set on that at all however, it was only added based on me
[mis-]interpreting your comments as a desire to be less trusting of EFI
as to whether this is an SNP guest or not. But I also don't think it's
a good idea to not utilize the CPUID table until after sme_enable(),
based on the above reasons.

What if we simply add a check in sme_enable() that terminates the guest
if the cc_blob/cpuid table is provided, but the CPUID/MSR checks in
sme_enable() determine that this isn't an SNP guest? That would be similar
to the v5 approach, but in a less roundabout way, and then the cpuid/MSR
checks could be dropped from snp_cpuid_init().

If we did decide that it is useful to use #VC-based initialization of
sev_status there, it would all be self-contained there (but again, that's
a separate thing that I don't have a strong opinion on).

Thanks,

Mike

> 
> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeople.kernel.org%2Ftglx%2Fnotes-about-netiquette&amp;data=04%7C01%7Cmichael.roth%40amd.com%7Cddb8c27d71794244176308d9926c094d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637701815061371715%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=LUArcQqb%2F0WFlworDbZOClXhgYqEGje364fpBycixUg%3D&amp;reserved=0