lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Yqjm/b3deMlxxePh@google.com>
Date:   Tue, 14 Jun 2022 19:52:29 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Tom Lendacky <thomas.lendacky@....com>
Cc:     Michael Roth <michael.roth@....com>, x86@...nel.org,
        linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        linux-efi@...r.kernel.org, platform-driver-x86@...r.kernel.org,
        linux-coco@...ts.linux.dev, linux-mm@...ck.org,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Joerg Roedel <jroedel@...e.de>,
        "H. Peter Anvin" <hpa@...or.com>, Ard Biesheuvel <ardb@...nel.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Jim Mattson <jmattson@...gle.com>,
        Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Sergio Lopez <slp@...hat.com>, Peter Gonda <pgonda@...gle.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
        David Rientjes <rientjes@...gle.com>,
        Dov Murik <dovmurik@...ux.ibm.com>,
        Tobin Feldman-Fitzthum <tobin@....com>,
        Borislav Petkov <bp@...en8.de>,
        Vlastimil Babka <vbabka@...e.cz>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Andi Kleen <ak@...ux.intel.com>,
        "Dr . David Alan Gilbert" <dgilbert@...hat.com>,
        brijesh.ksingh@...il.com, tony.luck@...el.com, marcorr@...gle.com,
        sathyanarayanan.kuppuswamy@...ux.intel.com
Subject: Re: [PATCH v12 19/46] x86/kernel: Make the .bss..decrypted section
 shared in RMP table

On Tue, Jun 14, 2022, Tom Lendacky wrote:
> On 6/14/22 11:13, Sean Christopherson wrote:
> > > > > This breaks SME on Rome and Milan when compiling with clang-13.  I haven't been
> > > > > able to figure out exactly what goes wrong.  printk isn't functional at this point,
> > > > > and interactive debug during boot on our test systems is beyond me.  I can't even
> > > > > verify that the bug is specific to clang because the draconian build system for our
> > > > > test systems apparently is stuck pointing at gcc-4.9.
> > > > > 
> > > > > I suspect the issue is related to relocation and/or encrypting memory, as skipping
> > > > > the call to early_snp_set_memory_shared() if SNP isn't active masks the issue.
> > > > > I've dug through the assembly and haven't spotted a smoking gun, e.g. no obvious
> > > > > use of absolute addresses.
> > > > > 
> > > > > Forcing a VM through the same path doesn't fail.  I can't test an SEV guest at the
> > > > > moment because INIT_EX is also broken.
> > > > 
> > > > The SEV INIT_EX was a PEBKAC issue.  An SEV guest boots just fine with a clang-built
> > > > kernel, so either it's a finnicky relocation issue or something specific to SME.
> > > 
> > > I just built and booted 5.19-rc2 with clang-13 and SME enabled without issue:
> > > 
> > > [    4.118226] Memory Encryption Features active: AMD SME
> > 
> > Phooey.
> > 
> > > Maybe something with your kernel config? Can you send me your config?
> > 
> > Attached.  If you can't repro, I'll find someone on our end to work on this.
> 
> I was able to repro. It dies in the cc_platform_has() code, where it is
> trying to do an indirect jump based on the attribute (actually in the
> amd_cc_platform_has() which I think has been optimized in):
> 
> bool cc_platform_has(enum cc_attr attr)

...

> ffffffff81002160:       ff 24 c5 c0 01 00 82    jmp    *-0x7dfffe40(,%rax,8)
> 
> This last line is what causes the reset. I'm guessing that the jump isn't
> valid at this point because we are running in identity mapped mode and not
> with a kernel virtual address at this point.
> 
> Trying to see what the difference was between your config and mine, the
> indirect jump lead me to check the setting of CONFIG_RETPOLINE. Your config
> did not have it enabled, so I set CONFIG_RETPOLINE=y, and with that, the
> kernel boots successfully.

That would explain why my VMs didn't fail, I build those kernels with CONFIG_RETPOLINE=y.

> With retpolines, the code is completely different around here:

...

> I'm not sure if there's a way to remove the jump table optimization for
> the arch/x86/coco/core.c file when retpolines aren't configured.

And for post-boot I don't think we'd want to disable any such optimizations.

A possibled "fix" would be to do what sme_encrypt_kernel() does and just query
sev_status directly.  But even that works, the fragility of the boot code is
terrifying :-(  I can't think of any clever solutions though.

Many thanks again Tom!

---
 arch/x86/include/asm/sev.h |  4 ++++
 arch/x86/kernel/head64.c   | 10 +++++++---
 arch/x86/kernel/sev.c      | 16 +++++++++++-----
 3 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 19514524f0f8..701c561fdf08 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -193,6 +193,8 @@ static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
 void setup_ghcb(void);
 void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
 					 unsigned int npages);
+void __init __early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+					  unsigned int npages);
 void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
 					unsigned int npages);
 void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
@@ -214,6 +216,8 @@ static inline void setup_ghcb(void) { }
 static inline void __init
 early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
 static inline void __init
+__early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
+static inline void __init
 early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
 static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
 static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npages) { }
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index bd4a34100ed0..5efab0d8e49d 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -127,7 +127,9 @@ static bool __head check_la57_support(unsigned long physaddr)
 }
 #endif

-static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdval_t *pmd)
+static unsigned long __head sme_postprocess_startup(struct boot_params *bp,
+						    pmdval_t *pmd,
+						    unsigned long physaddr)
 {
 	unsigned long vaddr, vaddr_end;
 	int i;
@@ -156,7 +158,9 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdv
 			 * address but the kernel is currently running off of the identity
 			 * mapping so use __pa() to get a *currently* valid virtual address.
 			 */
-			early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
+			if (sev_status & MSR_AMD64_SEV_SNP_ENABLED_BIT)
+				__early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr),
+							      PTRS_PER_PMD);

 			i = pmd_index(vaddr);
 			pmd[i] -= sme_get_me_mask();
@@ -316,7 +320,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
 	 */
 	*fixup_long(&phys_base, physaddr) += load_delta - sme_get_me_mask();

-	return sme_postprocess_startup(bp, pmd);
+	return sme_postprocess_startup(bp, pmd, physaddr);
 }

 /* Wipe all early page tables except for the kernel symbol map */
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index c05f0124c410..48966ecc520e 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -714,12 +714,9 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
 	pvalidate_pages(vaddr, npages, true);
 }

-void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
-					unsigned int npages)
+void __init __early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+					  unsigned int npages)
 {
-	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
-		return;
-
 	/* Invalidate the memory pages before they are marked shared in the RMP table. */
 	pvalidate_pages(vaddr, npages, false);

@@ -727,6 +724,15 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
 	early_set_pages_state(paddr, npages, SNP_PAGE_STATE_SHARED);
 }

+void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+					unsigned int npages)
+{
+	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		return;
+
+	__early_snp_set_memory_shared(vaddr, paddr, npages);
+}
+
 void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
 {
 	unsigned long vaddr, npages;

base-commit: b13baccc3850ca8b8cccbf8ed9912dbaa0fdf7f3
--

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ