lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 14 Jun 2022 14:00:56 -0500
From:   Tom Lendacky <thomas.lendacky@....com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Michael Roth <michael.roth@....com>, x86@...nel.org,
        linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        linux-efi@...r.kernel.org, platform-driver-x86@...r.kernel.org,
        linux-coco@...ts.linux.dev, linux-mm@...ck.org,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Joerg Roedel <jroedel@...e.de>,
        "H. Peter Anvin" <hpa@...or.com>, Ard Biesheuvel <ardb@...nel.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Jim Mattson <jmattson@...gle.com>,
        Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Sergio Lopez <slp@...hat.com>, Peter Gonda <pgonda@...gle.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
        David Rientjes <rientjes@...gle.com>,
        Dov Murik <dovmurik@...ux.ibm.com>,
        Tobin Feldman-Fitzthum <tobin@....com>,
        Borislav Petkov <bp@...en8.de>,
        Vlastimil Babka <vbabka@...e.cz>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Andi Kleen <ak@...ux.intel.com>,
        "Dr . David Alan Gilbert" <dgilbert@...hat.com>,
        brijesh.ksingh@...il.com, tony.luck@...el.com, marcorr@...gle.com,
        sathyanarayanan.kuppuswamy@...ux.intel.com
Subject: Re: [PATCH v12 19/46] x86/kernel: Make the .bss..decrypted section
 shared in RMP table

On 6/14/22 11:13, Sean Christopherson wrote:
> On Tue, Jun 14, 2022, Tom Lendacky wrote:
>> On 6/14/22 10:43, Sean Christopherson wrote:
>>> On Tue, Jun 14, 2022, Sean Christopherson wrote:
>>>> s/Brijesh/Michael
>>>>
>>>> On Mon, Mar 07, 2022, Brijesh Singh wrote:
>>>>> The encryption attribute for the .bss..decrypted section is cleared in the
>>>>> initial page table build. This is because the section contains the data
>>>>> that need to be shared between the guest and the hypervisor.
>>>>>
>>>>> When SEV-SNP is active, just clearing the encryption attribute in the
>>>>> page table is not enough. The page state need to be updated in the RMP
>>>>> table.
>>>>>
>>>>> Signed-off-by: Brijesh Singh <brijesh.singh@....com>
>>>>> ---
>>>>>    arch/x86/kernel/head64.c | 13 +++++++++++++
>>>>>    1 file changed, 13 insertions(+)
>>>>>
>>>>> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
>>>>> index 83514b9827e6..656d2f3e2cf0 100644
>>>>> --- a/arch/x86/kernel/head64.c
>>>>> +++ b/arch/x86/kernel/head64.c
>>>>> @@ -143,7 +143,20 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdv
>>>>>    	if (sme_get_me_mask()) {
>>>>>    		vaddr = (unsigned long)__start_bss_decrypted;
>>>>>    		vaddr_end = (unsigned long)__end_bss_decrypted;
>>>>> +
>>>>>    		for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
>>>>> +			/*
>>>>> +			 * On SNP, transition the page to shared in the RMP table so that
>>>>> +			 * it is consistent with the page table attribute change.
>>>>> +			 *
>>>>> +			 * __start_bss_decrypted has a virtual address in the high range
>>>>> +			 * mapping (kernel .text). PVALIDATE, by way of
>>>>> +			 * early_snp_set_memory_shared(), requires a valid virtual
>>>>> +			 * address but the kernel is currently running off of the identity
>>>>> +			 * mapping so use __pa() to get a *currently* valid virtual address.
>>>>> +			 */
>>>>> +			early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
>>>>
>>>> This breaks SME on Rome and Milan when compiling with clang-13.  I haven't been
>>>> able to figure out exactly what goes wrong.  printk isn't functional at this point,
>>>> and interactive debug during boot on our test systems is beyond me.  I can't even
>>>> verify that the bug is specific to clang because the draconian build system for our
>>>> test systems apparently is stuck pointing at gcc-4.9.
>>>>
>>>> I suspect the issue is related to relocation and/or encrypting memory, as skipping
>>>> the call to early_snp_set_memory_shared() if SNP isn't active masks the issue.
>>>> I've dug through the assembly and haven't spotted a smoking gun, e.g. no obvious
>>>> use of absolute addresses.
>>>>
>>>> Forcing a VM through the same path doesn't fail.  I can't test an SEV guest at the
>>>> moment because INIT_EX is also broken.
>>>
>>> The SEV INIT_EX was a PEBKAC issue.  An SEV guest boots just fine with a clang-built
>>> kernel, so either it's a finnicky relocation issue or something specific to SME.
>>
>> I just built and booted 5.19-rc2 with clang-13 and SME enabled without issue:
>>
>> [    4.118226] Memory Encryption Features active: AMD SME
> 
> Phooey.
> 
>> Maybe something with your kernel config? Can you send me your config?
> 
> Attached.  If you can't repro, I'll find someone on our end to work on this.

I was able to repro. It dies in the cc_platform_has() code, where it is
trying to do an indirect jump based on the attribute (actually in the
amd_cc_platform_has() which I think has been optimized in):

bool cc_platform_has(enum cc_attr attr)
{
ffffffff81002140:       55                      push   %rbp
ffffffff81002141:       48 89 e5                mov    %rsp,%rbp
         switch (vendor) {
ffffffff81002144:       8b 05 c6 e9 3a 01       mov    0x13ae9c6(%rip),%eax        # ffffffff823b0b10 <vendor>
ffffffff8100214a:       83 f8 03                cmp    $0x3,%eax
ffffffff8100214d:       74 25                   je     ffffffff81002174 <cc_platform_has+0x34>
ffffffff8100214f:       83 f8 02                cmp    $0x2,%eax
ffffffff81002152:       74 2f                   je     ffffffff81002183 <cc_platform_has+0x43>
ffffffff81002154:       83 f8 01                cmp    $0x1,%eax
ffffffff81002157:       75 26                   jne    ffffffff8100217f <cc_platform_has+0x3f>
         switch (attr) {
ffffffff81002159:       83 ff 05                cmp    $0x5,%edi
ffffffff8100215c:       77 21                   ja     ffffffff8100217f <cc_platform_has+0x3f>
ffffffff8100215e:       89 f8                   mov    %edi,%eax
ffffffff81002160:       ff 24 c5 c0 01 00 82    jmp    *-0x7dfffe40(,%rax,8)

This last line is what causes the reset. I'm guessing that the jump isn't
valid at this point because we are running in identity mapped mode and not
with a kernel virtual address at this point.

Trying to see what the difference was between your config and mine, the
indirect jump lead me to check the setting of CONFIG_RETPOLINE. Your config
did not have it enabled, so I set CONFIG_RETPOLINE=y, and with that, the
kernel boots successfully. With retpolines, the code is completely different
around here:

bool cc_platform_has(enum cc_attr attr)
{
ffffffff81001f30:       55                      push   %rbp
ffffffff81001f31:       48 89 e5                mov    %rsp,%rbp
         switch (vendor) {
ffffffff81001f34:       8b 05 26 8f 37 01       mov    0x1378f26(%rip),%eax        # ffffffff8237ae60 <vendor>
ffffffff81001f3a:       83 f8 03                cmp    $0x3,%eax
ffffffff81001f3d:       74 29                   je     ffffffff81001f68 <cc_platform_has+0x38>
ffffffff81001f3f:       83 f8 02                cmp    $0x2,%eax
ffffffff81001f42:       74 33                   je     ffffffff81001f77 <cc_platform_has+0x47>
ffffffff81001f44:       83 f8 01                cmp    $0x1,%eax
ffffffff81001f47:       75 2a                   jne    ffffffff81001f73 <cc_platform_has+0x43>
ffffffff81001f49:       31 c0                   xor    %eax,%eax
         switch (attr) {
ffffffff81001f4b:       83 ff 02                cmp    $0x2,%edi
ffffffff81001f4e:       7f 2f                   jg     ffffffff81001f7f <cc_platform_has+0x4f>
ffffffff81001f50:       85 ff                   test   %edi,%edi
ffffffff81001f52:       74 47                   je     ffffffff81001f9b <cc_platform_has+0x6b>
ffffffff81001f54:       83 ff 01                cmp    $0x1,%edi
ffffffff81001f57:       74 5b                   je     ffffffff81001fb4 <cc_platform_has+0x84>
ffffffff81001f59:       83 ff 02                cmp    $0x2,%edi
ffffffff81001f5c:       75 08                   jne    ffffffff81001f66 <cc_platform_has+0x36>
                 return sev_status & MSR_AMD64_SEV_ENABLED;
ffffffff81001f5e:       8a 05 44 3f 64 01       mov    0x1643f44(%rip),%al        # ffffffff82645ea8 <sev_status>
ffffffff81001f64:       24 01                   and    $0x1,%al
         case CC_VENDOR_HYPERV:
                 return hyperv_cc_platform_has(attr);
         default:
                 return false;
         }
}
ffffffff81001f66:       5d                      pop    %rbp
ffffffff81001f67:       c3                      ret
         switch (attr) {
ffffffff81001f68:       83 ff 07                cmp    $0x7,%edi
ffffffff81001f6b:       73 06                   jae    ffffffff81001f73 <cc_platform_has+0x43>
ffffffff81001f6d:       40 f6 c7 01             test   $0x1,%dil
ffffffff81001f71:       eb 07                   jmp    ffffffff81001f7a <cc_platform_has+0x4a>
ffffffff81001f73:       31 c0                   xor    %eax,%eax
}
ffffffff81001f75:       5d                      pop    %rbp
ffffffff81001f76:       c3                      ret
         return attr == CC_ATTR_GUEST_MEM_ENCRYPT;
ffffffff81001f77:       83 ff 02                cmp    $0x2,%edi
ffffffff81001f7a:       0f 94 c0                sete   %al
}
ffffffff81001f7d:       5d                      pop    %rbp
ffffffff81001f7e:       c3                      ret
         switch (attr) {
.
.
.

I'm not sure if there's a way to remove the jump table optimization for
the arch/x86/coco/core.c file when retpolines aren't configured.

Thanks,
Tom

> 
> Thanks!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ