lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 17 May 2021 19:53:05 -0700
From:   "Kuppuswamy, Sathyanarayanan" 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>
To:     Dan Williams <dan.j.williams@...el.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Tony Luck <tony.luck@...el.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Kirill Shutemov <kirill.shutemov@...ux.intel.com>,
        Kuppuswamy Sathyanarayanan <knsathya@...nel.org>,
        Raj Ashok <ashok.raj@...el.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Sean Christopherson <sean.j.christopherson@...el.com>,
        Kai Huang <kai.huang@...el.com>
Subject: Re: [RFC v2-fix 1/1] x86/boot: Add a trampoline for APs booting in
 64-bit mode



On 5/17/21 7:06 PM, Dan Williams wrote:
> I notice that you have [RFC v2-fix 1/1] as the prefix for this patch.
> b4 recently gained support for partial series re-rolls [1], but I
> think you would need to bump the version number [RFC PATCH v3 21/32]
> and maintain the patch numbering. In this case with changes moving
> between patches, and those other patches being squashed any chance of
> automated reconstruction of this series is likely lost.

Ok. I will make sure to bump the version in next partial re-roll.

If I am fixing this patch as per your comments, do I need bump the
patch version for it as well?

> 
> Just wanted to note that for future reference in case you were hoping
> to avoid resending full series in the future. For now, some more
> comments below:

Thanks.

> 
> [1]: https://lore.kernel.org/tools/20210517161317.teawoh5qovxpmqdc@nitro.local/
> 
> On Mon, May 17, 2021 at 5:54 PM Kuppuswamy Sathyanarayanan
> <sathyanarayanan.kuppuswamy@...ux.intel.com> wrote:
>>
>> From: Sean Christopherson <sean.j.christopherson@...el.com>
>>
>> Add a trampoline for booting APs in 64-bit mode via a software handoff
>> with BIOS, and use the new trampoline for the ACPI MP wake protocol used
>> by TDX. You can find MADT MP wake protocol details in ACPI specification
>> r6.4, sec 5.2.12.19.
>>
>> Extend the real mode IDT pointer by four bytes to support LIDT in 64-bit
>> mode.  For the GDT pointer, create a new entry as the existing storage
>> for the pointer occupies the zero entry in the GDT itself.
>>
>> Reported-by: Kai Huang <kai.huang@...el.com>
>> Signed-off-by: Sean Christopherson <sean.j.christopherson@...el.com>
>> Reviewed-by: Andi Kleen <ak@...ux.intel.com>
>> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
>> ---
>>
>> Changes since RFC v2:
>>   * Removed X86_CR0_NE and EFER related changes from this changes
> 
> This was only partially done, see below...
> 
>>     and moved it to patch titled "x86/boot: Avoid #VE during
>>     boot for TDX platforms"
>>   * Fixed commit log as per Dan's suggestion.
>>   * Added inline get_trampoline_start_ip() to set start_ip.
> 
> You also added a comment to tr_idt, but didn't mention it here, so I
> went to double check. Please take care to document all changes to the
> patch from the previous review.

Ok. I will make sure change log is current.

> 
>>
>>   arch/x86/boot/compressed/pgtable.h       |  2 +-
>>   arch/x86/include/asm/realmode.h          | 10 +++++++
>>   arch/x86/kernel/smpboot.c                |  2 +-
>>   arch/x86/realmode/rm/header.S            |  1 +
>>   arch/x86/realmode/rm/trampoline_64.S     | 38 ++++++++++++++++++++++++
>>   arch/x86/realmode/rm/trampoline_common.S |  7 ++++-
>>   6 files changed, 57 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/boot/compressed/pgtable.h b/arch/x86/boot/compressed/pgtable.h
>> index 6ff7e81b5628..cc9b2529a086 100644
>> --- a/arch/x86/boot/compressed/pgtable.h
>> +++ b/arch/x86/boot/compressed/pgtable.h
>> @@ -6,7 +6,7 @@
>>   #define TRAMPOLINE_32BIT_PGTABLE_OFFSET        0
>>
>>   #define TRAMPOLINE_32BIT_CODE_OFFSET   PAGE_SIZE
>> -#define TRAMPOLINE_32BIT_CODE_SIZE     0x70
>> +#define TRAMPOLINE_32BIT_CODE_SIZE     0x80
>>
>>   #define TRAMPOLINE_32BIT_STACK_END     TRAMPOLINE_32BIT_SIZE
>>
>> diff --git a/arch/x86/include/asm/realmode.h b/arch/x86/include/asm/realmode.h
>> index 5db5d083c873..3328c8edb200 100644
>> --- a/arch/x86/include/asm/realmode.h
>> +++ b/arch/x86/include/asm/realmode.h
>> @@ -25,6 +25,7 @@ struct real_mode_header {
>>          u32     sev_es_trampoline_start;
>>   #endif
>>   #ifdef CONFIG_X86_64
>> +       u32     trampoline_start64;
>>          u32     trampoline_pgd;
>>   #endif
>>          /* ACPI S3 wakeup */
>> @@ -88,6 +89,15 @@ static inline void set_real_mode_mem(phys_addr_t mem)
>>          real_mode_header = (struct real_mode_header *) __va(mem);
>>   }
>>
>> +static inline unsigned long get_trampoline_start_ip(void)
> 
> I'd prefer this helper take a 'struct real_mode_header *rmh' as an
> argument rather than assume a global variable.

I am fine with it. But existing inline functions also directly read/writes
the real_mode_header. So I just followed the same format.

I will fix this in next version.

> 
>> +{
>> +#ifdef CONFIG_X86_64
>> +        if (is_tdx_guest())
>> +                return real_mode_header->trampoline_start64;
>> +#endif
>> +       return real_mode_header->trampoline_start;
>> +}
>> +
>>   void reserve_real_mode(void);
>>
>>   #endif /* __ASSEMBLY__ */
>> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
>> index 16703c35a944..0b4dff5e67a9 100644
>> --- a/arch/x86/kernel/smpboot.c
>> +++ b/arch/x86/kernel/smpboot.c
>> @@ -1031,7 +1031,7 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle,
>>                         int *cpu0_nmi_registered)
>>   {
>>          /* start_ip had better be page-aligned! */
>> -       unsigned long start_ip = real_mode_header->trampoline_start;
>> +       unsigned long start_ip = get_trampoline_start_ip();
>>
>>          unsigned long boot_error = 0;
>>          unsigned long timeout;
>> diff --git a/arch/x86/realmode/rm/header.S b/arch/x86/realmode/rm/header.S
>> index 8c1db5bf5d78..2eb62be6d256 100644
>> --- a/arch/x86/realmode/rm/header.S
>> +++ b/arch/x86/realmode/rm/header.S
>> @@ -24,6 +24,7 @@ SYM_DATA_START(real_mode_header)
>>          .long   pa_sev_es_trampoline_start
>>   #endif
>>   #ifdef CONFIG_X86_64
>> +       .long   pa_trampoline_start64
>>          .long   pa_trampoline_pgd;
>>   #endif
>>          /* ACPI S3 wakeup */
>> diff --git a/arch/x86/realmode/rm/trampoline_64.S b/arch/x86/realmode/rm/trampoline_64.S
>> index 84c5d1b33d10..754f8d2ac9e8 100644
>> --- a/arch/x86/realmode/rm/trampoline_64.S
>> +++ b/arch/x86/realmode/rm/trampoline_64.S
>> @@ -161,6 +161,19 @@ SYM_CODE_START(startup_32)
>>          ljmpl   $__KERNEL_CS, $pa_startup_64
>>   SYM_CODE_END(startup_32)
>>
>> +SYM_CODE_START(pa_trampoline_compat)
>> +       /*
>> +        * In compatibility mode.  Prep ESP and DX for startup_32, then disable
>> +        * paging and complete the switch to legacy 32-bit mode.
>> +        */
>> +       movl    $rm_stack_end, %esp
>> +       movw    $__KERNEL_DS, %dx
>> +
>> +       movl    $(X86_CR0_NE | X86_CR0_PE), %eax
> 
> Before this patch the startup path did not touch X86_CR0_NE. I assume
> it was added opportunistically for the TDX case? If it is to stay in
> this patch it deserves a code comment / mention in the changelog, or
> it needs to move to the other patch that fixes up the CR0 setup for
> TDX.

I will move X86_CR0_NE related update to the patch that has other
X86_CR0_NE related updates.

> 
> 
>> +       movl    %eax, %cr0
>> +       ljmpl   $__KERNEL32_CS, $pa_startup_32
>> +SYM_CODE_END(pa_trampoline_compat)
>> +
>>          .section ".text64","ax"
>>          .code64
>>          .balign 4
>> @@ -169,6 +182,20 @@ SYM_CODE_START(startup_64)
>>          jmpq    *tr_start(%rip)
>>   SYM_CODE_END(startup_64)
>>
>> +SYM_CODE_START(trampoline_start64)
>> +       /*
>> +        * APs start here on a direct transfer from 64-bit BIOS with identity
>> +        * mapped page tables.  Load the kernel's GDT in order to gear down to
>> +        * 32-bit mode (to handle 4-level vs. 5-level paging), and to (re)load
>> +        * segment registers.  Load the zero IDT so any fault triggers a
>> +        * shutdown instead of jumping back into BIOS.
>> +        */
>> +       lidt    tr_idt(%rip)
>> +       lgdt    tr_gdt64(%rip)
>> +
>> +       ljmpl   *tr_compat(%rip)
>> +SYM_CODE_END(trampoline_start64)
>> +
>>          .section ".rodata","a"
>>          # Duplicate the global descriptor table
>>          # so the kernel can live anywhere
>> @@ -182,6 +209,17 @@ SYM_DATA_START(tr_gdt)
>>          .quad   0x00cf93000000ffff      # __KERNEL_DS
>>   SYM_DATA_END_LABEL(tr_gdt, SYM_L_LOCAL, tr_gdt_end)
>>
>> +SYM_DATA_START(tr_gdt64)
>> +       .short  tr_gdt_end - tr_gdt - 1 # gdt limit
>> +       .long   pa_tr_gdt
>> +       .long   0
>> +SYM_DATA_END(tr_gdt64)
>> +
>> +SYM_DATA_START(tr_compat)
>> +       .long   pa_trampoline_compat
>> +       .short  __KERNEL32_CS
>> +SYM_DATA_END(tr_compat)
>> +
>>          .bss
>>          .balign PAGE_SIZE
>>   SYM_DATA(trampoline_pgd, .space PAGE_SIZE)
>> diff --git a/arch/x86/realmode/rm/trampoline_common.S b/arch/x86/realmode/rm/trampoline_common.S
>> index 5033e640f957..ade7db208e4e 100644
>> --- a/arch/x86/realmode/rm/trampoline_common.S
>> +++ b/arch/x86/realmode/rm/trampoline_common.S
>> @@ -1,4 +1,9 @@
>>   /* SPDX-License-Identifier: GPL-2.0 */
>>          .section ".rodata","a"
>>          .balign 16
>> -SYM_DATA_LOCAL(tr_idt, .fill 1, 6, 0)
>> +
>> +/* .fill cannot be used for size > 8. So use short and quad */
> 
> If there is to be a comment here it should be to clarify why @tr_idt
> is 10 bytes, not necessarily a quirk of the assembler.

Got it. I will fix the comment or remove it.

> 
>> +SYM_DATA_START_LOCAL(tr_idt)
> 
> The .fill restriction is only for @size, not @repeat. So, what's wrong
> with SYM_DATA_LOCAL(tr_idt, .fill 2, 5, 0)?

Any reason to prefer above change over previous code ?

SYM_DATA_START_LOCAL(tr_idt)
         .short  0
         .quad   0
SYM_DATA_END(tr_idt)

> 

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ