[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <861a316c-09f6-5969-6238-e402fca917db@linux.intel.com>
Date: Mon, 17 May 2021 19:53:05 -0700
From: "Kuppuswamy, Sathyanarayanan"
<sathyanarayanan.kuppuswamy@...ux.intel.com>
To: Dan Williams <dan.j.williams@...el.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Andy Lutomirski <luto@...nel.org>,
Dave Hansen <dave.hansen@...el.com>,
Tony Luck <tony.luck@...el.com>,
Andi Kleen <ak@...ux.intel.com>,
Kirill Shutemov <kirill.shutemov@...ux.intel.com>,
Kuppuswamy Sathyanarayanan <knsathya@...nel.org>,
Raj Ashok <ashok.raj@...el.com>,
Sean Christopherson <seanjc@...gle.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Sean Christopherson <sean.j.christopherson@...el.com>,
Kai Huang <kai.huang@...el.com>
Subject: Re: [RFC v2-fix 1/1] x86/boot: Add a trampoline for APs booting in
64-bit mode
On 5/17/21 7:06 PM, Dan Williams wrote:
> I notice that you have [RFC v2-fix 1/1] as the prefix for this patch.
> b4 recently gained support for partial series re-rolls [1], but I
> think you would need to bump the version number [RFC PATCH v3 21/32]
> and maintain the patch numbering. In this case with changes moving
> between patches, and those other patches being squashed any chance of
> automated reconstruction of this series is likely lost.
Ok. I will make sure to bump the version in next partial re-roll.
If I am fixing this patch as per your comments, do I need bump the
patch version for it as well?
>
> Just wanted to note that for future reference in case you were hoping
> to avoid resending full series in the future. For now, some more
> comments below:
Thanks.
>
> [1]: https://lore.kernel.org/tools/20210517161317.teawoh5qovxpmqdc@nitro.local/
>
> On Mon, May 17, 2021 at 5:54 PM Kuppuswamy Sathyanarayanan
> <sathyanarayanan.kuppuswamy@...ux.intel.com> wrote:
>>
>> From: Sean Christopherson <sean.j.christopherson@...el.com>
>>
>> Add a trampoline for booting APs in 64-bit mode via a software handoff
>> with BIOS, and use the new trampoline for the ACPI MP wake protocol used
>> by TDX. You can find MADT MP wake protocol details in ACPI specification
>> r6.4, sec 5.2.12.19.
>>
>> Extend the real mode IDT pointer by four bytes to support LIDT in 64-bit
>> mode. For the GDT pointer, create a new entry as the existing storage
>> for the pointer occupies the zero entry in the GDT itself.
>>
>> Reported-by: Kai Huang <kai.huang@...el.com>
>> Signed-off-by: Sean Christopherson <sean.j.christopherson@...el.com>
>> Reviewed-by: Andi Kleen <ak@...ux.intel.com>
>> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
>> ---
>>
>> Changes since RFC v2:
>> * Removed X86_CR0_NE and EFER related changes from this changes
>
> This was only partially done, see below...
>
>> and moved it to patch titled "x86/boot: Avoid #VE during
>> boot for TDX platforms"
>> * Fixed commit log as per Dan's suggestion.
>> * Added inline get_trampoline_start_ip() to set start_ip.
>
> You also added a comment to tr_idt, but didn't mention it here, so I
> went to double check. Please take care to document all changes to the
> patch from the previous review.
Ok. I will make sure change log is current.
>
>>
>> arch/x86/boot/compressed/pgtable.h | 2 +-
>> arch/x86/include/asm/realmode.h | 10 +++++++
>> arch/x86/kernel/smpboot.c | 2 +-
>> arch/x86/realmode/rm/header.S | 1 +
>> arch/x86/realmode/rm/trampoline_64.S | 38 ++++++++++++++++++++++++
>> arch/x86/realmode/rm/trampoline_common.S | 7 ++++-
>> 6 files changed, 57 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/boot/compressed/pgtable.h b/arch/x86/boot/compressed/pgtable.h
>> index 6ff7e81b5628..cc9b2529a086 100644
>> --- a/arch/x86/boot/compressed/pgtable.h
>> +++ b/arch/x86/boot/compressed/pgtable.h
>> @@ -6,7 +6,7 @@
>> #define TRAMPOLINE_32BIT_PGTABLE_OFFSET 0
>>
>> #define TRAMPOLINE_32BIT_CODE_OFFSET PAGE_SIZE
>> -#define TRAMPOLINE_32BIT_CODE_SIZE 0x70
>> +#define TRAMPOLINE_32BIT_CODE_SIZE 0x80
>>
>> #define TRAMPOLINE_32BIT_STACK_END TRAMPOLINE_32BIT_SIZE
>>
>> diff --git a/arch/x86/include/asm/realmode.h b/arch/x86/include/asm/realmode.h
>> index 5db5d083c873..3328c8edb200 100644
>> --- a/arch/x86/include/asm/realmode.h
>> +++ b/arch/x86/include/asm/realmode.h
>> @@ -25,6 +25,7 @@ struct real_mode_header {
>> u32 sev_es_trampoline_start;
>> #endif
>> #ifdef CONFIG_X86_64
>> + u32 trampoline_start64;
>> u32 trampoline_pgd;
>> #endif
>> /* ACPI S3 wakeup */
>> @@ -88,6 +89,15 @@ static inline void set_real_mode_mem(phys_addr_t mem)
>> real_mode_header = (struct real_mode_header *) __va(mem);
>> }
>>
>> +static inline unsigned long get_trampoline_start_ip(void)
>
> I'd prefer this helper take a 'struct real_mode_header *rmh' as an
> argument rather than assume a global variable.
I am fine with it. But existing inline functions also directly read/writes
the real_mode_header. So I just followed the same format.
I will fix this in next version.
>
>> +{
>> +#ifdef CONFIG_X86_64
>> + if (is_tdx_guest())
>> + return real_mode_header->trampoline_start64;
>> +#endif
>> + return real_mode_header->trampoline_start;
>> +}
>> +
>> void reserve_real_mode(void);
>>
>> #endif /* __ASSEMBLY__ */
>> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
>> index 16703c35a944..0b4dff5e67a9 100644
>> --- a/arch/x86/kernel/smpboot.c
>> +++ b/arch/x86/kernel/smpboot.c
>> @@ -1031,7 +1031,7 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle,
>> int *cpu0_nmi_registered)
>> {
>> /* start_ip had better be page-aligned! */
>> - unsigned long start_ip = real_mode_header->trampoline_start;
>> + unsigned long start_ip = get_trampoline_start_ip();
>>
>> unsigned long boot_error = 0;
>> unsigned long timeout;
>> diff --git a/arch/x86/realmode/rm/header.S b/arch/x86/realmode/rm/header.S
>> index 8c1db5bf5d78..2eb62be6d256 100644
>> --- a/arch/x86/realmode/rm/header.S
>> +++ b/arch/x86/realmode/rm/header.S
>> @@ -24,6 +24,7 @@ SYM_DATA_START(real_mode_header)
>> .long pa_sev_es_trampoline_start
>> #endif
>> #ifdef CONFIG_X86_64
>> + .long pa_trampoline_start64
>> .long pa_trampoline_pgd;
>> #endif
>> /* ACPI S3 wakeup */
>> diff --git a/arch/x86/realmode/rm/trampoline_64.S b/arch/x86/realmode/rm/trampoline_64.S
>> index 84c5d1b33d10..754f8d2ac9e8 100644
>> --- a/arch/x86/realmode/rm/trampoline_64.S
>> +++ b/arch/x86/realmode/rm/trampoline_64.S
>> @@ -161,6 +161,19 @@ SYM_CODE_START(startup_32)
>> ljmpl $__KERNEL_CS, $pa_startup_64
>> SYM_CODE_END(startup_32)
>>
>> +SYM_CODE_START(pa_trampoline_compat)
>> + /*
>> + * In compatibility mode. Prep ESP and DX for startup_32, then disable
>> + * paging and complete the switch to legacy 32-bit mode.
>> + */
>> + movl $rm_stack_end, %esp
>> + movw $__KERNEL_DS, %dx
>> +
>> + movl $(X86_CR0_NE | X86_CR0_PE), %eax
>
> Before this patch the startup path did not touch X86_CR0_NE. I assume
> it was added opportunistically for the TDX case? If it is to stay in
> this patch it deserves a code comment / mention in the changelog, or
> it needs to move to the other patch that fixes up the CR0 setup for
> TDX.
I will move X86_CR0_NE related update to the patch that has other
X86_CR0_NE related updates.
>
>
>> + movl %eax, %cr0
>> + ljmpl $__KERNEL32_CS, $pa_startup_32
>> +SYM_CODE_END(pa_trampoline_compat)
>> +
>> .section ".text64","ax"
>> .code64
>> .balign 4
>> @@ -169,6 +182,20 @@ SYM_CODE_START(startup_64)
>> jmpq *tr_start(%rip)
>> SYM_CODE_END(startup_64)
>>
>> +SYM_CODE_START(trampoline_start64)
>> + /*
>> + * APs start here on a direct transfer from 64-bit BIOS with identity
>> + * mapped page tables. Load the kernel's GDT in order to gear down to
>> + * 32-bit mode (to handle 4-level vs. 5-level paging), and to (re)load
>> + * segment registers. Load the zero IDT so any fault triggers a
>> + * shutdown instead of jumping back into BIOS.
>> + */
>> + lidt tr_idt(%rip)
>> + lgdt tr_gdt64(%rip)
>> +
>> + ljmpl *tr_compat(%rip)
>> +SYM_CODE_END(trampoline_start64)
>> +
>> .section ".rodata","a"
>> # Duplicate the global descriptor table
>> # so the kernel can live anywhere
>> @@ -182,6 +209,17 @@ SYM_DATA_START(tr_gdt)
>> .quad 0x00cf93000000ffff # __KERNEL_DS
>> SYM_DATA_END_LABEL(tr_gdt, SYM_L_LOCAL, tr_gdt_end)
>>
>> +SYM_DATA_START(tr_gdt64)
>> + .short tr_gdt_end - tr_gdt - 1 # gdt limit
>> + .long pa_tr_gdt
>> + .long 0
>> +SYM_DATA_END(tr_gdt64)
>> +
>> +SYM_DATA_START(tr_compat)
>> + .long pa_trampoline_compat
>> + .short __KERNEL32_CS
>> +SYM_DATA_END(tr_compat)
>> +
>> .bss
>> .balign PAGE_SIZE
>> SYM_DATA(trampoline_pgd, .space PAGE_SIZE)
>> diff --git a/arch/x86/realmode/rm/trampoline_common.S b/arch/x86/realmode/rm/trampoline_common.S
>> index 5033e640f957..ade7db208e4e 100644
>> --- a/arch/x86/realmode/rm/trampoline_common.S
>> +++ b/arch/x86/realmode/rm/trampoline_common.S
>> @@ -1,4 +1,9 @@
>> /* SPDX-License-Identifier: GPL-2.0 */
>> .section ".rodata","a"
>> .balign 16
>> -SYM_DATA_LOCAL(tr_idt, .fill 1, 6, 0)
>> +
>> +/* .fill cannot be used for size > 8. So use short and quad */
>
> If there is to be a comment here it should be to clarify why @tr_idt
> is 10 bytes, not necessarily a quirk of the assembler.
Got it. I will fix the comment or remove it.
>
>> +SYM_DATA_START_LOCAL(tr_idt)
>
> The .fill restriction is only for @size, not @repeat. So, what's wrong
> with SYM_DATA_LOCAL(tr_idt, .fill 2, 5, 0)?
Any reason to prefer above change over previous code ?
SYM_DATA_START_LOCAL(tr_idt)
.short 0
.quad 0
SYM_DATA_END(tr_idt)
>
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
Powered by blists - more mailing lists