[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CALu+AoRSQpi7M9CFOS9BFmiszrRcvNucEicti_NuX2oSnObS0w@mail.gmail.com>
Date: Fri, 30 Aug 2024 16:49:07 +0800
From: Dave Young <dyoung@...hat.com>
To: Tom Lendacky <thomas.lendacky@....com>
Cc: Baoquan He <bhe@...hat.com>, linux-kernel@...r.kernel.org, noodles@...com,
x86@...nel.org, lijiang@...hat.com, kexec@...ts.infradead.org
Subject: Re: [PATCH] x86/mm/sme: fix the kdump kernel breakage on SME system
when CONFIG_IMA_KEXEC=y
On Mon, 26 Aug 2024 at 22:24, Tom Lendacky <thomas.lendacky@....com> wrote:
>
> On 8/25/24 21:44, Baoquan He wrote:
> > Recently, it's reported that kdump kernel is broken during bootup on
> > SME system when CONFIG_IMA_KEXEC=y. When debugging, I noticed this
> > can be traced back to commit ("b69a2afd5afc x86/kexec: Carry forward
> > IMA measurement log on kexec"). Just nobody ever tested it on SME
> > system when enabling CONFIG_IMA_KEXEC.
> >
> > --------------------------------------------------
> > ima: No TPM chip found, activating TPM-bypass!
> > Loading compiled-in module X.509 certificates
> > Loaded X.509 cert 'Build time autogenerated kernel key: 18ae0bc7e79b64700122bb1d6a904b070fef2656'
> > ima: Allocated hash algorithm: sha256
> > Oops: general protection fault, probably for non-canonical address 0xcfacfdfe6660003e: 0000 [#1] PREEMPT SMP NOPTI
> > CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc2+ #14
> > Hardware name: Dell Inc. PowerEdge R7425/02MJ3T, BIOS 1.20.0 05/03/2023
> > RIP: 0010:ima_restore_measurement_list+0xdc/0x420
> > Code: ff 48 c7 85 10 ff ff ff 00 00 00 00 48 c7 85 18 ff ff ff 00 00 00 00 48 85 f6 0f 84 09 03 00 00 48 83 fa 17 0f 86 ff 02 00 00 <66> 83 3e 01 49 89 f4 0f 85 90 94 7d 00 48 83 7e 10 ff 0f 84 74 94
> > RSP: 0018:ffffc90000053c80 EFLAGS: 00010286
> > RAX: 0000000000000000 RBX: ffffc90000053d03 RCX: 0000000000000000
> > RDX: e48066052d5df359 RSI: cfacfdfe6660003e RDI: cfacfdfe66600056
> > RBP: ffffc90000053d80 R08: 0000000000000000 R09: ffffffff82de1a88
> > R10: ffffc90000053da0 R11: 0000000000000003 R12: 00000000000001a4
> > R13: ffffc90000053df0 R14: 0000000000000000 R15: 0000000000000000
> > FS: 0000000000000000(0000) GS:ffff888040200000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007f2c744050e8 CR3: 000080004110e000 CR4: 00000000003506b0
> > Call Trace:
> > <TASK>
> > ? show_trace_log_lvl+0x1b0/0x2f0
> > ? show_trace_log_lvl+0x1b0/0x2f0
> > ? ima_load_kexec_buffer+0x6e/0xf0
> > ? __die_body.cold+0x8/0x12
> > ? die_addr+0x3c/0x60
> > ? exc_general_protection+0x178/0x410
> > ? asm_exc_general_protection+0x26/0x30
> > ? ima_restore_measurement_list+0xdc/0x420
> > ? vprintk_emit+0x1f0/0x270
> > ? ima_load_kexec_buffer+0x6e/0xf0
> > ima_load_kexec_buffer+0x6e/0xf0
> > ima_init+0x52/0xb0
> > ? __pfx_init_ima+0x10/0x10
> > init_ima+0x26/0xc0
> > ? __pfx_init_ima+0x10/0x10
> > do_one_initcall+0x5b/0x300
> > do_initcalls+0xdf/0x100
> > ? __pfx_kernel_init+0x10/0x10
> > kernel_init_freeable+0x147/0x1a0
> > kernel_init+0x1a/0x140
> > ret_from_fork+0x34/0x50
> > ? __pfx_kernel_init+0x10/0x10
> > ret_from_fork_asm+0x1a/0x30
> > </TASK>
> > Modules linked in:
> > ---[ end trace 0000000000000000 ]---
> > RIP: 0010:ima_restore_measurement_list+0xdc/0x420
> > Code: ff 48 c7 85 10 ff ff ff 00 00 00 00 48 c7 85 18 ff ff ff 00 00 00 00 48 85 f6 0f 84 09 03 00 00 48 83 fa 17 0f 86 ff 02 00 00 <66> 83 3e 01 49 89 f4 0f 85 90 94 7d 00 48 83 7e 10 ff 0f 84 74 94
> > RSP: 0018:ffffc90000053c80 EFLAGS: 00010286
> > RAX: 0000000000000000 RBX: ffffc90000053d03 RCX: 0000000000000000
> > RDX: e48066052d5df359 RSI: cfacfdfe6660003e RDI: cfacfdfe66600056
> > RBP: ffffc90000053d80 R08: 0000000000000000 R09: ffffffff82de1a88
> > R10: ffffc90000053da0 R11: 0000000000000003 R12: 00000000000001a4
> > R13: ffffc90000053df0 R14: 0000000000000000 R15: 0000000000000000
> > FS: 0000000000000000(0000) GS:ffff888040200000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007f2c744050e8 CR3: 000080004110e000 CR4: 00000000003506b0
> > Kernel panic - not syncing: Fatal exception
> > Kernel Offset: disabled
> > Rebooting in 10 seconds..
> >
> > From debugging printing, the stored addr and size of ima_kexec buffer
> > are not decrypted correctly like:
> > ------
> > ima: ima_load_kexec_buffer, buffer:0xcfacfdfe6660003e, size:0xe48066052d5df359
> > ------
> >
> > There are three pieces of setup_data info passed to kexec/kdump kernel:
> > SETUP_EFI, SETUP_IMA and SETUP_RNG_SEED. However, among them, only
> > ima_kexec buffer suffered from the incorrect decryption. After
> > debugging, it's because of the code bug in early_memremap_is_setup_data()
> > where checking the embedded content inside setup_data takes wrong range
> > calculation.
> >
> > The length of efi data, rng_seed and ima_kexec are 0x70, 0x20, 0x10,
> > and the length of setup_data is 0x10. When checking if data is inside
> > the embedded conent of setup_data, the starting address of efi data and
> > rng_seed happened to land in the wrong calculated range. While the
> > ima_kexec's starting address unluckily doesn't pass the checking, then
> > error occurred.
> >
> > Here fix the code bug to make kexec/kdump kernel boot up successfully.
> >
> > Fixes: 8f716c9b5feb ("x86/mm: Add support to access boot related data in the clear")
>
> The check that was modified was added by:
> b3c72fc9a78e ("x86/boot: Introduce setup_indirect")
>
> The SETUP_INDIRECT patches seem to be the issue here.
>
> > Signed-off-by: Baoquan He <bhe@...hat.com>
> > ---
> > arch/x86/mm/ioremap.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> > index aa7d279321ea..7953c4a1d28d 100644
> > --- a/arch/x86/mm/ioremap.c
> > +++ b/arch/x86/mm/ioremap.c
> > @@ -717,7 +717,7 @@ static bool __init early_memremap_is_setup_data(resource_size_t phys_addr,
> > paddr_next = data->next;
> > len = data->len;
> >
> > - if ((phys_addr > paddr) && (phys_addr < (paddr + len))) {
> > + if ((phys_addr > paddr) && (phys_addr < (paddr + size + len))) {
>
Tom, I have an old laptop with AMD cpu, according to dmesg below I
assume SME is enabled:
[ 0.082438] Memory Encryption Features active: AMD SME
I tried kexec with mem_encrypt=on, but I can not reproduce Baoquan's
problem, it seems either ENC or NOENC works, it is weird to me. Any
idea about this?
Otherwise I tried earlycon=efi, early printk does not work. It works
with "NOENC" mapping though:
@@ -59,7 +62,8 @@ static __ref void *efi_earlycon_map(unsigned long
start, unsigned long len)
if (efi_fb)
return efi_fb + start;
- fb_prot = fb_wb ? PAGE_KERNEL : pgprot_writecombine(PAGE_KERNEL);
+ //fb_prot = fb_wb ? PAGE_KERNEL : pgprot_writecombine(PAGE_KERNEL);
+ fb_prot = fb_wb ? PAGE_KERNEL_NOENC :
pgprot_writecombine(PAGE_KERNEL_NOENC);
return early_memremap_prot(fb_base + start, len, pgprot_val(fb_prot));
}
Powered by blists - more mailing lists