lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zl4ASZfWMkvAcJXO@kernel.org>
Date: Mon, 3 Jun 2024 20:41:29 +0300
From: Mike Rapoport <rppt@...nel.org>
To: "Kalra, Ashish" <ashish.kalra@....com>
Cc: Borislav Petkov <bp@...en8.de>, tglx@...utronix.de, mingo@...hat.com,
	dave.hansen@...ux.intel.com, x86@...nel.org, rafael@...nel.org,
	hpa@...or.com, peterz@...radead.org, adrian.hunter@...el.com,
	sathyanarayanan.kuppuswamy@...ux.intel.com, jun.nakajima@...el.com,
	rick.p.edgecombe@...el.com, thomas.lendacky@....com,
	michael.roth@....com, seanjc@...gle.com, kai.huang@...el.com,
	bhe@...hat.com, kirill.shutemov@...ux.intel.com, bdas@...hat.com,
	vkuznets@...hat.com, dionnaglaze@...gle.com, anisinha@...hat.com,
	jroedel@...e.de, ardb@...nel.org, kexec@...ts.infradead.org,
	linux-coco@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec

On Mon, Jun 03, 2024 at 11:56:01AM -0500, Kalra, Ashish wrote:
> On 6/3/2024 10:29 AM, Mike Rapoport wrote:
> 
> > On Mon, Jun 03, 2024 at 09:01:49AM -0500, Kalra, Ashish wrote:
> > > On 6/3/2024 8:39 AM, Mike Rapoport wrote:
> > > 
> > > > On Mon, Jun 03, 2024 at 08:06:56AM -0500, Kalra, Ashish wrote:
> > > > > On 6/3/2024 3:56 AM, Borislav Petkov wrote
> > > > > 
> > > > > > > EFI memory map and due to early allocation it uses memblock allocation.
> > > > > > > 
> > > > > > > Later during boot, efi_enter_virtual_mode() calls kexec_enter_virtual_mode()
> > > > > > > in case of a kexec-ed kernel boot.
> > > > > > > 
> > > > > > > This function kexec_enter_virtual_mode() installs the new EFI memory map by
> > > > > > > calling efi_memmap_init_late() which remaps the efi_memmap physically allocated
> > > > > > > in efi_arch_mem_reserve(), but this remapping is still using memblock allocation.
> > > > > > > 
> > > > > > > Subsequently, when memblock is freed later in boot flow, this remapped
> > > > > > > efi_memmap will have random corruption (similar to a use-after-free scenario).
> > > > > > > 
> > > > > > > The corrupted EFI memory map is then passed to the next kexec-ed kernel
> > > > > > > which causes a panic when trying to use the corrupted EFI memory map.
> > > > > > This sounds fishy: memblock allocated memory is not freed later in the
> > > > > > boot - it remains reserved. Only free memory is freed from memblock to
> > > > > > the buddy allocator.
> > > > > > 
> > > > > > Or is the problem that memblock-allocated memory cannot be memremapped
> > > > > > because *raisins*?
> > > > > This is what seems to be happening:
> > > > > 
> > > > > efi_arch_mem_reserve() calls efi_memmap_alloc() to allocate memory for
> > > > > EFI memory map and due to early allocation it uses memblock allocation.
> > > > > 
> > > > > And later efi_enter_virtual_mode() calls kexec_enter_virtual_mode()
> > > > > in case of a kexec-ed kernel boot.
> > > > > 
> > > > > This function kexec_enter_virtual_mode() installs the new EFI memory map by
> > > > > calling efi_memmap_init_late() which does memremap() on memblock-allocated memory.
> > > > Does the issue happen only with SNP?
> > > This is observed under SNP as efi_arch_mem_reserve() is only being called
> > > with SNP enabled and then efi_arch_mem_reserve() allocates EFI memory map
> > > using memblock.
> > I don't see how efi_arch_mem_reserve() is only called with SNP. What did I
> > miss?
> 
> This is the call stack for efi_arch_mem_reserve():
> 
> [    0.310010]  efi_arch_mem_reserve+0xb1/0x220
> [    0.311382]  efi_mem_reserve+0x36/0x60
> [    0.311973]  efi_bgrt_init+0x17d/0x1a0
> [    0.313265]  acpi_parse_bgrt+0x12/0x20
> [    0.313858]  acpi_table_parse+0x77/0xd0
> [    0.314463]  acpi_boot_init+0x362/0x630
> [    0.315069]  setup_arch+0xa88/0xf80
> [    0.315629]  start_kernel+0x68/0xa90
> [    0.316194]  x86_64_start_reservations+0x1c/0x30
> [    0.316921]  x86_64_start_kernel+0xbf/0x110
> [    0.317582]  common_startup_64+0x13e/0x141
> 
> So, probably it is being invoked specifically for AMD platform ?

AFAIU, efi_bgrt_init() can be called for any x86 platform, with or without
encryption. 
So if my understating is correct, efi_arch_mem_reserve() will be called with SNP
disabled as well. And if kexec works ok without SNP but fails with SNP this
may give as a clue to the root cause of the failure.
 
> > > If we skip efi_arch_mem_reserve() (which should probably be anyway skipped
> > > for kexec case), then for kexec boot, EFI memmap is memremapped in the same
> > > virtual address as the first kernel and not the allocated memblock address.
> > Maybe we should skip efi_arch_mem_reserve() for kexec case, but I think we
> > still need to understand what's causing memory corruption.
> 
> When, efi_arch_mem_reserve() allocates memory for EFI memory map using
> memblock and then later in boot, kexec_enter_virtual_mode() does memremap on
> this memblock allocated memory, subsequently after this i see EFI memory map
> corruption, so are there are any issues doing memremap on memblock-allocated
> memory ?

memblock-allocated memory is just RAM, so my take is that memremap() cannot
figure out the encryption bits properly.

You can check if there are issues with memrmapp()ing memblock-allocated
memory by sticking memblock_phys_alloc() somewhere, filling that memory with a
pattern and then calling memremap(addr, size, MEMREMAP_WB) and checking if
the pattern is still there.
 
> Thanks, Ashish
> 
> > > > I didn't really dig, but my theory would be that it has something to do
> > > > with arch_memremap_can_ram_remap() in arch/x86/mm/ioremap.c
> > > > > Thanks, Ashish

-- 
Sincerely yours,
Mike.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ