lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMj1kXHS0rr9DfKCeD-Zz7y1Bk-3ncn2cEgVmnWE0Jq1B=+Acg@mail.gmail.com>
Date: Sun, 21 Jul 2024 15:26:27 +0200
From: Ard Biesheuvel <ardb@...nel.org>
To: "Shao, Marshall" <Marshall.Shao@...l.com>
Cc: "linux-efi@...r.kernel.org" <linux-efi@...r.kernel.org>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "hpa@...or.com" <hpa@...or.com>, 
	"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>, "bp@...en8.de" <bp@...en8.de>, 
	"mingo@...hat.com" <mingo@...hat.com>, "tglx@...utronix.de" <tglx@...utronix.de>, 
	"Mishra, Ashish" <Ashish.Mishra4@...l.com>, "Chia, Jia Yuan" <JiaYuan.Chia@...l.com>, 
	"Dion, Christopher" <Christopher.Dion@...l.com>, "Caisse, Joe" <Joe.Caisse@...l.com>, 
	"Mukundan, Govind" <Govind.Mukundan@...l.com>
Subject: Re: [Patch] Do not clear BSS region in x86 stub

On Sat, 20 Jul 2024 at 17:40, Shao, Marshall <Marshall.Shao@...l.com> wrote:
>
> Hi Ard,
>
> Thanks for your reply. Here is the problem I encountered, and the problem
> still exists in the mainline:
>
> Firmware: UEFI 2.4,
> EFI loader: systemd-boot
>
> When I update my kernel from LTS v6.1.x to v6.6.x, the kernel fails to
>  boot when it jumps from the handover protocol. And it fails only on an
> old Firmware that MemoryOverwriteRequestControl(MOR) bit is set to 1.
>
> According to the TCG Reset Attack Mitigation Spec, a Memory Clear Method
> will be loaded in the memory on boot, and this causes the EFI loader
> and OS memory address to shift. Another important factor is that my
> BSS region and boot_idt entries are not as clean as those in QEMU and
> other platforms.
>
> This means that clearing BSS in the handover function can cause the
> firmware's IDT/GDT corruption, as you mentioned here commit <7b8439b0369b>

This commit ID does not exist, I take it you mean
ebf5a79acf9a2970e93d30a9e97b08913ef15711

> (x86/efistub: Drop redundant clearing of BSS).

systemd-boot does not use the handover entrypoint, it uses the native
PE entrypoint, which no longer clears BSS with the commit above
applied.

> I noticed that most of
> variables in BSS will be initialized before accessing, and only a few are
> not, that's why I removed memset part in handover entry point.
>

It would be better not to rely on special semantics that deviate from
standard C. That said, it would also be better not to rely on this
funky EFI handover protocol in the first place, which is only
implemented by downstream, distro versions of GRUB. Given that GRUB
now supports the native EFI entrypoint properly, the handover protocol
is essentially deprecated.

This does not mean, of course, that we should stop supporting it. But
removing the memset() there may break things in a way that only
becomes apparent once the changes trickle down to systems running
those older GRUBs.

> In terms of the BOOT_STACK_SIZE, I found out that the boot_stack region is
> used in this case (at the beginning of _bss), and 0x4000 is not sufficient
> to cover this issue since the data in boot_heap will be overwritten during
> decompression.

EFI boot uses the EFI firmware stack, and never enters the
decompressor, where the address of boot_stack_end is loaded into
ESP/RSP.

This means that increasing the boot stack size does not prevent an
overrun, but is affecting the image layout in a different way,
avoiding your boot issue.

It would be nice to get to the bottom of this: can you double check
that the PE image metadata is accurate? llvm-readelf -a can be used to
dump the PE header of a bzImage.

> Prior to image decompression in EFI stub, it was ok.
>
> 00000000008fc000 g       .bss   0000000000000000 .hidden _bss
> 00000000008fc000 l     O .bss   0000000000004000 boot_stack
> 0000000000900000 g     O .bss   0000000000000004 .hidden spurious_nmi_count
> 0000000000900000 l     O .bss   0000000000000000 boot_stack_end
> 0000000000900010 g     O .bss   0000000000000018 .hidden pio_ops
> 0000000000900028 g     O .bss   0000000000000008 .hidden boot_params_ptr
> 0000000000900040 l     O .bss   0000000000001000 scratch.0
> 0000000000901040 l     O .bss   0000000000010000 boot_heap
>
> My thought was that increasing the size of boot_stack should be harmless.

I'm not bothered by the change itself, I just want to make sure that
the fix is guaranteed to address the problem rather than paper over
it.

> As for the memset part, there could be an alternative way, althought it
> looks a bit ugly.
>
> memset(_bss+0x10000, 0, _ebss - _bss - 0x10000)
>

So now you are applying the memset only to part of BSS, right? How
does this help?


> However, I've updated the diff, if you think this is more like a
> workaround, please take this thread as a bug report. Thanks!
>
> Regards,
>
> Marshall Shao
>
>
> Signed-off-by: Marshall Shao <marshall.shao@...l.com>
> ---
>  arch/x86/boot/compressed/misc.c         | 4 ++--
>  arch/x86/include/asm/boot.h             | 2 +-
>  drivers/firmware/efi/libstub/x86-5lvl.c | 2 +-
>  drivers/firmware/efi/libstub/x86-stub.c | 2 --
>  include/linux/decompress/mm.h           | 2 +-
>  5 files changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> index 944454306ef4..49b68f57dd18 100644
> --- a/arch/x86/boot/compressed/misc.c
> +++ b/arch/x86/boot/compressed/misc.c
> @@ -50,8 +50,8 @@ struct boot_params *boot_params_ptr;
>
>  struct port_io_ops pio_ops;
>
> -memptr free_mem_ptr;
> -memptr free_mem_end_ptr;
> +memptr free_mem_ptr __section(".data");
> +memptr free_mem_end_ptr  __section(".data");
>  int spurious_nmi_count;
>
>  static char *vidmem;
> diff --git a/arch/x86/include/asm/boot.h b/arch/x86/include/asm/boot.h
> index 3e5b111e619d..312bc87ab027 100644
> --- a/arch/x86/include/asm/boot.h
> +++ b/arch/x86/include/asm/boot.h
> @@ -33,7 +33,7 @@
>  #endif
>
>  #ifdef CONFIG_X86_64
> -# define BOOT_STACK_SIZE       0x4000
> +# define BOOT_STACK_SIZE       0x10000
>
>  /*
>   * Used by decompressor's startup_32() to allocate page tables for identity
> diff --git a/drivers/firmware/efi/libstub/x86-5lvl.c b/drivers/firmware/efi/libstub/x86-5lvl.c
> index 77359e802181..bebae4fdfb93 100644
> --- a/drivers/firmware/efi/libstub/x86-5lvl.c
> +++ b/drivers/firmware/efi/libstub/x86-5lvl.c
> @@ -10,7 +10,7 @@
>
>  bool efi_no5lvl;
>
> -static void (*la57_toggle)(void *cr3);
> +static void (*la57_toggle)(void *cr3) __section(".data");
>
>  static const struct desc_struct gdt[] = {
>         [GDT_ENTRY_KERNEL32_CS] = GDT_ENTRY_INIT(DESC_CODE32, 0, 0xfffff),
> diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
> index 078055b054e3..ffc62af50669 100644
> --- a/drivers/firmware/efi/libstub/x86-stub.c
> +++ b/drivers/firmware/efi/libstub/x86-stub.c
> @@ -21,7 +21,6 @@
>  #include "efistub.h"
>  #include "x86-stub.h"
>
> -extern char _bss[], _ebss[];
>
>  const efi_system_table_t *efi_system_table;
>  const efi_dxe_services_table_t *efi_dxe_table;
> @@ -1059,7 +1058,6 @@ void __noreturn efi_stub_entry(efi_handle_t handle,
>  void efi_handover_entry(efi_handle_t handle, efi_system_table_t *sys_table_arg,
>                         struct boot_params *boot_params)
>  {
> -       memset(_bss, 0, _ebss - _bss);
>         efi_stub_entry(handle, sys_table_arg, boot_params);
>  }
>
> diff --git a/include/linux/decompress/mm.h b/include/linux/decompress/mm.h
> index ac862422df15..62c04691c898 100644
> --- a/include/linux/decompress/mm.h
> +++ b/include/linux/decompress/mm.h
> @@ -36,7 +36,7 @@
>  /* A trivial malloc implementation, adapted from
>   *  malloc by Hannu Savolainen 1993 and Matthias Urlichs 1994
>   */
> -STATIC_RW_DATA unsigned long malloc_ptr;
> +STATIC_RW_DATA unsigned long malloc_ptr  __section(".data");
>  STATIC_RW_DATA int malloc_count;
>
>  MALLOC_VISIBLE void *malloc(int size)
> --
> 2.34.1
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ