lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAA0N8zb0ddnDpZdaOeb0G2azd-jPu3qtxWPQgaWC24f6VgqY6Q@mail.gmail.com>
Date: Fri, 15 Dec 2023 13:38:34 -0800
From: Chris Koch <chrisko@...gle.com>
To: Dave Hansen <dave.hansen@...el.com>
Cc: Jonathan Corbet <corbet@....net>, Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, 
	Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>, 
	"H. Peter Anvin" <hpa@...or.com>, linux-doc@...r.kernel.org, x86@...nel.org, 
	linux-kernel@...r.kernel.org, Cloud Hsu <cloudhsu@...gle.com>
Subject: Re: [PATCH] kexec: allocate kernel above bzImage's pref_address

On Fri, Dec 15, 2023 at 1:17 PM Dave Hansen <dave.hansen@...el.com> wrote:
>
> On 12/15/23 11:05, Chris Koch wrote:
> > A relocatable kernel will relocate itself to pref_address if it is
> > loaded below pref_address. This means a booted kernel may be relocating
> > itself to an area with reserved memory on modern systems, potentially
> > clobbering arbitrary data that may be important to the system.
> >
> > This is often the case, as the default value of PHYSICAL_START is
> > 0x1000000 and kernels are typically loaded at 0x100000 or above by
> > bootloaders like iPXE or kexec. GRUB behaves like this patch does.
> >
> > Also fixes the documentation around pref_address and PHYSICAL_START to
> > be accurate.
>
> Are you reporting a bug and is this a bug fix?  It's not super clear
> from the changelog.

I reported it as a bug yesterday in
https://lkml.org/lkml/2023/12/14/1529 -- I'm happy to reword this in
some way that indicates it's a bug fix.

>
>
> > diff --git a/Documentation/arch/x86/boot.rst b/Documentation/arch/x86/boot.rst
> > index 22cc7a040dae..49bea8986620 100644
> > --- a/Documentation/arch/x86/boot.rst
> > +++ b/Documentation/arch/x86/boot.rst
> > @@ -878,7 +878,8 @@ Protocol: 2.10+
> >    address if possible.
> >
> >    A non-relocatable kernel will unconditionally move itself and to run
> > -  at this address.
> > +  at this address. A relocatable kernel will move itself to this address if it
> > +  loaded below this address.
>
> I think we should avoid saying the same things over and over again in
> different spots.
>
> Here, it doesn't really help to enumerate the different interpretations
> of 'pref_address'.  All that matters is that the bootloader can avoid
> the overhead of a later copy if it can place the kernel at
> 'pref_address'.  The exact reasons that various kernels might decide to
> relocate are unimportant here.

I think it's important documentation for bootloader authors. It's not
about avoiding overhead, it's about avoiding clobbering areas of
memory that may be reserved in e820 / EFI memory map, which the kernel
will do when it relocates itself to pref_address without checking
what's reserved and what's not. It emphasizes the importance of
choosing an address above pref_address. Happy to reword some way to
reflect that.

>
> >  ============ =======
> >  Field name:  init_size
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index 3762f41bb092..1370f43328d7 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -2109,11 +2109,11 @@ config PHYSICAL_START
> >       help
> >         This gives the physical address where the kernel is loaded.
> >
> > -       If kernel is a not relocatable (CONFIG_RELOCATABLE=n) then
> > -       bzImage will decompress itself to above physical address and
> > -       run from there. Otherwise, bzImage will run from the address where
> > -       it has been loaded by the boot loader and will ignore above physical
> > -       address.
> > +       If the kernel is not relocatable (CONFIG_RELOCATABLE=n) then bzImage
> > +       will decompress itself to above physical address and run from there.
> > +       Otherwise, bzImage will run from the address where it has been loaded
> > +       by the boot loader. The only exception is if it is loaded below the
> > +       above physical address, in which case it will relocate itself there.
>
> I kinda dislike how this is written.  It's written almost like code
> where you're spelling out the conditions.  I prefer something much
> higher-level.
>
>         This gives a minimum physical address at which the kernel can be
>         loaded.
>
>         CONFIG_RELOCATABLE=n kernels will be decompressed to and must
>         run at PHYSICAL_START exactly.
>
>         CONFIG_RELOCATABLE=y kernels can run at any address above
>         PHYSICAL_START.  If a kernel is loaded below PHYSICAL_START, it
>         will relocate itself to PHYSICAL_START.

Happy to change that, yours is better.

>
> >         In normal kdump cases one does not have to set/change this option
> >         as now bzImage can be compiled as a completely relocatable image
> > diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
> > index a61c12c01270..5dcd232d58bf 100644
> > --- a/arch/x86/kernel/kexec-bzimage64.c
> > +++ b/arch/x86/kernel/kexec-bzimage64.c
> > @@ -498,7 +498,10 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
> >       kbuf.bufsz =  kernel_len - kern16_size;
> >       kbuf.memsz = PAGE_ALIGN(header->init_size);
> >       kbuf.buf_align = header->kernel_alignment;
> > -     kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
> > +     if (header->pref_address < MIN_KERNEL_LOAD_ADDR)
> > +             kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
> > +     else
> > +             kbuf.buf_min = header->pref_address;
> >       kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
> >       ret = kexec_add_buffer(&kbuf);
> >       if (ret)
>
> Comment, please.
>
> It isn't clear from this hunk why or how this fixes the bug.  How does
> this manage to avoid clobbering reserved areas?

When allocated above pref_address, the kernel will not relocate itself
to an area that potentially overlaps with reserved memory. I'll add a
comment.

Not sure what the etiquette is on immediately sending a patch v2, or
waiting for more comments. I'll err on waiting on a couple more
comments before sending v2. Thanks for the review

Chris

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ