Message-ID: <CALu+AoSKpgbbKmsL8iuWpQB2ANqnhhfXR5pN5m0EsKZeFUBPkw@mail.gmail.com>
Date: Mon, 11 Sep 2023 22:56:36 +0800
From: Dave Young <dyoung@...hat.com>
To: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc: Ard Biesheuvel <ardb@...gle.com>,
Kees Cook <keescook@...omium.org>,
Aaron Lu <aaron.lu@...el.com>,
Bagas Sanjaya <bagasdotme@...il.com>,
Borislav Petkov <bp@...en8.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux Regressions <regressions@...ts.linux.dev>,
kexec@...ts.infradead.org
Subject: Re: kexec reboot failed due to commit 75d090fd167ac

Adding the kexec list to Cc.

On Sat, 9 Sept 2023 at 19:34, Kirill A. Shutemov
<kirill.shutemov@...ux.intel.com> wrote:
>
> On Fri, Sep 08, 2023 at 06:17:53PM +0200, Ard Biesheuvel wrote:
> > On Fri, Sep 8, 2023 at 5:58 PM Kees Cook <keescook@...omium.org> wrote:
> > >
> > > On Fri, Sep 08, 2023 at 03:32:33PM +0300, Kirill A. Shutemov wrote:
> > > > On Fri, Sep 08, 2023 at 02:02:30PM +0800, Aaron Lu wrote:
> > > > > On Thu, Sep 07, 2023 at 04:14:09PM +0300, Kirill A. Shutemov wrote:
> > > > > > On Tue, Aug 29, 2023 at 10:04:51PM +0800, Aaron Lu wrote:
> > > > > > > > Could you show dmesg of the first kernel before kexec?
> > > > > > >
> > > > > > > Attached.
> > > > > > >
> > > > > > > BTW, kexec is invoked like this:
> > > > > > > kver=6.4.0-rc5-00009-g75d090fd167a
> > > > > > > kdir=$HOME/kernels/$kver
> > > > > > > sudo kexec -l $kdir/vmlinuz-$kver --initrd=$kdir/initramfs-$kver.img --append="root=UUID=4381321e-e01e-455a-9d46-5e8c4c5b2d02 ro net.ifnames=0 acpi_rsdp=0x728e8014 no_hash_pointers sched_verbose selinux=0"
> > > > > >
> > > > > > I don't understand why it happens.
> > > > > >
> > > > > > Could you check if this patch changes anything:
> > > > > >
> > > > > > diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> > > > > > index 94b7abcf624b..172c476ff6f3 100644
> > > > > > --- a/arch/x86/boot/compressed/misc.c
> > > > > > +++ b/arch/x86/boot/compressed/misc.c
> > > > > > @@ -456,10 +456,12 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
> > > > > >
> > > > > > >  	debug_putstr("\nDecompressing Linux... ");
> > > > > > >
> > > > > > > +#if 0
> > > > > > >  	if (init_unaccepted_memory()) {
> > > > > > >  		debug_putstr("Accepting memory... ");
> > > > > > >  		accept_memory(__pa(output), __pa(output) + needed_size);
> > > > > > >  	}
> > > > > > > +#endif
> > > > > > >
> > > > > > >  	__decompress(input_data, input_len, NULL, NULL, output, output_len,
> > > > > > >  			NULL, error);
> > > > > > --
> > > > >
> > > > > It solved the problem.
> > > >
> > > > Looks like increasing BOOT_INIT_PGT_SIZE fixes the issue. I don't yet
> > > > understand why and how unaccepted memory is involved. I will look more
> > > > into it.
> > > >
> > > > Enabling CONFIG_RANDOMIZE_BASE also makes the issue go away.
> > >
> > > Is this perhaps just luck? I.e. does it ever break on, say, 1000 boot
> > > attempts? (i.e. maybe some position is bad and KASLR happens to usually
> > > avoid it?)
>
> Yes, it can be luck.
>
> > > > Kees, maybe you have a clue?
> > >
> > > The only thing I can think of is that something isn't being counted
> > > correctly due to the size of code, and it just happens that this commit
> > > makes the code large enough to exceed some set of mappings?
> > >
> > > >
> > > > diff --git a/arch/x86/include/asm/boot.h b/arch/x86/include/asm/boot.h
> > > > index 9191280d9ea3..26ccce41d781 100644
> > > > --- a/arch/x86/include/asm/boot.h
> > > > +++ b/arch/x86/include/asm/boot.h
> > > > @@ -40,7 +40,7 @@
> > > > #ifdef CONFIG_X86_64
> > > > # define BOOT_STACK_SIZE 0x4000
> > > >
> > > > -# define BOOT_INIT_PGT_SIZE (6*4096)
> > > > +# define BOOT_INIT_PGT_SIZE (7*4096)
> > >
> > > That's why this might be working, for example? How large is the boot
> > > image before/after the commit, etc?
> > >
> >
> > Not sure why these changes would make a difference here, but choking
> > on accept_memory() on a non-TDX system suggests that
> > init_unaccepted_memory() is poking into unmapped memory before it even
> > decides that the unaccepted memory does not exist.
> >
> > init_unaccepted_memory() has
> >
> >         ret = efi_get_conf_table(boot_params, &cfg_table_pa, &cfg_table_len);
> >         if (ret) {
> >                 warn("EFI config table not found.");
> >                 return false;
> >         }
> >
> > which looks for <guid, phys_addr> tuples in an array pointed to by the
> > EFI system table, and if either of those is not mapped, things can be
> > expected to explode.
> >
> > The only odd thing there is that this code is invoked after setting up
> > the 'demand paging' logic in the decompressor.
> >
> > If you haven't yet, could you please retry the kexec boot with
> > earlyprintk=tty<insert your UART params here>?
>
> early console in extract_kernel
> input_data: 0x000000807eb433a8
> input_len: 0x0000000000d26271
> output: 0x000000807b000000
> output_len: 0x0000000004800c10
> kernel_total_size: 0x0000000003e28000
> needed_size: 0x0000000004a00000
> trampoline_32bit: 0x000000000009d000
>
> Decompressing Linux... out of pgt_buf in arch/x86/boot/compressed/ident_map_64.c!?
> pages->pgt_buf_offset: 0x0000000000006000
> pages->pgt_buf_size: 0x0000000000006000
>
>
> Error: kernel_ident_mapping_init() failed
>
> It crashes on #PF due to stbl->nr_tables dereference in
> efi_get_conf_table() called from init_unaccepted_memory().
>
> I don't see anything special about stbl location: 0x775d6018.
>
> One other bit of information: disabling 5-level paging also helps the
> issue.
>
> I will debug further.
>
> --
> Kiryl Shutsemau / Kirill A. Shutemov
>
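
For reference, the "out of pgt_buf" message in the log above is what one would expect from a fixed-size page-table pool that has been used up: pgt_buf_offset has reached pgt_buf_size at 0x6000, which is 6*4096, the current BOOT_INIT_PGT_SIZE. Below is a minimal user-space sketch of that behaviour. It is only an illustration under stated assumptions: struct pgt_pool, pool_alloc_page() and pool_drain() are hypothetical names made up for this example, not the decompressor's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096UL

/* Hypothetical stand-in for the decompressor's page-table pool. */
struct pgt_pool {
	unsigned char *buf;	/* backing storage for page tables */
	unsigned long size;	/* total bytes, e.g. 6 * PAGE_SIZE */
	unsigned long offset;	/* bytes handed out so far */
};

/* Hand out one zeroed page, or NULL once offset has reached size. */
static void *pool_alloc_page(struct pgt_pool *pool)
{
	void *page;

	/* Sanity check: the pool is assumed to be a whole number of pages. */
	assert(pool->size % PAGE_SIZE == 0);

	if (pool->offset >= pool->size)
		return NULL;	/* the "out of pgt_buf" case from the log */

	page = pool->buf + pool->offset;
	memset(page, 0, PAGE_SIZE);
	pool->offset += PAGE_SIZE;
	return page;
}

/* Allocate until the pool fails; return how many pages were handed out. */
static unsigned long pool_drain(struct pgt_pool *pool)
{
	unsigned long n = 0;

	while (pool_alloc_page(pool))
		n++;
	return n;
}
```

With size set to 6 * PAGE_SIZE the seventh pool_alloc_page() call returns NULL with offset at 0x6000, matching the values printed above. Raising size to 7 * PAGE_SIZE lets one more allocation succeed, which would be consistent with the BOOT_INIT_PGT_SIZE bump hiding the problem. It does not explain why the extra page table is needed in the first place.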