linux-kernel - Re: [PATCH qemu] x86: don't let decompressed kernel image clobber setup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHmME9oPUJemVRvO3HX0q4BJGTFuzbLYANeizuRcNq2=Ykk1Gg@mail.gmail.com>
Date:   Fri, 30 Dec 2022 18:07:24 +0100
From:   "Jason A. Donenfeld" <Jason@...c4.com>
To:     Borislav Petkov <bp@...en8.de>
Cc:     "H. Peter Anvin" <hpa@...or.com>, pbonzini@...hat.com,
        ebiggers@...nel.org, x86@...nel.org, linux-kernel@...r.kernel.org,
        qemu-devel@...gnu.org, ardb@...nel.org, kraxel@...hat.com,
        philmd@...aro.org
Subject: Re: [PATCH qemu] x86: don't let decompressed kernel image clobber setup_data

On Fri, Dec 30, 2022 at 6:01 PM Borislav Petkov <bp@...en8.de> wrote:
>
> On Fri, Dec 30, 2022 at 04:54:27PM +0100, Jason A. Donenfeld wrote:
> > > Right, with CONFIG_X86_VERBOSE_BOOTUP=y in a guest here, it says:
> > >
> > > early console in extract_kernel
> > > input_data: 0x000000000be073a8
> > > input_len: 0x00000000008cfc43
> > > output: 0x0000000001000000
> > > output_len: 0x000000000b600a98
> > > kernel_total_size: 0x000000000ac26000
> > > needed_size: 0x000000000b800000
> > > trampoline_32bit: 0x000000000009d000
> > >
> > > so that's a ~9M kernel which gets decompressed at 0x1000000 and the
> > > output len is, what, ~180M which looks like plenty to me...
> >
> > I think you might have misunderstood the thread.
>
> Maybe...
>
> I've been trying to make sense of all the decompression dancing we're doing and
> the diagrams you've drawn and there's a comment over extract_kernel() which
> basically says that the compressed image is at the back of the memory buffer
>
> input_data: 0x000000000be073a8
>
> and we decompress to a smaller address
>
> output: 0x0000000001000000
>
> And, it *looks* like setup_data is at an address somewhere >= 0x1000000 so when
> we start decompressing, we overwrite it.
>
> I guess extract_kernel() could also dump the setup_data addresses so that we can
> verify that...
>
> > First, to reproduce the bug that this patch fixes, you need a kernel with a
> > compressed size of around 16 megs, not 9.
>
> Not only that - you need setup_data to be placed somewhere just so that it gets
> overwritten during decompression. Also, see below.
>
> > Secondly, that crash is well understood and doesn't need to be reproduced;
>
> Is it?
>
> Where do you get the 0x100000 as the starting address of the kernel image?
>
> Because input_data above says that the input address is somewhere higher...

Look closer at the boot process. The compressed image is initially at
0x100000, but it gets relocated to a safer area at the end of
startup_64:

/*
 * Copy the compressed kernel to the end of our buffer
 * where decompression in place becomes safe.
 */
        pushq   %rsi
        leaq    (_bss-8)(%rip), %rsi
        leaq    rva(_bss-8)(%rbx), %rdi
        movl    $(_bss - startup_32), %ecx
        shrl    $3, %ecx
        std
        rep     movsq
        cld
        popq    %rsi

        /*
         * The GDT may get overwritten either during the copy we just did or
         * during extract_kernel below. To avoid any issues, repoint the GDTR
         * to the new copy of the GDT.
         */
        leaq    rva(gdt64)(%rbx), %rax
        leaq    rva(gdt)(%rbx), %rdx
        movq    %rdx, 2(%rax)
        lgdt    (%rax)

/*
 * Jump to the relocated address.
 */
        leaq    rva(.Lrelocated)(%rbx), %rax
        jmp     *%rax

And that address winds up being calculated with a combination of the
image size and the init_size param, so it's safe. Decompression won't
overwrite the compressed kernel.

HOWEVER, qemu currently appends setup_data to the end of the
compressed kernel image, and this part isn't moved, and setup_data
links aren't walked/relocated. So that means the original address
remains, of 0x100000.

Jason