[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <AF921575-0968-434A-8B46-095B78C209C1@zytor.com>
Date: Wed, 28 Dec 2022 23:31:34 -0800
From: "H. Peter Anvin" <hpa@...or.com>
To: "Jason A. Donenfeld" <Jason@...c4.com>
CC: pbonzini@...hat.com, ebiggers@...nel.org, x86@...nel.org,
linux-kernel@...r.kernel.org, qemu-devel@...gnu.org,
ardb@...nel.org, kraxel@...hat.com, bp@...en8.de, philmd@...aro.org
Subject: Re: [PATCH qemu] x86: don't let decompressed kernel image clobber setup_data
On December 28, 2022 6:31:07 PM PST, "Jason A. Donenfeld" <Jason@...c4.com> wrote:
>Hi,
>
>Read this message in a fixed width text editor with a lot of columns.
>
>On Wed, Dec 28, 2022 at 03:58:12PM -0800, H. Peter Anvin wrote:
>> Glad you asked.
>>
>> So the kernel load addresses are parameterized in the kernel image
>> setup header. One of the things that are so parameterized are the size
>> and possible realignment of the kernel image in memory.
>>
>> I'm very confused where you are getting the 64 MB number from. There
>> should not be any such limitation.
>
>Currently, QEMU appends it to the kernel image, not to the initramfs as
>you suggest below. So, that winds up looking, currently, like:
>
> kernel image setup_data
> |--------------------------||----------------|
>0x100000 0x100000+l1 0x100000+l1+l2
>
>The problem is that this decompresses to 0x1000000 (one more zero). So
>if l1 is > (0x1000000-0x100000), then this winds up looking like:
>
> kernel image setup_data
> |--------------------------||----------------|
>0x100000 0x100000+l1 0x100000+l1+l2
>
> d e c o m p r e s s e d k e r n e l
> |-------------------------------------------------------------|
> 0x1000000 0x1000000+l3
>
>The decompressed kernel seemingly overwriting the compressed kernel
>image isn't a problem, because that gets relocated to a higher address
>early on in the boot process. setup_data, however, stays in the same
>place, since those links are self referential and nothing fixes them up.
>So the decompressed kernel clobbers it.
>
>The solution in this commit adds a bunch of padding between the kernel
>image and setup_data to avoid this. That looks like this:
>
> kernel image padding setup_data
> |--------------------------||---------------------------------------------------||----------------|
>0x100000 0x100000+l1 0x1000000+l3 0x1000000+l3+l2
>
> d e c o m p r e s s e d k e r n e l
> |-------------------------------------------------------------|
> 0x1000000 0x1000000+l3
>
>This way, the decompressed kernel doesn't clobber setup_data.
>
>The problem is that if 0x1000000+l3-0x100000 is around 62 megabytes,
>then the bootloader crashes when trying to dereference setup_data's
>->len param at the end of initialize_identity_maps() in ident_map_64.c.
>I don't know why it does this. If I could remove the 62 megabyte
>restriction, then I could keep with this technique and all would be
>well.
>
>> In general, setup_data should be able to go anywhere the initrd can
>> go, and so is subject to the same address cap (896 MB for old kernels,
>> 4 GB on newer ones; this address too is enumerated in the header.)
>
>It would be theoretically possible to attach it to the initrd image
>instead of to the kernel image. As a last resort, I guess I can look
>into doing that. However, that's going to require some serious rework
>and plumbing of a lot of different components. So if I can make it work
>as is, that'd be ideal. However, I need to figure out this weird 62 meg
>limitation.
>
>Any ideas on that?
>
>Jason
As far as a crash... that sounds like a big and a pretty serious one at that.
Could you let me know what kernel you are using and how *exactly* you are booting it?
Powered by blists - more mailing lists