linux-kernel - Re: [PATCH 3/5] arm: decompressor: define a new zImage tag

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <dleftjmu5lqw72.fsf%l.stelmach@samsung.com>
Date:   Tue, 02 Jun 2020 18:17:05 +0200
From:   Lukasz Stelmach <l.stelmach@...sung.com>
To:     Russell King - ARM Linux admin <linux@...linux.org.uk>
Cc:     Masahiro Yamada <masahiroy@...nel.org>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Enrico Weigelt <info@...ux.net>,
        Kees Cook <keescook@...omium.org>,
        Ingo Molnar <mingo@...nel.org>,
        Ben Dooks <ben-linux@...ff.org>,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        AKASHI Takahiro <takahiro.akashi@...aro.org>,
        Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>,
        Marek Szyprowski <m.szyprowski@...sung.com>
Subject: Re: [PATCH 3/5] arm: decompressor: define a new zImage tag

It was <2020-06-01 pon 21:41>, when Russell King - ARM Linux admin wrote:
 > On Mon, Jun 01, 2020 at 10:27:45PM +0200, Lukasz Stelmach wrote:
>> It was <2020-06-01 pon 19:25>, when Russell King - ARM Linux admin wrote:
>>> On Mon, Jun 01, 2020 at 06:19:52PM +0200, Lukasz Stelmach wrote:
>>>> It was <2020-06-01 pon 15:55>, when Russell King - ARM Linux admin wrote:
>>>>> On Mon, Jun 01, 2020 at 04:27:52PM +0200, Łukasz Stelmach wrote:
>>>>>> Add DCSZ tag which holds dynamic memory (stack, bss, malloc pool)
>>>>>> requirements of the decompressor code.
>>>>>
>>>>> Why do we need to know the stack and BSS size, when the userspace
>>>>> kexec tool doesn't need to know this to perform the same function.
>>>> 
>>>> 
>>>> kexec-tools load zImage as low in DRAM as possible and rely on two
>>>> assumptions:
>>>> 
>>>>     + the zImage will copy itself to make enough room for the kernel,
>>>>     + sizeof(zImage+mem) < sizeof(kernel+mem), which is true because
>>>>       of compression.
>>>> 
>>>>        DRAM start
>>>>        + 0x8000
>>>> 
>>>> zImage    |-----------|-----|-------|
>>>>             text+data   bss   stack 
>>>> 
>>>>                  text+data           bss   
>>>> kernel    |---------------------|-------------------|
>>>> 
>>>> 
>>>> initrd                                              |-initrd-|-dtb-|
>>>
>>> This is actually incorrect, because the decompressor will self-
>>> relocate itself to avoid the area that it is going to decompress
>>> into.
>> 
>> I described the state right after kexec(8) invocation.
>
> Actually, you haven't, because this is _not_ how kexec(8) lays it
> out, as I attempted to detail further down in my reply.


Let me try to describe how I understand the code in kexec-tools
(commit 74c7c369).

--8<---------------cut here---------------start------------->8---
int zImage_arm_load(…, const char *buf, off_t len, …)
// buf - zImage
// len - size of zImage
        
unsigned int extra_size = 0x8000; /* TEXT_OFFSET */

kernel_mem_size = len + 4;

// locate a hole to fit zImage + 32kB as low as possible,
base = locate_hole(info, len + extra_size, 0, 0, ULONG_MAX, INT_MAX);

kernel_base = base + extra_size;

add_segment(info, buf, len, kernel_base, kernel_mem_size);
--8<---------------cut here---------------end--------------->8---

Therefore, zImage is loaded low and always requires relocation.
        
ram     |--------------------------------------------------------------
zImage     |----k_m_s----|
           ^
           |
           kernel_base — TEXT_OFFSET or higher

Next goes initrd

--8<---------------cut here---------------start------------->8---
kexec_arm_image_size = len * 5; // or passed on command line

// if the tag exists
kexec_arm_image_size = max(edata_size + bss_size,
                           edata_size + len); // len - zImage size + 64 kB 

initrd_base = kernel_base + _ALIGN(kexec_arm_image_size, page_size);

add_segment(info, ramdisk_buf, initrd_size, initrd_base, initrd_size);
--8<---------------cut here---------------end--------------->8---

above whatever is bigger (kernel + kernel bss) OR (kernel + zImage + zImage mem).


ram     |---------------------------------------------------------------
zImage     |----k_m_s----|                   Where kexec loads zImage @kernel_base

           |.............len * 5....................| Fallback
kernel     |.....edata.....|...bss...|       These are just calculations
zImage                     |.....len+....|   zImage will copy itself here WHEN it runs

initrd                                   |--initrd_size--|
dtb                                      ^               |---|
                                         |
                                 initrd_base

DTB, of course, goes next

    dtb_offset = initrd_base + initrd_size + page_size;


Stuff marked with "-" is actually loaded, "." are just calculations to
establish initrd_base.

>>> So, while the decompressor runs, in the above situation it
>>> ends up as:
>>>
>>>
>>> ram    |------------------------------------------------------...
>>>                  text+data           bss   
>>> kernel    |---------------------|-------------------|
>>> zImage                          |-----------|-----|-------|
>>>                                   text+data   bss   stack+malloc
>
> Note here - if the initrd was placed as _you_ describe at the end
> of where the zImage ends up including its BSS, it would be
> corrupted by the stack and malloc space of the decompressor while
> running.  Ergo, your description of how kexec(8) lays stuff out
> is incorrect.

Is my analysis above accurate now? Do I understand this?

As you noted, my intention is to load zImage after edata (dotted len+
above).

>>>>>> +struct arm_zimage_tag_dc {
>>>>>> +	struct tag_header hdr;
>>>>>> +	union {
>>>>>> +#define ZIMAGE_TAG_DECOMP_SIZE ARM_ZIMAGE_MAGIC4
>>>>>> +		struct zimage_decomp_size {
>>>>>> +			__le32 bss_size;
>>>>>> +			__le32 stack_size;
>>>>>> +			__le32 malloc_size;
>>>>>> +		} decomp_size;
>>>
>>> You certainly don't need to know all this.  All you need to know is
>>> how much space the decompressor requires after the end of the image,
>>> encompassing the BSS size, stack size and malloc size, which is one
>>> value.
>> 
>> I agree. However, since we are not fighting here for every single byte,
>> I'd rather add them as separate values and make the tag, even if only
>> slightly, more future-proof.
>
> It doesn't make it more future-proof.  What happens if we add something
> else, do we need to list it separately, and then change the kernel to
> accept the new value and maybe also kexec(8)?  Or do we just say "the
> decompressor needs X many bytes after the image" and be done with it?
> The latter sounds way more future-proof to me.

You are right. I changed it to a single value. Done.

-- 
Łukasz Stelmach
Samsung R&D Institute Poland
Samsung Electronics

Download attachment "signature.asc" of type "application/pgp-signature" (488 bytes)