linux-kernel - Re: Panic on ppc64le using kernel 5.13.0-rc3

Message-ID: <45ea5042-9136-6f0c-144c-f09d05cd69ed@rasmusvillemoes.dk>
Date:   Fri, 11 Jun 2021 09:13:20 +0200
From:   Rasmus Villemoes <linux@...musvillemoes.dk>
To:     Bruno Goncalves <bgoncalv@...hat.com>
Cc:     linux-kernel@...r.kernel.org, CKI Project <cki-project@...hat.com>
Subject: Re: Panic on ppc64le using kernel 5.13.0-rc3

On 10/06/2021 17.14, Bruno Goncalves wrote:
> On Thu, Jun 10, 2021 at 3:02 PM Rasmus Villemoes
> <linux@...musvillemoes.dk> wrote:
>>
>> On 10/06/2021 13.47, Bruno Goncalves wrote:
>>> Hello,
>>>
>>> We've observed in some cases kernel panic when trying to boot on
>>> ppc64le using a kernel based on 5.13.0-rc3. We are not sure if it
>>> could be related to patch
>>> https://lore.kernel.org/lkml/20210313212528.2956377-2-linux@rasmusvillemoes.dk/
>>>
>>
>> Thanks for the report. It's possible, but I'll need some help from you
>> to get more info.
>>
>> First, can you send me the .config?
> 
> The .config is on
> https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2021/06/09/317881801/build_ppc64le_redhat:1332368174/kernel-block-ppc64le-d3f02e52f5548006f04358d407bbb7fe51255c41.config

Thanks.

>>
>>>
>>> [    1.516075] wait_for_initramfs() called before rootfs_initcalls
>>
>> This is likely because you have CONFIG_UEVENT_HELPER_PATH set to some
>> non-empty path (/sbin/hotplug perhaps). This did get reported once before:
>>
> 
> CONFIG_UEVENT_HELPER_PATH is not set. In the .config we have "#
> CONFIG_UEVENT_HELPER is not set"

OK. Then I assume some quite early initcall does a request_module() or
request_firmware() (or similar). I don't think this matters - that call
would be done before the initramfs was unpacked with or without my
patch, so it won't find anything in the empty rootfs. It's just that my
patch added a note. But just to figure out where that triggers, can you do

-               pr_warn_once("wait_for_initramfs() called before rootfs_initcalls\n");
+               WARN_ONCE(1, "wait_for_initramfs() called before rootfs_initcalls\n");

in init/initramfs.c.

>>> [    1.764430] Initramfs unpacking failed: no cpio magic
>>
>> Whoa, that's not good. Did something scramble over the initramfs memory
>> while it was being unpacked? It's been .2 seconds since the start of the
>> unpacking, so it's unlikely the very beginning of the initramfs is corrupt.
>>
>> Can you try booting with initramfs_async=0 on the command line and see
>> if the kernel still crashes?
> 
> We are not able to reproduce it 100% of the time, but sure I can try
> with this option and see what happens.
> 
> We've also seen:
> Initramfs unpacking failed: junk within compressed archive
> 
> This can be seen on the other 2 console logs that I provided the link to.

Yes, I saw that. This, and the fact that it's not 100% reproducible, is
consistent with the problem being some race that happens to write over
the compressed initramfs image - sometimes the decompressor can still
make sense of the bits, but the output is no longer a valid cpio
archive, and sometimes the decompressor itself already notices the
corruption.
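
To illustrate the first failure mode: a "newc" cpio header starts with the
six ASCII characters 070701 (or 070702 for the checksummed variant), and
the older portable ASCII format uses 070707. The following is a simplified
userspace sketch of such a magic check, not the kernel's unpacking code;
the enum and the function name classify_cpio_magic() are hypothetical and
chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of the first bytes of a cpio header. */
enum cpio_magic {
	CPIO_NEWC,       /* "070701": newc, no checksum */
	CPIO_NEWC_CRC,   /* "070702": newc with checksum */
	CPIO_OLD_ASCII,  /* "070707": old portable ASCII format */
	CPIO_NONE        /* anything else, e.g. corrupted data */
};

/* Look at the first six bytes of hdr and report which magic, if any,
 * they contain. Too-short input is treated as "no magic". */
enum cpio_magic classify_cpio_magic(const void *hdr, size_t len)
{
	assert(hdr != NULL);

	if (len < 6)
		return CPIO_NONE;
	if (memcmp(hdr, "070701", 6) == 0)
		return CPIO_NEWC;
	if (memcmp(hdr, "070702", 6) == 0)
		return CPIO_NEWC_CRC;
	if (memcmp(hdr, "070707", 6) == 0)
		return CPIO_OLD_ASCII;
	return CPIO_NONE;
}
```

A single flipped bit is enough to fail the check: '1' ^ 0x08 is '9', so a
header that should begin with 070701 would begin with 070709, and
classify_cpio_magic() returns CPIO_NONE.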

I wonder if there is some way to mark the pages occupied by the
compressed initramfs as read-only - which would hopefully trigger a nice
crash with a backtrace to whoever writes to that memory.
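
As a userspace analogy only (this says nothing about how, or whether, it
can be done for the initramfs pages in the kernel), the sketch below
write-protects a buffer with mprotect() and checks that a stray write from
a child process is stopped by SIGSEGV instead of silently corrupting the
data. The function name stray_write_is_caught() is hypothetical, and the
code assumes Linux or a similar POSIX system.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a write to a read-only page killed the writer with
 * SIGSEGV, 0 if the write went through unnoticed, -1 on setup failure. */
int stray_write_is_caught(void)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *buf;
	pid_t pid;
	int status;

	if (page <= 0)
		return -1;

	buf = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	/* Stand-in for the data we want to protect. */
	memset(buf, 0x5a, (size_t)page);

	/* From here on, any write to the page should fault. */
	if (mprotect(buf, (size_t)page, PROT_READ) != 0) {
		munmap(buf, (size_t)page);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		munmap(buf, (size_t)page);
		return -1;
	}
	if (pid == 0) {
		/* The "rogue writer": scribble over the protected page. */
		*(volatile unsigned char *)buf = 0;
		_exit(0); /* only reached if the write was not caught */
	}

	if (waitpid(pid, &status, 0) != pid) {
		munmap(buf, (size_t)page);
		return -1;
	}
	munmap(buf, (size_t)page);

	return WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV;
}
```

The fault is delivered at the offending store, which is what makes this
kind of protection useful for debugging: the crash points at the writer
rather than at whoever later reads the corrupted data.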

Rasmus