Message-ID: <87mtq6g1kx.fsf@oc8242746057.ibm.com>
Date:   Wed, 28 Jul 2021 12:44:14 +0200
From:   Alexander Egorenkov <egorenar@...ux.ibm.com>
To:     Luis Chamberlain <mcgrof@...nel.org>,
        Bruno Goncalves <bgoncalv@...hat.com>
Cc:     Rasmus Villemoes <linux@...musvillemoes.dk>,
        akpm@...ux-foundation.org, bp@...en8.de, corbet@....net,
        gregkh@...uxfoundation.org, jeyu@...nel.org,
        linux-kernel@...r.kernel.org,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        torvalds@...ux-foundation.org, Dave Young <dyoung@...hat.com>
Subject: Re: [PATCH v3 1/2] init/initramfs.c: do unpacking asynchronously

Luis Chamberlain <mcgrof@...nel.org> writes:

> On Tue, Jul 27, 2021 at 04:27:08PM +0200, Bruno Goncalves wrote:
>> On Tue, Jul 27, 2021 at 4:21 PM Luis Chamberlain <mcgrof@...nel.org> wrote:
>> >
>> > On Tue, Jul 27, 2021 at 04:12:54PM +0200, Bruno Goncalves wrote:
>> > > On Tue, Jul 27, 2021 at 3:55 PM Luis Chamberlain <mcgrof@...nel.org> wrote:
>> > > >
>> > > > On Tue, Jul 27, 2021 at 09:31:54AM +0200, Bruno Goncalves wrote:
>> > > > > On Mon, Jul 26, 2021 at 1:46 PM Rasmus Villemoes
>> > > > > <linux@...musvillemoes.dk> wrote:
>> > > > > >
>> > > > > > On 24/07/2021 09.46, Alexander Egorenkov wrote:
>> > > > > > > Hello,
>> > > > > > >
>> > > > > > > since e7cb072eb988 ("init/initramfs.c: do unpacking asynchronously"), we
>> > > > > > > started seeing the following problem on s390 arch regularly:
>> > > > > > >
>> > > > > > > [    5.039734] wait_for_initramfs() called before rootfs_initcalls
>> > > >
>> > > > So some context here, which might help.
>> > > >
>> > > > The initramfs_cookie is not initialized until a rootfs_initcall() is
>> > > > called, in this case populate_rootfs(). The code is small, so might
>> > > > as well include it:
>> > > >
>> > > > static int __init populate_rootfs(void)
>> > > > {
>> > > >         initramfs_cookie = async_schedule_domain(do_populate_rootfs, NULL,
>> > > >                                                  &initramfs_domain);
>> > > >         if (!initramfs_async)
>> > > >                 wait_for_initramfs();
>> > > >         return 0;
>> > > > }
>> > > > rootfs_initcall(populate_rootfs);
>> > > >
>> > > > The warning you see comes from a situation where a wait_for_initramfs()
>> > > > gets called but we haven't yet initialized initramfs_cookie.  There are
>> > > > only a few calls for wait_for_initramfs() in the kernel, and the only
>> > > > thing I can think of is that somehow s390 may rely on a usermode helper
>> > > > early on, but not every time.
>> > > >
>> > > > What umh calls does s390 issue?
>> > > >
>> > > > > Unfortunately, we haven't been able to find the root cause, but since
>> > > > > June 23rd we haven't hit this panic...
>> > > > >
>> > > > > Btw, this panic we were hitting only when testing kernels from "scsi"
>> > > > > and "block" trees.
>> > > >
>> > > > Do you use drbd maybe?
>> > >
>> > > No, the machines on which we were able to reproduce the problem don't have drbd.
>> >
>> > Are there *any* umh calls early in boot on the s390 systems? If so,
>> > chances are that is the droid you are looking for.
>> 
>> Sorry Luis,
>> 
>> I was just replying to the question by mentioning an old thread
>> (https://lore.kernel.org/lkml/CA+QYu4qxf2CYe2gC6EYnOHXPKS-+cEXL=MnUvqRFaN7W1i6ahQ@mail.gmail.com/T/#u)
>> on ppc64le.
>> 
>> Regarding "umh", nothing shows up during the ppc64le boot.
>
> There is not a single pr_*() call in kernel/umh.c, and so unless the
> respective ppc64le / s390 umh callers have a print, we won't know if you
> really did use a umh call.

I instrumented the UMH code and it seems that all wait_for_initramfs()
calls are triggered by request_module() from drbg.
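
To make the ordering concrete, here is a minimal userspace sketch of how
a module request that arrives before populate_rootfs() ends up producing
the warning quoted above. It is only a model under stated assumptions:
every model_* name is hypothetical, the unpacking is simulated
synchronously, and nothing here is the actual init/initramfs.c or
kernel/umh.c code.

```c
#include <stdbool.h>
#include <stdio.h>

/* Userspace model only; every model_* name is hypothetical. */
unsigned long model_cookie;   /* 0 means populate_rootfs() has not run yet */
bool model_unpacked;          /* set once the simulated unpack is complete */
int model_early_waits;        /* number of waits that arrived too early */

/* Stands in for the rootfs_initcall that schedules the unpacking. */
void model_populate_rootfs(void)
{
	model_cookie = 1;      /* pretend the async scheduler handed back a cookie */
	model_unpacked = true; /* the model unpacks synchronously for simplicity */
}

/* Returns true if the caller may rely on the initramfs contents. */
bool model_wait_for_initramfs(const char *requester)
{
	if (!model_cookie) {
		/* Too early: note it and return instead of blocking. */
		model_early_waits++;
		fprintf(stderr,
			"wait_for_initramfs() called before rootfs_initcalls (%s)\n",
			requester);
		return false;
	}
	return model_unpacked;
}

/* Stands in for a module request that goes through the usermode helper. */
bool model_request_module(const char *requester)
{
	return model_wait_for_initramfs(requester);
}
```

In this model, calling model_request_module("drbg") before
model_populate_rootfs() prints the warning and returns false, while the
same call afterwards returns true without a warning.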

>
> Can you reproduce the failure? How often?
>
>   Luis

The failure can be reproduced almost daily, but only on one particular
test machine, and not immediately: it shows up only after running many
tests. I have instrumented our devel kernel to find out when and how the
initramfs is being corrupted.

Still not reproducible on my own test machine. Very weird.

I'll report back as soon as we have something tangible.

Regards
Alex
