Message-ID: <CALvjV28xYqHJBkw0NFuCWTVZx=2A9E=BuRH79PAYUb=niwFRVQ@mail.gmail.com>
Date: Tue, 10 Sep 2024 12:53:46 -0700
From: Hugues Bruant <hugues.bruant@...il.com>
To: Borislav Petkov <bp@...en8.de>
Cc: stable@...r.kernel.org, regressions@...ts.linux.dev, 
	linux-kernel@...r.kernel.org, Fenghua Yu <fenghua.yu@...el.com>, 
	Reinette Chatre <reinette.chatre@...el.com>, Tony Luck <tony.luck@...el.com>, 
	Tzung-Bi Shih <tzungbi@...nel.org>, Brian Norris <briannorris@...omium.org>, 
	Julius Werner <jwerner@...omium.org>, chrome-platform@...ts.linux.dev, 
	Jani Nikula <jani.nikula@...ux.intel.com>, 
	Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>, Rodrigo Vivi <rodrigo.vivi@...el.com>, 
	Tvrtko Ursulin <tursulin@...ulin.net>, intel-gfx@...ts.freedesktop.org, 
	dri-devel@...ts.freedesktop.org
Subject: Re: [REGRESSION] soft lockup on boot starting with kernel 6.10 /
 commit 5186ba33234c9a90833f7c93ce7de80e25fac6f5

On Mon, Sep 9, 2024 at 1:02 AM Borislav Petkov <bp@...en8.de> wrote:
>
> On Sun, Sep 08, 2024 at 11:53:56PM -0700, Hugues Bruant wrote:
> > Hi,
> >
> > I have discovered a 100% reliable soft lockup on boot on my laptop:
> > Purism Librem 14, Intel Core i7-10710U, 48Gb RAM, Samsung Evo Plus 970
> > SSD, CoreBoot BIOS, grub bootloader, Arch Linux.
> >
> > The last working release is kernel 6.9.10, every release from 6.10
> > onwards reliably exhibit the issue, which, based on journalctl logs,
> > seems to be triggered somewhere in systemd-udev:
> > https://gitlab.archlinux.org/-/project/42594/uploads/04583baf22189a0a8bb2f8773096e013/lockup.log
> >
> > Bisect points to commit 5186ba33234c9a90833f7c93ce7de80e25fac6f5
>
> That's a merge commit. Meaning, the bisection likely went into the wrong
> direction.
I double-checked and the bisection results seem quite consistent.
While merge commits are unlikely to be correct bisection results,
they're entirely possible if the bug is triggered by an unexpected
interaction between multiple unrelated commits.

> However, you have out-of-tree modules. Try reproducing it without them.
That was the first suggestion on the Arch bug tracker. The whole
bisection was done without out-of-tree modules.

Now, for the fun part: the kind soul on the Arch bug tracker who
provided me with the kernel images for bisection built a patched
6.10.9 at my request, reverting just Tony's RDT changes that were
flagged by the bisection: bd4955d4bc2182ccb660c9c30a4dd7f36feaf943 and
e3ca96e479c91d6ee657d3caa5092a6a3a620f9f.

That patch brings the boot success rate on my machine from 0/10 up to
4/10. Even though this code is not supposed to be used, its presence
clearly has an impact!
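
As a rough sanity check on those numbers, here is a minimal sketch of a
one-sided Fisher exact test on the two boot counts. It assumes the ten
boots in each group are independent trials, which is not established
here; the function and variable names are purely illustrative.

```python
from math import comb

def fisher_one_sided(patched_ok, patched_n, stock_ok, stock_n):
    """P(patched successes >= observed) with all row/column totals fixed.

    Hypothetical helper: under the null hypothesis that the patch makes
    no difference, the successful boots are spread at random over both
    groups (hypergeometric distribution).
    """
    total_ok = patched_ok + stock_ok
    total_n = patched_n + stock_n
    upper = min(patched_n, total_ok)
    tail = sum(
        comb(patched_n, k) * comb(stock_n, total_ok - k)
        for k in range(patched_ok, upper + 1)
    )
    return tail / comb(total_n, total_ok)

# Counts reported above: 4/10 successful boots with the reverts applied,
# 0/10 on the unpatched kernel.
p_value = fisher_one_sided(patched_ok=4, patched_n=10, stock_ok=0, stock_n=10)
print(f"one-sided p-value: {p_value:.4f}")
```

With these counts the result is about 0.043, so the improvement is
unlikely to be pure chance, but ten boots per kernel is still a small
sample.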

The framebuffer fix seems to also have a positive (though smaller,
closer to 20%) impact on boot success rate, so I'm planning to test
the combination of both as a next step.

Some extra boot logs are attached.

View attachment "lockup-patch.log" of type "text/x-log" (107430 bytes)

View attachment "no-lockup-patch.log" of type "text/x-log" (116550 bytes)

View attachment "no-lockup-patch-2.log" of type "text/x-log" (115849 bytes)

View attachment "no-lockup-patch-1.log" of type "text/x-log" (115997 bytes)

View attachment "lockup-patch-1.log" of type "text/x-log" (99947 bytes)

View attachment "lockup-patch-4.log" of type "text/x-log" (70148 bytes)

View attachment "lockup-patch-3.log" of type "text/x-log" (70878 bytes)
