Date:   Thu, 19 Oct 2023 17:12:59 +0300
From:   Andy Shevchenko <andriy.shevchenko@...el.com>
To:     Jan Kara <jack@...e.cz>, Nathan Chancellor <nathan@...nel.org>,
        Josh Poimboeuf <jpoimboe@...nel.org>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        Kees Cook <keescook@...omium.org>
Cc:     Ferry Toth <ftoth@...londelft.nl>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-fsdevel@...r.kernel.org, linux-ext4@...r.kernel.org
Subject: Re: [GIT PULL] ext2, quota, and udf fixes for 6.6-rc1

+Cc: compiler folks (as far as my heuristics can tell who they are).
Any ideas? (See below.)

On Thu, Oct 19, 2023 at 03:01:43PM +0300, Andy Shevchenko wrote:
> On Thu, Oct 19, 2023 at 12:18:54PM +0200, Jan Kara wrote:
> > On Thu 19-10-23 11:46:58, Andy Shevchenko wrote:
> > > On Wed, Oct 18, 2023 at 08:46:13PM +0200, Jan Kara wrote:
> > > > On Tue 17-10-23 19:02:52, Andy Shevchenko wrote:
> > > > > On Tue, Oct 17, 2023 at 06:34:50PM +0300, Andy Shevchenko wrote:
> > > > > > On Tue, Oct 17, 2023 at 06:14:54PM +0300, Andy Shevchenko wrote:
> > > > > > > On Tue, Oct 17, 2023 at 05:50:10PM +0300, Andy Shevchenko wrote:
> > > > > > > > On Tue, Oct 17, 2023 at 04:42:29PM +0300, Andy Shevchenko wrote:
> > > > > > > > > On Tue, Oct 17, 2023 at 03:32:45PM +0200, Jan Kara wrote:
> > > > > > > > > > On Tue 17-10-23 14:46:20, Andy Shevchenko wrote:
> > > > > > > > > > > On Tue, Oct 17, 2023 at 01:32:53PM +0300, Andy Shevchenko wrote:
> > > > > > > > > > > > On Tue, Oct 17, 2023 at 01:29:27PM +0300, Andy Shevchenko wrote:
> > > > > > > > > > > > > On Tue, Oct 17, 2023 at 01:27:19PM +0300, Andy Shevchenko wrote:
> > > > > > > > > > > > > > On Wed, Aug 30, 2023 at 12:24:34PM +0200, Jan Kara wrote:
> > > > > > > > > > > > > > >   Hello Linus,

...

> > > > > > > > > > > > > > This merge commit (?) broke boot on Intel Merrifield.
> > > > > > > > > > > > > > > > It has earlycon enabled, and all I got is a watchdog
> > > > > > > > > > > > > > > > trigger without a bit of information printed out.
> > > > > > > > > > > 
> > > > > > > > > > > Okay, seems false positive as with different configuration it
> > > > > > > > > > > boots. It might be related to the size of the kernel itself.
> > > > > > > > > > 
> > > > > > > > > > Ah, ok, that makes some sense.
> > > > > > > > > 
> > > > > > > > > I should have mentioned that it boots with the configuration say "A",
> > > > > > > > > while not with "B", where "B" = "A" + "C" and definitely the kernel
> > > > > > > > > and initrd sizes in the "B" case are bigger.
> > > > > > > > 
> > > > > > > > If it's the size (which only grew from 13M->14M), it's weird.
> > > > > > > > 
> > > > > > > > Nevertheless, I reverted these in my local tree
> > > > > > > > 
> > > > > > > > 85515a7f0ae7 (HEAD -> topic/mrfld) Revert "defconfig: enable DEBUG_SPINLOCK"
> > > > > > > > 786e04262621 Revert "defconfig: enable DEBUG_ATOMIC_SLEEP"
> > > > > > > > 76ad0a0c3f2d Revert "defconfig: enable DEBUG_INFO"
> > > > > > > > f8090166c1be Revert "defconfig: enable DEBUG_LIST && DEBUG_OBJECTS_RCU_HEAD"
> > > > > > > > 
> > > > > > > > and it boots again! So, after this merge something affects one of these?
> > > > > > > > 
> > > > > > > > I'll continue debugging to find which one is the culprit; I just want
> > > > > > > > to share the intermediate findings.
> > > > > > > 
> > > > > > > CONFIG_DEBUG_LIST with this merge commit somehow triggers this issue.
> > > > > > > Any ideas?
> > > > > 
> > > > > > Dropping CONFIG_QUOTA* helps as well.
> > > > > 
> > > > > > More precisely, it's enough to drop either one of CONFIG_DEBUG_LIST or
> > > > > > CONFIG_QUOTA to make it boot again.
> > > > > 
> > > > > And I'm done for today.
> > > > 
> > > > OK, thanks for debugging! So can you perhaps enable CONFIG_DEBUG_LIST
> > > > permanently in your kernel config and then bisect through the quota changes
> > > > in the merge? My guess is commit dabc8b20756 ("quota: fix dqput() to follow
> > > > the guarantees dquot_srcu should provide") might be the culprit given your
> > > > testing but I fail to see how given I don't expect any quotas to be used
> > > > during boot of your platform... BTW, there's also fixup: 869b6ea160
> > > > ("quota: Fix slow quotaoff") merged last week so you could try testing a
> > > > kernel after this fix to see whether it changes anything.
> > > 
> > > It's exactly what my initial report is about, CONFIG_DEBUG_LIST was there
> > > always with CONFIG_QUOTA as well.
> > 
> > Ah, ok.
> > 
> > > Two bisections (v6.5 .. v6.6-rc1 & something...v6.6-rc6) pointed out to
> > > merge commit!
> > 
> > I thought CONFIG_DEBUG_LIST arrived through one path, some problematic
> > quota change arrived through another path and because they cause problems
> > only together, then bisecting to the merge would be exactly the outcome.
> > Alas that doesn't seem to be the case :-|.
> > 
> > > I _had_ tried to simply revert the quota changes (I haven't
> > > said about that before) and it didn't help. I'm so puzzled with all this.
> > 
> > Aha, OK. If even reverting quota changes doesn't help, then it's really
> > weird...
> 
> Let me confirm that; it might be that I forgot to update the configuration in
> between.

So, here is what I have done so far:
1) I cleaned the ccache and related build state (I had been using ccache),
   to avoid collisions;
2) I confirmed that CONFIG_DEBUG_LIST affects boot; the repo
   I'm using is published here [0][1];
3) reverted the quota patches back to before this merge ([2] - last patch),
   still boots;
4) reverted the disabling of CONFIG_DEBUG_LIST (i.e. enabled it again) [2],
   doesn't boot;
5) okay, rebased on top of the merge, i.e. 1500e7e0726e, with DEBUG_LIST [3],
   doesn't boot;
6) rebased [3] onto the merge just before it, i.e. 63580f669d7f [4],
   voilà -- it boots!
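
For anyone who wants to compare a booting and a non-booting configuration
from the branches above, here is a minimal sketch of a .config differ. It is
not something I ran for this report; the function names parse_config() and
diff_configs() are made up for illustration, and it assumes the usual
"CONFIG_FOO=y" / "# CONFIG_FOO is not set" line format. The in-tree
scripts/diffconfig helper does a similar job.

```python
import re

# Lines of the form CONFIG_FOO=value
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
# Lines of the form "# CONFIG_FOO is not set"
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return a dict of option name -> value; unset options map to 'n'."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            opts[m.group(1)] = "n"
    return opts


def diff_configs(a, b):
    """Return option -> (value in a, value in b) for options that differ.

    An option missing from one side is treated as 'n' (an assumption made
    here for simplicity).
    """
    keys = sorted(set(a) | set(b))
    return {
        k: (a.get(k, "n"), b.get(k, "n"))
        for k in keys
        if a.get(k, "n") != b.get(k, "n")
    }


if __name__ == "__main__":
    # Hypothetical two-line configs, only to show the call shape.
    good = parse_config("CONFIG_QUOTA=y\n# CONFIG_DEBUG_LIST is not set\n")
    bad = parse_config("CONFIG_QUOTA=y\nCONFIG_DEBUG_LIST=y\n")
    for name, (old, new) in diff_configs(good, bad).items():
        print(f"{name}: {old} -> {new}")
```

Feeding it the contents of two real .config files (read with open(...).read())
should list only the options that changed between the two builds.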

And (tadaam!) I had been meaning for a while to replace GCC with LLVM
(at least for this test), and when built with LLVM, [0] boots as well!

So it seems like this merge triggered a bug in GCC... And it's _the_ merge
commit itself, which is so-o weird!

$ gcc --version
gcc (Debian 13.2.0-4) 13.2.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[0]: https://bitbucket.org/andy-shev/linux/src/test-mrfld-dbg-list/
[1]: https://bitbucket.org/andy-shev/linux/src/test-mrfld/
[2]: https://bitbucket.org/andy-shev/linux/src/test-mrfld-no-quota-dbg-list/
[3]: https://bitbucket.org/andy-shev/linux/src/test-mrfld-after-merge-dbg-list/
[4]: https://bitbucket.org/andy-shev/linux/src/test-mrfld-before-merge/

-- 
With Best Regards,
Andy Shevchenko

