Message-ID: <b74ike3nxjmhjsfdwq4b3732sgyojdfy5mwt2hflkc7aaqalnf@iclsvge73ibh>
Date: Wed, 15 Oct 2025 14:33:54 -0400
From: "Liam R. Howlett" <Liam.Howlett@...cle.com>
To: Guenter Roeck <linux@...ck-us.net>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Feng Chen <feng.chen@...ogic.com>,
        Matthew Wilcox <willy@...radead.org>, Jeff Layton <jlayton@...nel.org>,
        Michal Swiatkowski <michal.swiatkowski@...ux.intel.com>,
        Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
        Tao Ren <rentao.bupt@...il.com>,
        Lukas Bulwahn <lukas.bulwahn@...hat.com>,
        Alexei Starovoitov <ast@...nel.org>, Vlastimil Babka <vbabka@...e.cz>
Subject: Re: Linux 6.18-rc1

* Guenter Roeck <linux@...ck-us.net> [251015 13:48]:
> On 10/15/25 10:28, Liam R. Howlett wrote:
> > + Cc Vlastimil, as you are indicating the slab merge.
> > 
> > 
> > * Guenter Roeck <linux@...ck-us.net> [251015 06:02]:
> > > On Mon, Oct 13, 2025 at 09:46:44PM -0700, Guenter Roeck wrote:
> > > > On Mon, Oct 13, 2025 at 10:08:26AM -0700, Guenter Roeck wrote:
> > > > > On Sun, Oct 12, 2025 at 02:04:32PM -0700, Linus Torvalds wrote:
> > > > > > Two weeks have passed, and 6.18-rc1 has been tagged and pushed out.
> > > > > > 
> > > > > > Things look fairly normal: size-wise this is pretty much right in the
> > > > > > middle of the pack, and nothing particular stands out in the shortlog
> > > > > > of merges this merge window appended below. About half the diff is
> > > > > > > drivers, with the rest being all over: vfs and filesystems, arch
> > > > > > updates (although much of that is actually devicetree stuff, so it's
> > > > > > arguably more driver-related), tooling, rust support etc etc.
> > > > > > 
> > > > > > This was one of the good merge windows where I didn't end up having to
> > > > > > > bisect any particular problem on any of the machines I was testing.
> > > > > > Let's hope that success mostly translates to the bigger picture too.
> > > > > > 
> > > > > 
> > > > > > Test results don't look that good, unfortunately:
> > > > > 
> > > > ...
> > > > > Qemu test results:
> > > > > 	total: 609 pass: 581 fail: 28
> > > > > Failed tests:
> > > ...
> > > > > 	sheb:rts7751r2dplus_defconfig:initrd
> > > > > 	sheb:rts7751r2dplus_defconfig:ata:ext2
> > > > > 	sheb:rts7751r2dplus_defconfig:usb:ext2
> > > > > Unit test results:
> > > > > 	pass: 655208 fail: 0
> > > > > 
> > > > 
> > > 
> > > Update on the sheb (SH4 big endian) failures below.
> > 
> > What is the qemu line you use and the memory configuration of that qemu,
> > or is this real hardware?
> > 
> qemu. I tried 6.2.0, 10.0.5, and 10.1.1. Sample command line:
> 
> qemu-system-sh4eb -M r2d -kernel arch/sh/boot/zImage \
> 	-append "console=ttySC1,115200 noiotrap" \
> 	-serial null -serial stdio -monitor null -nographic -no-reboot
> 
> initrd or root file system doesn't really matter because qemu exits
> almost immediately.
> 
> > Are there sh4 configs that pass?
> > 
> 
> little endian - all
> big endian - none

Do other big endian targets work?

> 
> > It's a bit odd that it says "fail: 0" here. Is this message about
> > something else?
> 
> These are unit (KUNIT) test results. All 655208 executed unit tests passed.
> Unit tests not executed because the image crashed or because qemu died are not
> counted as failed.

Thanks.

...

> 
> I checked out a test branch at 24d9e8b3c9c, rebased it on top of
> 24d9e8b3c9c8a6~1 (07fdad3a93756b8), and ran another bisect. Results:
> 
> # bad: [c5e19dc4c1db098456ee6a924e276a26e692f26c] slab: Introduce kmalloc_nolock() and kfree_nolock().
> # good: [07fdad3a93756b872da7b53647715c48d0f4a2d0] Merge tag 'net-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
> git bisect start 'HEAD' '07fdad3a93756b872da7b53647715c48d0f4a2d0'
> # good: [10f17a5a3befa328bd9a78ca6b799dd1933f108b] maple_tree: remove redundant __GFP_NOWARN
> git bisect good 10f17a5a3befa328bd9a78ca6b799dd1933f108b
> # good: [f97515baad5efa6e1963abd37188fad42515edc8] maple_tree: Replace mt_free_one() with kfree()
> git bisect good f97515baad5efa6e1963abd37188fad42515edc8
> # bad: [4df642aa2128c2c346f9c945bddbae37c59bba82] locking/local_lock: Introduce local_lock_is_locked().
> git bisect bad 4df642aa2128c2c346f9c945bddbae37c59bba82
> # good: [a20be9b8014abfe68acc2efd81bfb5d2dd4eaf34] maple_tree: Prefilled sheaf conversion and testing
> git bisect good a20be9b8014abfe68acc2efd81bfb5d2dd4eaf34
> # bad: [40696586bc008ad34db8135c35ec4b459691af3c] maple_tree: Convert forking to use the sheaf interface
> git bisect bad 40696586bc008ad34db8135c35ec4b459691af3c
> # good: [8387347ae261c5e74e9db3f73b91d47f11f8d6f8] maple_tree: Add single node allocation support to maple state
> git bisect good 8387347ae261c5e74e9db3f73b91d47f11f8d6f8
> # first bad commit: [40696586bc008ad34db8135c35ec4b459691af3c] maple_tree: Convert forking to use the sheaf interface
> 
> Reverting just 40696586bc008 in that branch didn't help. So I reverted "slab: Introduce
> kmalloc_nolock() and kfree_nolock()" in that branch as well, and the image started
> passing.

This does not make sense to me.  If reverting the first bad commit does
not make it work, doesn't that mean the failure is not caused by that
patch..?

I'm not saying this patch is fine, but surely it indicates a previous
problem and potentially (most likely?) an intermittent failure?

Is the failure consistently reproduced?
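
One rough way to check that would be to rerun the same boot a number of
times and count the failures.  The sketch below is only an illustration:
count_failures is a made-up helper name, not part of any existing test
harness, and it assumes the wrapped command returns non-zero when the
boot fails.  Whether qemu's exit status reflects the crash is not
established here, so a wrapper that greps the console output may be
needed instead.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and print how many runs
# exited with a non-zero status.
# Usage: count_failures <runs> <command> [args...]
count_failures() {
	runs=$1
	shift
	fails=0
	i=0
	while [ "$i" -lt "$runs" ]; do
		# Discard the command's output; only the exit status matters here.
		"$@" >/dev/null 2>&1 || fails=$((fails + 1))
		i=$((i + 1))
	done
	echo "$fails"
}
```

For example, "count_failures 20 ./boot-once.sh" would print the number of
failed boots out of 20, where boot-once.sh is a hypothetical wrapper
around the qemu-system-sh4eb command line quoted above.  A count that is
neither 0 nor 20 would point at an intermittent failure, which would
also make the bisect result unreliable.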


> In mainline, 719a42e563bb ("maple_tree: Convert forking to use the sheaf interface")
> can be reverted, but trying to revert af92793e52c3 results in:
> CONFLICT (content): Merge conflict in mm/slub.c

Forking shouldn't be running so early that the console output is
affected, so I'm not sure how this change would cause what you are
describing.

Thanks,
Liam

