linux-kernel - Re: commit dd4cf8c9e1f4 leads to failed boot

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aAnp9rdPhRY52F7N@pathway.suse.cz>
Date: Thu, 24 Apr 2025 09:36:22 +0200
From: Petr Mladek <pmladek@...e.com>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Bert Karwatzki <spasswolf@....de>,
	"Aithal, Srikanth" <sraithal@....com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Kuniyuki Iwashima <kuniyu@...zon.com>,
	Mateusz Guzik <mjguzik@...il.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	John Ogness <john.ogness@...utronix.de>,
	Sergey Senozhatsky <senozhatsky@...omium.org>,
	linux-kernel@...r.kernel.org,
	Linux-Next Mailing List <linux-next@...r.kernel.org>
Subject: Re: commit dd4cf8c9e1f4 leads to failed boot

On Wed 2025-04-23 12:56:53, Paul E. McKenney wrote:
> On Wed, Apr 23, 2025 at 09:19:56PM +0200, Bert Karwatzki wrote:
> > Am Mittwoch, dem 23.04.2025 um 11:07 -0700 schrieb Paul E. McKenney:
> > > On Wed, Apr 23, 2025 at 08:49:08PM +0530, Aithal, Srikanth wrote:
> > > > On 4/23/2025 7:48 PM, Paul E. McKenney wrote:
> > > > > On Wed, Apr 23, 2025 at 07:09:42PM +0530, Aithal, Srikanth wrote:
> > > > > > On 4/23/2025 5:24 PM, Bert Karwatzki wrote:
> > > > > > > Since linux next-20250422 booting fails on my MSI Alpha 15 Laptop runnning
> > > > > > > debian sid. When booting kernel message appear on screen but no messages from
> > > > > > > init (systemd). There are also no logs written even thought emergency sync
> > > > > > > via magic sysrq works (a message is printed on screen), presumably because
> > > > > > > / is not mounted. I bisected this (from 6.15-rc3 to next-20250422) and found
> > > > > > > commit dd4cf8c9e1f4 as the first bad commit.
> > > > > > > Reverting commit dd4cf8c9e1f4 in next-20250422 fixes the issue.
> > > > > > 
> > > > > > 
> > > > > > Hello,
> > > > > > 
> > > > > > On AMD platform as well boot failed starting next-20250422, bisecting the
> > > > > > issue led me to same commit dd4cf8c9e1f4. I have attached kernel config and
> > > > > > logs.
> > > > > 
> > > > > Thank you all for the bisection and the report!
> > > > > 
> > > > > Please check out the predecessor of commit dd4cf8c9e1f4 ("ratelimit:
> > > > > Force re-initialization when rate-limiting re-enabled"):
> > > > > 
> > > > > 13fa70e052dd ("ratelimit: Allow zero ->burst to disable ratelimiting")
> > > > > 
> > > > > Then please apply the patch shown below, and let me know what happens?
> > > > > (Yes, I should have split that commit up...)
> > > > > 
> > > > > 							Thanx, Paul
> > > > > 
> > > > > ------------------------------------------------------------------------
> > > > > 
> > > > > diff --git a/lib/ratelimit.c b/lib/ratelimit.c
> > > > > index 04f16b8e24575..13ed636642270 100644
> > > > > --- a/lib/ratelimit.c
> > > > > +++ b/lib/ratelimit.c
> > > > > @@ -35,7 +35,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
> > > > >   	unsigned long flags;
> > > > >   	int ret;
> > > > > -	if (!interval || !burst)
> > > > > +	if (interval <= 0 || burst <= 0)
> > > > >   		return 1;
> > > > >   	/*
> > > > 
> > > > 
> > > > I applied above patch on top of 13fa70e052dd ("ratelimit: Allow zero ->burst
> > > > to disable ratelimiting") [linux-20250423]. This is fixing the boot issue.
> > > > 
> > > > Tested-by: Srikanth Aithal <sraithal@....com>
> > > 
> > > Thank you both, and to Bert for intuiting the correct -next commit!
> > > 
> > > Could you please try the next increment, which is this patch, again
> > > on top of 24ff89c63355 ("ratelimit: Allow zero ->burst to > disable
> > > ratelimiting")?
> > > 
> > > In the meantime, I will expose the version you two just tested to
> > > -next.
> > > 
> > > 							Thanx, Paul
> > > 
> > > ------------------------------------------------------------------------
> > > 
> > > diff --git a/lib/ratelimit.c b/lib/ratelimit.c
> > > index 04f16b8e24575..8f6c54f719ef2 100644
> > > --- a/lib/ratelimit.c
> > > +++ b/lib/ratelimit.c
> > > @@ -35,8 +35,10 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
> > >  	unsigned long flags;
> > >  	int ret;
> > >  
> > > -	if (!interval || !burst)
> > > +	if (interval <= 0 || burst <= 0) {
> > > +		ret = burst > 0;
> > >  		return 1;
> > > +	}
> > >  
> > >  	/*
> > >  	 * If we contend on this state's lock then just check if
> > 
> > If you set "ret = burst > 0", but "return 1" this will make no difference
> > (except in the case of a major compiler bug, probably), as I wrote in my other
> > email which overlapped yours, this fixes the issue in next-20250422:
> > 
> > diff --git a/lib/ratelimit.c b/lib/ratelimit.c
> > index b5c727e976d2..fc28f6cf8269 100644
> > --- a/lib/ratelimit.c
> > +++ b/lib/ratelimit.c
> > @@ -40,7 +40,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
> >          * interval says never limit.
> >          */
> >         if (interval <= 0 || burst <= 0) {
> > -               ret = burst > 0;
> > +               ret = 1;
> >                 if (!(READ_ONCE(rs->flags) & RATELIMIT_INITIALIZED) ||
> >                     !raw_spin_trylock_irqsave(&rs->lock, flags))
> >                         return ret;
> 
> You are quite right, your patch does fix the issue that you three say.

Honestly, I do not understand what a ratelimit user could cause this
issue. And I am not able to reproduce it on my test system (x86_64,
kvm). I mean that my system boots and I see the systemd meesages.

> Unfortunately, it prevents someone from completely suppressing output
> by setting burst to zero.  Could you please try the patch below?

I wondered whether some code used a non-initialized struct ratelimit_state.
I tried the following patch:

diff --git a/lib/ratelimit.c b/lib/ratelimit.c
index b5c727e976d2..f949a18e9c2b 100644
--- a/lib/ratelimit.c
+++ b/lib/ratelimit.c
@@ -35,6 +35,10 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
 	unsigned long flags;
 	int ret;
 
+	WARN_ONCE(interval <= 0 || burst <= 0,
+		  "Possibly using a non-initilized ratelimit struct with interval:%d, burst:%d\n",
+		  interval, burst);
+
 	/*
 	 * Non-positive burst says always limit, otherwise, non-positive
 	 * interval says never limit.


And it triggered:

[    2.874504] ------------[ cut here ]------------
[    2.875552] Possibly using a non-initilized ratelimit struct with interval:0, burst:0
[    2.876990] WARNING: CPU: 2 PID: 1 at lib/ratelimit.c:38 ___ratelimit+0x1e8/0x200
[    2.878435] Modules linked in:
[    2.879045] CPU: 2 UID: 0 PID: 1 Comm: swapper/0 Tainted: G        W           6.15.0-rc3-next-20250422-default+ #22 PREEMPT(full)  f5d77f8de4aec34e420e26410c34bcb56f692aae
[    2.881287] Tainted: [W]=WARN
[    2.882010] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-2-gc13ff2cd-prebuilt.qemu.org 04/01/2014
[    2.886452] RIP: 0010:___ratelimit+0x1e8/0x200
[    2.888405] Code: 00 00 e9 b5 fe ff ff 41 bc 01 00 00 00 e9 f2 fe ff ff 89 ea 44 89 e6 48 c7 c7 f8 40 eb 92 c6 05 b5 4d 0f 01 01 e8 28 a0 de fe <0f> 0b e9 71 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 80 00 00
[    2.891223] RSP: 0000:ffffcf1340013bd8 EFLAGS: 00010282
[    2.892033] RAX: 0000000000000000 RBX: ffff8a8cc2bfbaf0 RCX: 0000000000000000
[    2.893091] RDX: 0000000000000002 RSI: 00000000ffff7fff RDI: 00000000ffffffff
[    2.894158] RBP: 0000000000000000 R08: 00000000ffff7fff R09: ffff8a8d3fe3ffa8
[    2.895168] R10: 00000000ffff8000 R11: 0000000000000001 R12: 0000000000000000
[    2.896150] R13: ffffffff92e08d38 R14: ffff8a8cc369e400 R15: ffff8a8cc2e39f00
[    2.897138] FS:  0000000000000000(0000) GS:ffff8a8da6f3c000(0000) knlGS:0000000000000000
[    2.898224] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    2.899181] CR2: 0000000000000000 CR3: 0000000153256001 CR4: 0000000000370ef0
[    2.901865] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    2.903516] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    2.906593] Call Trace:
[    2.907143]  <TASK>
[    2.907582]  __ext4_msg+0x6e/0xa0
[    2.908132]  ? lock_is_held_type+0xd8/0x130
[    2.908755]  ext4_check_feature_compatibility+0x15e/0x2c0
[    2.909427]  __ext4_fill_super+0x543/0x1480
[    2.910049]  ext4_fill_super+0xcc/0x280
[    2.910641]  ? setup_bdev_super+0xfc/0x200
[    2.911265]  ? __pfx_ext4_fill_super+0x10/0x10
[    2.911882]  get_tree_bdev_flags+0x13e/0x1e0
[    2.912485]  vfs_get_tree+0x29/0xe0
[    2.912958]  ? capable+0x3a/0x60
[    2.913407]  do_new_mount+0x176/0x360
[    2.913920]  init_mount+0x5a/0x90
[    2.914389]  do_mount_root+0xa2/0x130
[    2.914923]  mount_root_generic+0xdd/0x270
[    2.916127]  ? mount_root+0x147/0x190
[    2.917989]  prepare_namespace+0x1e0/0x230
[    2.919124]  kernel_init_freeable+0x1ec/0x200
[    2.920907]  ? __pfx_kernel_init+0x10/0x10
[    2.922113]  kernel_init+0x1a/0x130
[    2.922616]  ret_from_fork+0x31/0x50
[    2.923093]  ? __pfx_kernel_init+0x10/0x10
[    2.923612]  ret_from_fork_asm+0x1a/0x30
[    2.924110]  </TASK>
[    2.924433] irq event stamp: 1696665
[    2.924955] hardirqs last  enabled at (1696675): [<ffffffff913fa54e>] __up_console_sem+0x5e/0x70
[    2.926072] hardirqs last disabled at (1696686): [<ffffffff913fa533>] __up_console_sem+0x43/0x70
[    2.927149] softirqs last  enabled at (1696612): [<ffffffff9135134e>] handle_softirqs+0x32e/0x400
[    2.928221] softirqs last disabled at (1696591): [<ffffffff91351509>] __irq_exit_rcu+0xd9/0x150
[    2.929167] ---[ end trace 0000000000000000 ]---
[    3.003162] EXT4-fs (vda2): mounted filesystem 587ae802-e330-4059-9b48-d5b845e1075a ro with ordered data mode. Quota mode: none.

I guess that it happens because the structure is initialized too late,
see:

static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
{

	[ ... skipping a lot of initialization code ... ]

	/* Enable message ratelimiting. Default is 10 messages per 5 secs. */
	ratelimit_state_init(&sbi->s_err_ratelimit_state, 5 * HZ, 10);
	ratelimit_state_init(&sbi->s_warning_ratelimit_state, 5 * HZ, 10);
	ratelimit_state_init(&sbi->s_msg_ratelimit_state, 5 * HZ, 10);

	[...]
}

I guess that it is on purpose. They most likely do not want to
ratelimit the _very initial messages_ printed when the initialization
fails.

Maybe, it is not a good idea to allow to disable the ratelimit by
zero burst.

Best Regards,
Petr