Message-ID: <aRWKq2KNKjxbXexA@pathway.suse.cz>
Date: Thu, 13 Nov 2025 08:37:15 +0100
From: Petr Mladek <pmladek@...e.com>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: linux-kernel@...r.kernel.org, linux-next@...r.kernel.org,
	d-tatianin@...dex-team.ru, john.ogness@...utronix.de,
	sfr@...b.auug.org.au, rostedt@...dmis.org, senozhatsky@...omium.org
Subject: Re: [BUG -next] WARNING: kernel/printk/printk_ringbuffer.c:1278 at
 get_data+0xb3/0x100

Hi Paul,

first, thanks a lot for reporting the regression.

On Wed 2025-11-12 16:52:16, Paul E. McKenney wrote:
> Hello!
> 
> Some rcutorture runs on next-20251110 hit the following error on x86:
> 
> WARNING: kernel/printk/printk_ringbuffer.c:1278 at get_data+0xb3/0x100, CPU#0: rcu_torture_sta/63
> 
> This happens in about 20-25% of the rcutorture runs, and is the
> WARN_ON_ONCE(1) in the "else" clause of get_data().  There was no
> rcutorture scenario that failed to reproduce this bug, so I am guessing
> that the various .config files will not provide useful information.
> Please see the end of this email for a representative splat, which is
> usually rcutorture printing out something or another.  (Which, in its
> defense, has worked just fine in the past.)
> 
> Bisection converged on this commit:
> 
> 67e1b0052f6b ("printk_ringbuffer: don't needlessly wrap data blocks around")
> 
> Reverting this commit suppressed (or at least hugely reduced the
> probability of) the WARN_ON_ONCE().
> 
> The SRCU-T, SRCU-U, and TREE09 scenarios hit this most frequently at
> about double the base rate, but are CONFIG_SMP=n builds.  The RUDE01
> scenario was the most productive CONFIG_SMP=y scenario.  Reproduce as
> follows, where "N" is the number of CPUs on your system divided by three,
> rounded down:
> 
> tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 5 --configs "N*RUDE01"
> 
> Or if you can do CONFIG_SMP=n, the following works, where "N" is the
> number of CPUs on your system:
> 
> tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 5 --configs "N*SRCU-T"
> 
> Or please tell me what debug I should enable on my runs.

The problem was reported by two test robots last week. It happens when
a message ends exactly on the last byte before the ring buffer wraps
for the first time. It is interesting that you have seen it so
frequently (in about 20-25% of rcutorture runs).
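
To illustrate the kind of corner case involved, here is a minimal
userspace sketch. It is not the kernel code: the ring size, the
function names, and the assumption that logical positions are unsigned
counters which reach zero exactly at the first wrap are all made up
for illustration. It only shows why a plain "begin < next" comparison
can reject a valid block that ends on the last byte before the
counter overflows, while a modular subtraction still gives the
right size.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical 4 KiB data ring; these names are illustrative only. */
#define RING_BITS 12UL
#define RING_SIZE (1UL << RING_BITS)

/* Offset of a logical position inside the data array. */
static unsigned long ring_offset(unsigned long lpos)
{
	return lpos & (RING_SIZE - 1);
}

/* Naive check: silently assumes that next never overflows past begin. */
static bool naive_block_ok(unsigned long begin, unsigned long next)
{
	return begin < next && (next - begin) <= RING_SIZE;
}

/* Overflow-tolerant check: unsigned subtraction is modular in C. */
static bool modular_block_ok(unsigned long begin, unsigned long next)
{
	unsigned long size = next - begin;

	return size != 0 && size <= RING_SIZE;
}

/* Self-check for a 64-byte block that ends on the very last byte. */
static void ring_selfcheck(void)
{
	unsigned long begin = 0UL - 64;	/* 64 bytes before the overflow */
	unsigned long next = begin + 64;	/* overflows to exactly 0 */

	assert(ring_offset(begin) == RING_SIZE - 64);
	assert(next == 0);
	assert(!naive_block_ok(begin, next));	/* valid block rejected */
	assert(modular_block_ok(begin, next));	/* valid block accepted */
}
```

Calling ring_selfcheck() passes silently: the block is valid, yet only
the modular variant accepts it.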

Anyway, I pushed a fix on Monday. It is commit
cc3bad11de6e0d601 ("printk_ringbuffer: Fix check of
valid data size when blk_lpos overflows"), see
https://git.kernel.org/pub/scm/linux/kernel/git/printk/linux.git/commit/?h=for-6.19&id=cc3bad11de6e0d6012460487903e7167d3e73957

Thanks a lot for such an exhaustive report. And I am sorry that you
probably spent a lot of time on it.

Best Regards,
Petr