lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3a0ceabe-d5a0-4cb3-8343-ec56bc9bd0fd@paulmck-laptop>
Date: Wed, 5 Feb 2025 12:10:53 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: John Ogness <john.ogness@...utronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@...utronix.de>, rcu@...r.kernel.org,
	linux-kernel@...r.kernel.org, kernel-team@...a.com,
	rostedt@...dmis.org, Frederic Weisbecker <frederic@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Alexei Starovoitov <ast@...nel.org>,
	Andrii Nakryiko <andrii@...nel.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Masami Hiramatsu <mhiramat@...nel.org>,
	linux-trace-kernel@...r.kernel.org, Petr Mladek <pmladek@...e.com>
Subject: Re: [PATCH rcu v2] 4/5] rcu-tasks: Move RCU Tasks self-tests to
 core_initcall()

On Wed, Feb 05, 2025 at 09:00:07PM +0106, John Ogness wrote:
> On 2025-02-05, "Paul E. McKenney" <paulmck@...nel.org> wrote:
> >> This is caused by RCU falling behind a callback-flooding kthread that
> >> invokes call_rcu() in a semi-tight loop.  Setting rcutree.kthread_prio=40
> >> avoids the splat, but still gets the shutdown-time hang.  Retrying with
> >> the default rcutree.kthread_prio=2 failed to reproduce the splat, but
> >> it did reproduce the shutdown-time hang.
> >> 
> >> OK, maybe printk buffers are not being flushed?  A 100-millisecond sleep
> >> at the end of of rcu_torture_cleanup() got all of rcutorture's output
> >> flushed, but lost the subsequent shutdown-time console traffic.  The
> >> pr_flush(HZ/10,1) seems more sensible, but this is private to printk().
> >>
> >> I would like to log the shutdown-time console traffic because RCU can
> >> sometimes break things on that path.
> 
> pr_flush() was changed to private because there were no users. It would
> not be a problem to make it available. Adding a pr_flush() to
> rcu_torture_cleanup() would be an appropriate workaround for now (more
> on this at the end).

Ah, got it.

> > There is a call to kmsg_dump(KMSG_DUMP_SHUTDOWN) in kernel_power_off()
> > that appears to be intended to dump out the printk() buffers,
> 
> It only dumps the buffers to the registered kmsg_dumpers. It is not
> responsible for flushing console backlogs.

That would explain its not doing much for me.  ;-)

> > but it
> > does not seem to do so in kernels built with CONFIG_PREEMPT_RT=y.
> > Does there need to be a pr_flush() call prior to the call to
> > migrate_to_reboot_cpu()?  Or maybe even to do_kernel_power_off_prepare()
> > or kernel_shutdown_prepare()?
> 
> With CONFIG_PREEMPT_RT=y, legacy consoles only print via a dedicated
> kthread. Without a pr_flush() somewhere, there is basically no chance
> that they will get backlogs flushed because noone is waitig for them.
> 
> The new console API (NBCON) provides support for "atomic consoles",
> which _do_ flush by transitioning to synchronous printing during
> shutdown/reboot. Unfortunately we still don't have any NBCON atomic
> console implemented in the kernel. The 8250 UART will be our first
> driver, most likely available in 6.15. (With the current PREEMPT_RT
> patch applied, the 8250 NBCON atomic driver is used.)
> 
> Since only CONFIG_PREEMPT_RT=y has this issue, I am not sure if we want
> to sprinkle pr_flush() calls on all sleepable shutdown/reboot paths,
> although that is certainly one way to handle it. For your case, adding a
> pr_flush() to rcu_torture_cleanup() and making pr_flush() non-private
> would be an easy solution to avoid your problem.

Would it make sense to put a call to pr_flush() at the beginning of the
kernel_power_off() function?  I suspect that would take care of much
more than just rcutorture.

Or maybe after the pr_emerg("Power down\n"), since that is normally
the last thing that shows up on rcutorture's console logs.

I will give that a shot and see what happens.  ;-)

							Thanx, Paul

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ