[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220612163044.GS1790663@paulmck-ThinkPad-P17-Gen-1>
Date: Sun, 12 Jun 2022 09:30:44 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: John Ogness <john.ogness@...utronix.de>
Cc: linux-kernel@...r.kernel.org, frederic@...nel.org, pmladek@...e.com
Subject: Re: [BUG] 8e274732115f ("printk: extend console_lock for per-console
locking")
On Sun, Jun 12, 2022 at 06:09:10PM +0206, John Ogness wrote:
> Hi Paul,
>
> Thanks for looking into this! I am currently on vacation with family, so
> my responses are limited. Some initial comments from me below...
First, this is not an emergency. I have a good workaround that just got
done passing significant rcutorture testing. This means that I can port
my RCU changes to v5.19-rc1/2 and get on with other testing.
So please ignore this for the rest of your time away, and have a great
time with your family!!!
> On 2022-06-12, "Paul E. McKenney" <paulmck@...nel.org> wrote:
> > And the patch below takes care of things in (admittedly quite light)
> > testing thus far. What it does is add ten seconds of pure delay
> > before rcutorture shuts down the system. Presumably, this delay gives
> > printk() the time that it needs to flush its buffers. In the
> > configurations that I have tested thus far, anyway.
> >
> > So what should I be doing instead?
> >
> > o console_flush_on_panic() seems like strong medicine, but might
> > be the right thing to do. The bit about proceeding even though
> > it failed to acquire the lock doesn't look good for non-panic
> > use.
>
> For sure not this one.
>
> > o printk_trigger_flush() has an attractive name, but it looks
> > like it only just starts the flush rather than waiting for it
> > to finish.
>
> Correct. It just triggers.
>
> > o pr_flush(1000, true) looks quite interesting, and also seems to
> > work in a few quick tests, so I will continue playing with that.
>
> This is only useful if the context is guaranteed may_sleep().
Which is the case when called from torture_shutdown().
But it does seem to be common to invoke kernel_power_off() from things
like interrupt handlers. Which means that putting the pr_flush() in
kernel_power_off() would be a bad idea given that we cannot detect
non-preemptible regions of code with CONFIG_PREEMPT_NONE=y kernels.
(That again!)
So any fix within kernel_power_off() would be a bit "interesting".
> What is _supposed_ to happen is that @system_state increases above
> SYSTEM_RUNNING, which then causes direct printing to be used. So the
> pr_emerg("Power down\n") in kernel_power_off() would directly flush all
> remaining messages.
>
> But if the threaded printers are already in the process of printing,
> they block direct printing. That may be what we are seeing here.
Given that rcutorture can be a bit chatty at shutdown time, my guess
is that the threaded printers are already in the process of printing.
> What I find particularly interesting is that it is not the kthread-patch
> that is causing the issue.
I do know that feeling!
> I will have some time tonight to take a closer look.
Please wait until you are back from your vacation. Given that I now
have a workaround, which might be as good a fix as there is, there is
no need to interrupt your vacation.
Thanx, Paul
Powered by blists - more mailing lists