Message-ID: <ZSz7RFqv994u/Vt+@darkstar.users.ipa.redhat.com>
Date: Mon, 16 Oct 2023 16:58:44 +0800
From: Dave Young <dyoung@...hat.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Petr Mladek <pmladek@...e.com>,
Linus Torvalds <torvalds@...uxfoundation.org>,
Sergey Senozhatsky <senozhatsky@...omium.org>,
Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel@...r.kernel.org,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
kexec@...ts.infradead.org, bhe@...hat.com, prudo@...hat.com,
ebiederm@...ssion.com, vgoyal@...hat.com
Subject: Re: panic context: was: Re: [PATCH printk v2 04/11] printk: nbcon:
Provide functions to mark atomic write sections
[Added more people in cc]
On 10/08/23 at 12:19pm, John Ogness wrote:
> Hi Petr,
>
> On 2023-10-06, Petr Mladek <pmladek@...e.com> wrote:
> >> During the demo at LPC2022 we had the situation that there was a large
> >> backlog when a WARN was hit. With current mainline the first line of the
> >> WARN is put into the ringbuffer and then the entire backlog is flushed
> >> before storing the rest of the WARN into the ringbuffer. At the time it
> >> was obvious that we should finish storing the WARN message and then
> >> start flushing the backlog.
> >
> > This talks about the "emergency" context (WARN/OOPS/watchdog).
> > The system might be in big trouble but it would still try to continue.
> >
> > Do we really want to defer the flush also for panic() context?
>
> We can start flushing right after the backtrace is in the
> ringbuffer. But flushing the backlog _before_ putting the backtrace into
> the ringbuffer was not desired because if there is a large backlog, the
> machine may not survive to get the backtrace out. And in that case it
> won't even be in the ringbuffer to be used by other debugging
> tools.
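
[Editor's note: to make the ordering concrete, here is a minimal userspace
sketch of the idea described above. It is not kernel code; the buffer,
function and message names are purely illustrative. The point is only the
ordering: the complete emergency record is stored into the ring buffer
first, and only then is the (possibly slow) backlog flush started.]

	/* Illustrative sketch, not the kernel implementation. */
	#include <stdio.h>

	#define RB_SLOTS 64
	#define MSG_LEN  128

	static char ringbuf[RB_SLOTS][MSG_LEN];
	static int rb_head, rb_tail;

	static void rb_store(const char *msg)
	{
		/* Storing is cheap and cannot hang on console hardware. */
		snprintf(ringbuf[rb_head % RB_SLOTS], MSG_LEN, "%s", msg);
		rb_head++;
	}

	static void console_flush_backlog(void)
	{
		/*
		 * Flushing may be slow (e.g. a serial console) and the
		 * machine might die here. Everything already stored in
		 * the ring buffer can still be recovered from a crash
		 * dump; anything not yet stored is lost.
		 */
		while (rb_tail < rb_head)
			fprintf(stderr, "%s\n", ringbuf[rb_tail++ % RB_SLOTS]);
	}

	static void emit_warn(const char *line1, const char *backtrace)
	{
		/* 1. Store the complete WARN, including the backtrace ... */
		rb_store(line1);
		rb_store(backtrace);
		/* 2. ... and only then start flushing the backlog. */
		console_flush_backlog();
	}

	int main(void)
	{
		rb_store("old backlog message");
		emit_warn("WARNING: CPU: 0 PID: 1 at foo.c:42",
			  "Call Trace: foo+0x10/0x20");
		return 0;
	}
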
>
> > I ask because I was not on LPC 2022 in person and I do not remember
> > all details.
>
> The LPC2022 demo/talk was recorded:
>
> https://www.youtube.com/watch?v=TVhNcKQvzxI
>
> At 55:55 is where the situation occurred and triggered the conversation,
> ultimately leading to this new feature.
>
> You may also want to reread my summary:
>
> https://lore.kernel.org/lkml/875yheqh6v.fsf@jogness.linutronix.de
>
> as well as Linus' follow-up message:
>
> https://lore.kernel.org/lkml/CAHk-=wieXPMGEm7E=Sz2utzZdW1d=9hJBwGYAaAipxnMXr0Hvg@mail.gmail.com
>
> > But it is tricky in panic(), see 8th patch at
> > https://lore.kernel.org/r/20230919230856.661435-9-john.ogness@linutronix.de
> >
> > + nbcon_atomic_exit() is called only in one code path.
>
> Correct. When panic() is complete and the machine goes into its infinite
> loop. This is also the point where it will attempt an unsafe flush, if
> it could not get the messages out yet.
>
> > + nbcon_atomic_flush_all() is used in other paths. It looks like
> > a "Whack a mole" game to me.
>
> Several different outputs occur during panic(). The flush is everywhere
> where something significant has been put into the ringbuffer and now it
> would be OK to flush it.
>
> > + messages are never emitted by printk kthread either because
> > CPUs are stopped or the kthread is not allowed to get the lock
>
> Correct.
>
> > I see only one positive of the explicit flush. The consoles would
> > not delay crash_exec() and the crash dump might be closer to
> > the point where panic() was called.
>
> It's only about getting the critical messages into the ringbuffer before
> flushing. And since various things can go wrong during the many actions
> within panic(), it makes sense to flush in between those actions.
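
[Editor's note: the explicit-flush pattern being discussed can be sketched
as below. This is a simplified userspace illustration, not the actual
panic() code path; the helpers stand in for nbcon_atomic_flush_all() and
nbcon_atomic_exit() from the patch series, and the step names are made up.]

	/* Illustrative sketch of explicit flush points during panic(). */
	#include <stdio.h>

	static void store_record(const char *msg)
	{
		printf("stored: %s\n", msg);	/* goes into the ring buffer */
	}

	static void atomic_flush_all(void)
	{
		/* Best-effort flush, only while the console is safe. */
		printf("flush:  safe flush of pending records\n");
	}

	static void atomic_exit_unsafe_flush(void)
	{
		/* Last resort at the end of panic(): flush even if unsafe. */
		printf("flush:  final (possibly unsafe) flush\n");
	}

	int main(void)
	{
		/* Each significant step stores its messages, then flushes. */
		store_record("Kernel panic - not syncing: ...");
		store_record("<backtrace>");
		atomic_flush_all();	/* after the panic banner/backtrace */

		store_record("kmsg dump / crash notes");
		atomic_flush_all();	/* before e.g. crash_kexec() */

		/* panic() is done; the machine enters its infinite loop. */
		atomic_exit_unsafe_flush();
		return 0;
	}
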
>
> > Otherwise I see only negatives => IMHO, we want to flush atomic
> > consoles synchronously from printk() in panic().
> >
> > Does anyone really want explicit flushes in panic()?
>
> So far you are the only one speaking against it. I expect as time goes
> on it will get even more complex as it becomes tunable (also something
> we talked about during the demo).
Flushing consoles in the panic kexec case does not sound good to me, but I
have no deep understanding of the atomic printk series, so I have added the
kexec list and reviewers in cc.
>
> John
>
Thanks
Dave