Message-ID: <20170409101230.GB27363@amd>
Date: Sun, 9 Apr 2017 12:12:30 +0200
From: Pavel Machek <pavel@....cz>
To: Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
Jan Kara <jack@...e.cz>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Ye Xiaolong <xiaolong.ye@...el.com>,
Steven Rostedt <rostedt@...dmis.org>,
Petr Mladek <pmladek@...e.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Peter Zijlstra <peterz@...radead.org>,
"Rafael J . Wysocki" <rjw@...ysocki.net>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jiri Slaby <jslaby@...e.com>, Len Brown <len.brown@...el.com>,
linux-kernel@...r.kernel.org, lkp@...org
Subject: Re: [printk] fbc14616f4: BUG:kernel_reboot-without-warning_in_test_stage

On Sat 2017-04-08 00:13:06, Sergey Senozhatsky wrote:
> On (04/07/17 14:44), Pavel Machek wrote:
> [..]
> > > [..]
> > > > I believe "spend at most 2 seconds in printk(), then print a warning
> > > > and offload" is a solution closer to what we had before.
> > >
> > > a warning here can be very noisy.
> >
> > Well, on a normally-configured system it should be ok. We don't commonly
> > see printk problems... If it is too noisy, perhaps we should increase from
> > 2 seconds, but I don't think it will be a problem.
>
> we are looking at different typical setups :) serial console being 45
> seconds behind logbuf does not surprise me anymore.
>
> [..]
> > > what we have been thinking about is something like printk-stall detection.
> > > we probably (there are some if-s) can detect in printk() that offloading
> > > does not work and we must automatically switch to printk_emergency mode.
> > > that, in theory, can relax our dependency on printk_emergency_begin/end
> > > being in the right place at the right time. need to think more about it.
> >
> > So... I don't really like the begin/end interface. I would rather have
> > printk_emergency(KERN_ ...).
>
> you mean a single printk_emergency() switches printk to emergency mode
> or printk_emergency(KERN_ ... ) is a single message that must be printed
> in emergency mode?

The latter. Having state is ugly.

> printk() depends on console_trylock(). we can't expect printk_emergency(KERN_ ...)
> to always do more than just log_store().
>
> the idea behind begin/end interface is that you can do
>
> emergency_begin
> printk
> pr_cont
> pr_cont
> pr_cont
> printk
> dump_stack
> emergency_end
>
> without the need of rewriting dump_stack() or anything else to use
> printk_emergency(). we, for example, do this in sysrq patch from this
> series.

Well... I guess it is less work to include emergency_begin/end(), but I
also believe the result with a state-less solution will be cleaner.
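
To make the trade-off concrete, here is a rough user-space sketch of the
two interfaces. Every model_* name below is made up for illustration;
none of them is a kernel API, and "flushed" merely stands for "pushed to
the console from the caller's context". It shows that begin/end covers
unmodified helpers such as a dump_stack() stand-in, while a stateless
call covers only its own message.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space model; these are not real kernel interfaces. */
enum { LOG_CAP = 16, MSG_LEN = 64 };

struct model_log {
	char msg[LOG_CAP][MSG_LEN];
	bool flushed[LOG_CAP];	/* true: printed from the caller's context */
	int count;
	int emergency_depth;	/* state used only by the begin/end variant */
};

/* Store one formatted message and record whether it was printed directly. */
static void model_store(struct model_log *log, bool direct,
			const char *fmt, va_list ap)
{
	assert(log->count < LOG_CAP);
	vsnprintf(log->msg[log->count], MSG_LEN, fmt, ap);
	log->flushed[log->count] = direct;
	log->count++;
}

/* Ordinary print: printed directly only while emergency state is set. */
static void model_printk(struct model_log *log, const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	model_store(log, log->emergency_depth > 0, fmt, ap);
	va_end(ap);
}

/* Stateful variant: everything between begin and end is printed directly. */
static void model_emergency_begin(struct model_log *log)
{
	log->emergency_depth++;
}

static void model_emergency_end(struct model_log *log)
{
	assert(log->emergency_depth > 0);
	log->emergency_depth--;
}

/* Stateless variant: only this one message is printed directly. */
static void model_printk_emergency(struct model_log *log, const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	model_store(log, true, fmt, ap);
	va_end(ap);
}

/* Stand-in for dump_stack(): an existing helper that only uses model_printk(). */
static void model_dump_stack(struct model_log *log)
{
	model_printk(log, "frame 0");
	model_printk(log, "frame 1");
}
```

With begin/end around model_dump_stack(), both frames are marked as
printed directly. With model_printk_emergency() followed by
model_dump_stack(), only the first message is, so the helper would have
to be converted too. That is the extra work; the upside is that there is
no state to leave set by mistake.
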
> > Second... I don't think "stuck detector" is that helpful. What I've
> > usually seen was some rather innocent kernel message followed by
> > hard-lock. That's where "message delayed" is useful..
>
> a side note,
> it's rather unclear to me how "message delayed" would really help.
> if your system hard-locks up so badly that there are no printk messages
> even from the NMI watchdog, then we won't be able to print that message.

We are talking about

	printk("unusual condition");
	do_something_clever(); /* Which unfortunately hard-crashes the machine */

that works with my proposal, but not with yours. Seen it happen many
times before.
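
A toy user-space model of that sequence, in case it helps. All model_*
names are hypothetical and nothing here is kernel code: "pending" stands
for messages sitting in the log buffer waiting for a printing thread,
"printed" for messages that already reached the console, and the crash
simply drops whatever is still pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a console fed from a log buffer. */
enum { LOG_CAP = 8 };

struct model_console {
	const char *pending[LOG_CAP];	/* stored, not yet on the console */
	size_t npending;
	const char *printed[LOG_CAP];	/* already on the console */
	size_t nprinted;
};

/* Offloaded print: only store the message; a helper would flush it later. */
static void model_printk_offload(struct model_console *c, const char *msg)
{
	assert(c->npending < LOG_CAP);
	c->pending[c->npending++] = msg;
}

/* What the printing helper would do once it gets to run. */
static void model_flush(struct model_console *c)
{
	for (size_t i = 0; i < c->npending; i++) {
		assert(c->nprinted < LOG_CAP);
		c->printed[c->nprinted++] = c->pending[i];
	}
	c->npending = 0;
}

/* Direct print: the message is on the console before the call returns. */
static void model_printk_direct(struct model_console *c, const char *msg)
{
	model_printk_offload(c, msg);
	model_flush(c);
}

/* Hard crash: the helper never runs, so pending messages are lost. */
static void model_hard_crash(struct model_console *c)
{
	c->npending = 0;
}

/* Did the message make it to the console? */
static bool model_was_printed(const struct model_console *c, const char *msg)
{
	for (size_t i = 0; i < c->nprinted; i++)
		if (strcmp(c->printed[i], msg) == 0)
			return true;
	return false;
}
```

If the message goes through model_printk_offload() and the crash comes
before model_flush(), it never shows up. With model_printk_direct() it
is already out when the machine dies. The model ignores the 2 second
limit; it only shows the ordering that matters for this case.
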
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html