[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20171228064858.GA892@jagdpanzerIV>
Date: Thu, 28 Dec 2017 15:48:58 +0900
From: Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Tejun Heo <tj@...nel.org>, Petr Mladek <pmladek@...e.com>,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
Jan Kara <jack@...e.cz>,
Andrew Morton <akpm@...ux-foundation.org>,
Peter Zijlstra <peterz@...radead.org>,
Rafael Wysocki <rjw@...ysocki.net>,
Pavel Machek <pavel@....cz>,
Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
linux-kernel@...r.kernel.org,
Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Subject: Re: [RFC][PATCHv6 00/12] printk: introduce printing kernel thread
Hello,
On (12/21/17 23:19), Steven Rostedt wrote:
[..]
> > I wrote this before but this isn't a theoretical problem. We see
> > these stalls a lot. Preemption isn't enabled to begin with. Memory
> > pressure is high and OOM triggers and printk starts printing out OOM
> > warnings; then, a network packet comes in which triggers allocations
> > in the network layer, which fails due to memory pressure, which then
> > generates memory allocation failure messages which then generates
> > netconsole packets which then tries to allocate more memory and so on.
>
> It doesn't matter if preemption is enabled or not. The hand off should
> happen either way.
preemption does matter. it matters so much, that it makes the first
thing your patch depends on to be questionable - if CPUA stuck
in console_unlock() printing other CPU's messages, it's
because there are currently printk()-s happening on CPUB-CPUZ.
which is not always true. with preemption enabled CPUA can stuck in
console_unlock() because printk()-s from CPUB-CPUZ could have happened
seconds or even minutes ago. I have demonstrated it with the traces; it
didn't convenience you. OK, I sat down and went trough Tetsuo's reports
starting from 2016:
https://marc.info/?l=linux-kernel&m=151375384500555
and Tetsuo has reported several times that printing CPU can sleep for
minutes with console_sem being locked; while other CPUs are happy to
printk()->log_store() as much as they want. and that's why I believe
that that "The hand off should happen either way" is dangerous, especially
when we hand off from a preemptible context to an atomic context. you
don't think that this is a problem, because your patch requires logbuf to
be small enough for any atomic context (irq, under spin_lock, etc) to be
able to print out all the pending messages to all registered consoles [we
print to consoles sequentially] within watchdog_threshold seconds (10
seconds by default). who does adjust the size of the logbuf in a way that
console_unlock() can print the entire logbuf under 10 seconds?
> > It's just that there's no one else to give that flushing duty too, so
> > the pingpoinging that your patch implements can't really help
> > anything.
> >
> > You argue that it isn't made worse by your patch, which may be true
> > but your patch doesn't solve actual problems and is most likely
> > unnecessary complication which gets in the way for the actual
> > solution. It's a weird argument to push an approach which is
> > fundamentally broken. Let's please stop that.
>
> BULLSHIT!
>
> It's not a complex solution, and coming from the cgroup and workqueue
> maintainer, that's a pretty ironic comment.
>
> It has already been proven that it can solve problems:
>
> http://lkml.kernel.org/r/20171219143147.GB15210@dhcp22.suse.cz
and Tetsuo also said that his test does not cover all the scenarios;
and he has sent us all a pretty clear warning:
: Therefore, while it is true that any approach would survive my
: environment, it is dangerous to assume that any approach is safe
: for my customer's enterprise servers.
https://marc.info/?l=linux-kernel&m=151377161705573
when it comes to lockups, printk() design is so flexible, that one can
justify nearly every patch no matter how many regressions it introduces,
by simply saying:
"but the same lockup scenario could have happened even before my
patch. it's just printk() was designed this way. you were lucky
enough not to hit the problem before; now you are less lucky".
been there, done that. it's a trap.
> You don't think handing off printks to an offloaded thread isn't more
> complex nor can it cause even more issues (like more likely to lose
> relevant information on kernel crashes)?
printk_kthread *used* to be way too independent. basically what we had
before was
bad_function()
printk()
vprintk_emit()
{
if (oops_in_progress)
can_offload_to_printk = false;
if (can_offload_to_printk)
wake_up(printk_kthread);
else if (console_trylock())
console_unlock();
}
the bad things:
- first, we do not take into account the fact that printk_kthread can
already be active.
- second, we do not take into account the fact that printk_kthread can
be way-way behind - a huge gap between `console_seq' and `log_next_seq'.
- third, we do not take into account the fact that printk_kthread can
be preempted under console_sem.
so, IOW, the CPU which was in trouble declared the "oh, we should not
offload to printk kthread" emergency only for itself, basically. the CPU
which printk_kthread was running on or sleeping on [preemption] did not
care, neither did printk_kthread.
what we have now:
- printk_kthread cannot be preempted under console_sem. if it acquires
the console_sem it stays RUNNING, printing the messages.
- printk_kthread does check for emergency on other CPUs. right before
it starts printing the next pending logbuf line (every line).
IOW, instead of that:
console_unlock()
{
again:
for (;;) {
call_console_drivers();
}
up();
if (more_pending_messages && down_trylock())
goto again;
}
it now does this:
console_unlock()
{
preempt_disable();
again:
for (;;) {
if (something_is_not_right)
break;
if (printing_for_too_long)
break;
call_console_drivers();
}
up();
preempt_enable();
if (!something_is_not_right && !printing_for_too_long &&
more_pending_messages && down_trylock())
goto again;
}
so if CPUA declares emergency (oops_in_progress, etc) then printk_kthread,
if it's running, will get out of CPUA's way as soon as possible. so I
don't think we are still *more likely* to lose relevant information on
kernel crashes because of printk_kthread.
and I'm actually thinking about returning back the old vprintk_emit()
behavior
vprintk_emit()
{
+ preempt_disable();
if (console_trylock())
console_unlock();
+ preempt_enable();
}
and letting console_unlock() to offload when needed. preemption in
vprintk_emit() is no good.
-ss
Powered by blists - more mailing lists