[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20171222000932.GG1084507@devbig577.frc2.facebook.com>
Date: Thu, 21 Dec 2017 16:09:32 -0800
From: Tejun Heo <tj@...nel.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Petr Mladek <pmladek@...e.com>,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
Jan Kara <jack@...e.cz>,
Andrew Morton <akpm@...ux-foundation.org>,
Peter Zijlstra <peterz@...radead.org>,
Rafael Wysocki <rjw@...ysocki.net>,
Pavel Machek <pavel@....cz>,
Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
linux-kernel@...r.kernel.org,
Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Subject: Re: [RFC][PATCHv6 00/12] printk: introduce printing kernel thread
Hello,
Sorry about the long delay.
On Thu, Dec 14, 2017 at 01:21:09PM -0500, Steven Rostedt wrote:
> > Yeah, will do, but out of curiosity, Sergey and I already described
> > what the root problem was and you didn't really seem to take that. Is
> > that because the explanation didn't make sense to you or us
> > misunderstanding what your code does?
>
> Can you post the message id of the discussion you are referencing.
> Because I've been swamped with other activities and only been skimming
> these threads.
This was already on another reply but just in case.
http://lkml.kernel.org/r/20171108162813.GA983427@devbig577.frc2.facebook.com
I tried your v4 patch and ran the test module and could easily
reproduce RCU stall and other issues stemming from a CPU getting
pegged down by printk flushing. I'm attaching the test module code at
the end.
I wrote this before but this isn't a theoretical problem. We see
these stalls a lot. Preemption isn't enabled to begin with. Memory
pressure is high and OOM triggers and printk starts printing out OOM
warnings; then, a network packet comes in which triggers allocations
in the network layer, which fails due to memory pressure, which then
generates memory allocation failure messages which then generates
netconsole packets which then tries to allocate more memory and so on.
It's just that there's no one else to give that flushing duty too, so
the pingpoinging that your patch implements can't really help
anything.
You argue that it isn't made worse by your patch, which may be true
but your patch doesn't solve actual problems and is most likely
unnecessary complication which gets in the way for the actual
solution. It's a weird argument to push an approach which is
fundamentally broken. Let's please stop that.
Thanks.
#include <linux/module.h>
#include <linux/delay.h>
#include <linux/sched.h>
#include <linux/mutex.h>
#include <linux/workqueue.h>
#include <linux/hrtimer.h>
static bool in_printk;
static bool stop_testing;
static struct hrtimer printk_timer;
static ktime_t timer_interval;
static enum hrtimer_restart printk_timerfn(struct hrtimer *timer)
{
int i;
if (READ_ONCE(in_printk))
for (i = 0; i < 10000; i++)
printk("%-80s\n", "XXX TIMER");
hrtimer_forward_now(&printk_timer, timer_interval);
return READ_ONCE(stop_testing) ? HRTIMER_NORESTART : HRTIMER_RESTART;
}
static void preempt_printk_workfn(struct work_struct *work)
{
int i;
hrtimer_init(&printk_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
printk_timer.function = printk_timerfn;
timer_interval = ktime_set(0, NSEC_PER_MSEC);
hrtimer_start(&printk_timer, timer_interval, HRTIMER_MODE_REL);
while (!READ_ONCE(stop_testing)) {
preempt_disable();
WRITE_ONCE(in_printk, true);
for (i = 0; i < 100; i++)
printk("%-80s\n", "XXX PREEMPT");
WRITE_ONCE(in_printk, false);
preempt_enable();
msleep(1);
}
}
static DECLARE_WORK(preempt_printk_work, preempt_printk_workfn);
static int __init test_init(void)
{
queue_work_on(0, system_wq, &preempt_printk_work);
return 0;
}
static void __exit test_exit(void)
{
WRITE_ONCE(stop_testing, true);
flush_work(&preempt_printk_work);
hrtimer_cancel(&printk_timer);
}
module_init(test_init);
module_exit(test_exit);
MODULE_LICENSE("GPL");
Powered by blists - more mailing lists