[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160824142729.GA579@swordfish>
Date: Wed, 24 Aug 2016 23:27:29 +0900
From: Sergey Senozhatsky <sergey.senozhatsky@...il.com>
To: Petr Mladek <pmladek@...e.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
Jan Kara <jack@...e.cz>, Kay Sievers <kay@...y.org>,
Tejun Heo <tj@...nel.org>, Calvin Owens <calvinowens@...com>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Subject: Re: [PATCH][RFC] printk: make pr_cont buffer per-cpu
On (08/24/16 10:19), Petr Mladek wrote:
> > On (08/23/16 13:47), Petr Mladek wrote:
> > [..]
> > > > if (!(lflags & LOG_NEWLINE)) {
> > > > + if (!this_cpu_read(cont_printing)) {
> > > > + if (system_state == SYSTEM_RUNNING) {
> > > > + this_cpu_write(cont_printing, true);
> > > > + preempt_disable();
> > > > + }
> > > > + }
> > >
> > > I am afraid that this is not acceptable. It means that printk() will have
> > > an unexpected side effect. The missing "\n" at the end of a printed
> > > string would disable preemption. See below for more.
> >
> > missing '\n' must WARN about "sched while atomic" eventually, so it
> > shouldn't go unnoticed or stay hidden.
>
> Well, it will still force people to rebuilt a test kernel because they
> forget to use '\n" and the test kernel is unusable.
you are right. misusage of printk() will now force user to go and fix
it. the kernel most likely will be rebuilt anyway - there is a missing
\n after all. and rebuild here is pretty much incremental because basically
only one line is getting updated (missing `\n' in printk() message). so
it's like 15 seconds, perhaps.
> IMHO, the connection between '\n' and preemption is not
> intuitive and hard to spot. We should do our best to avoid it.
yes. what can be done is a simple hint -- we are in preempt disabled
path when we report `sched while atomic', so it's safe to check if
this_cpu(cont).len != 0 and if it is then additionally report:
"incomplete cont line".
in normal life no one _probably_ would see the change. well, I saw broken
backtraces on my laptop before, but didn't consider it to be important.
for the last two days the problem is not theoretical anymore on my side,
as now I'm looking at actual pr_cont()-related bug report [internal].
I'm truing to move the whole thing into 'don't use the cont lines' direction;
there is a mix of numerous pr_cont() calls & missing new-lines [internal code].
the kernel, unfortunately, IMHO, was too nice in the latter case, flushing
incomplete cont buffer with LOG_NEWLINE. so many of those `accidental' cont
lines went unnoticed.
this "don't use the cont lines" is actually a bit tricky and doesn't
look so solid. there is a need for cont lines, and the kernel has cont
printk... but it's sort of broken, so people have to kinda work-around
it. not the best example, but still (arch/arm64/kernel/traps.c)
static void __dump_instr(const char *lvl, struct pt_regs *regs)
{
unsigned long addr = instruction_pointer(regs);
char str[sizeof("00000000 ") * 5 + 2 + 1], *p = str;
int i;
for (i = -4; i < 1; i++) {
unsigned int val, bad;
bad = __get_user(val, &((u32 *)addr)[i]);
if (!bad)
p += sprintf(p, i == 0 ? "(%08x) " : "%08x ", val);
else {
p += sprintf(p, "bad PC value");
break;
}
}
printk("%sCode: %s\n", lvl, str);
}
which is fine, but getting a bit boring once you have more than,
let's say, 20 function that want to do cont output.
[..]
> > well, I do understand what you mean and agree with it, but I'm
> > afraid pr_cont() kinda matters after all and people *probably*
> > expect it to be SMP safe (I'm not entirely sure whether all of
> > those pr_cont() calls were put there with the idea that the
> > output can be messed up and quite hard to read).
>
> This was even worse before the cont lines buffer.
>
> Sigh, I do not feel experienced enough to decide about this.
> I wonder if this is rather theoretical problem or if there
> are many real complains about it.
>
> I feel that we might be trapped by a perfectionism.
well. may be, not entirely, but may be :)
-ss
Powered by blists - more mailing lists