[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150611145547.GA3234@dhcp128.suse.cz>
Date: Thu, 11 Jun 2015 16:55:47 +0200
From: Petr Mladek <pmladek@...e.cz>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Steven Rostedt <rostedt@...dmis.org>, linux-kernel@...r.kernel.org,
jkosina@...e.cz, paulmck@...ux.vnet.ibm.com,
Ingo Molnar <mingo@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [RFC][PATCH] printk: Fixup the nmi printk mess
On Wed 2015-06-10 21:23:04, Peter Zijlstra wrote:
> Below a version which does x-cpu stuff to allow the
> trigger_all*_cpu_backtrace() initiator to flush buffers on behalf of
> other CPUs.
>
> Compile tested only.
The output from "echo l >/proc/sysrq-trigger" looks reasonable.
It does not mix output from different CPUs. This is expected
because of the @lock.
The other observation is that it prints CPUs in _random_ order:
28, 24, 25, 1, 26, 2, 27, 3, ...
The order is fine when I disable the irq_work.
It means that irq_works are usually faster than printk_nmi_flush() =>
printk_nmi_flush() is not that useful => all the complexity with
the three atomic variables (head, tail, read) did not bring
much win.
Anyway, I think that the current solution is racy and it cannot be fixed
easily, see below.
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index c099b082cd02..99bfc1e3f32a 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1821,13 +1821,200 @@ int vprintk_default(const char *fmt, va_list args)
> +static void __printk_nmi_flush(struct irq_work *work)
> +{
> + static raw_spinlock_t lock = __RAW_SPIN_LOCK_INITIALIZER(lock);
> + struct nmi_seq_buf *s = container_of(work, struct nmi_seq_buf, work);
> + int len, head, size, i, last_i;
> +
> +again:
> + /*
> + * vprintk_nmi() truncate
> + *
> + * [S] head [S] head
> + * wmb mb
> + * [S] tail [S] read
BTW, this is quite cryptic for me. Coffee did not help :-)
*
> + * therefore:
> + */
> + i = atomic_read(&s->read);
> + len = atomic_read(&s->tail); /* up to the tail is stable */
> + smp_rmb();
> + head = atomic_read(&s->head);
> +
> + /*
> + * We cannot truncate tail because it could overwrite a store from
> + * vprintk_nmi(), however vprintk_nmi() will always update tail to the
> + * correct value.
> + *
> + * Therefore if head < tail, we missed a truncate and should do so now.
> + */
> + if (head < len)
> + len = 0;
This is a bit confusing. It is a complicated way how to return on the next test.
If I get this correctly. This might happen only inside
_printk_nmi_flush() called on another CPU (from
printk_nmi_flush()) when it interferes with the queued
irq_work. The irq_work is faster and truncates the buffer.
So, the return is fine after all because the irq_work printed
everything.
> + if (len - i <= 0) /* nothing to do */
> + return;
> + /*
> + * 'Consume' this chunk, avoids concurrent callers printing the same
> + * stuff.
> + */
> + if (atomic_cmpxchg(&s->read, i, len) != i)
> + goto again;
I think that this is racy:
CPU0 CPU7
printk_nmi_flush()
__printk_nmi_flush(for CPU7)
i = atomic_read(&s->read); (100)
len = atomic_read(&s->tail); (200)
head = atomic_read(&s->head); (200)
if (atomic_cmpxchg(&s->read, i, len) != i)
we pass but we get interrupted
or rescheduled on preemptive kernel
another vprintk_nmi()
leaves: head(400), tail(400)
__printk_nmi_flush() in irq_work
it prints string between 200-400
truncate buffer: head(0), read(0)
another vprintk_nmi()
returns: head(150), tail(150)
print string between (100-200) =>
part of the new and part of old message
and modifies @head and @read a wrong way
I think that such races are hard to avoid without indexing the printed
messages. But it would make the approach too complicated.
I think that ordering CPUs is not worth it. I would go back to the
first solution, add the @lock there, and double check races with
seq_buf().
I stop here with commenting the code for now.
Best Regards,
Petr
PS: I had two cups of coffee and hope that my comments are smaller fiasco
than yesterday.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists