[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200228031306.GO122464@google.com>
Date: Fri, 28 Feb 2020 12:13:06 +0900
From: Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
To: Lech Perczak <l.perczak@...lintechnologies.com>,
Petr Mladek <pmladek@...e.com>,
Steven Rostedt <rostedt@...dmis.org>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Theodore Ts'o <tytso@....edu>, Arnd Bergmann <arnd@...db.de>,
Krzysztof Drobiński
<k.drobinski@...lintechnologies.com>,
Pawel Lenkow <p.lenkow@...lintechnologies.com>,
John Ogness <john.ogness@...utronix.de>,
Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Subject: Re: Regression in v4.19.106 breaking waking up of readers of
/proc/kmsg and /dev/kmsg
Cc-ing Petr, Steven, John
https://lore.kernel.org/lkml/e9358218-98c9-2866-8f40-5955d093dc1b@camlintechnologies.com
On (20/02/27 14:08), Lech Perczak wrote:
> W dniu 27.02.2020 o 13:39, Lech Perczak pisze:
> > W dniu 27.02.2020 o 13:36, Greg Kroah-Hartman pisze:
> >> On Thu, Feb 27, 2020 at 11:09:49AM +0000, Lech Perczak wrote:
> >>> Hello,
> >>>
> >>> After upgrading kernel on our boards from v4.19.105 to v4.19.106 we found out that syslog fails to read the messages after ones read initially after opening /proc/kmsg just after booting.
> >>> I also found out, that output of 'dmesg --follow' also doesn't react on new printks appearing for whatever reason - to read new messages, reopening /proc/kmsg or /dev/kmsg was needed.
> >>> I bisected this down to commit 15341b1dd409749fa5625e4b632013b6ba81609b ("char/random: silence a lockdep splat with printk()"), and reverting it on top of v4.19.106 restored correct behaviour.
> >> That is really really odd.
> > Very odd it is indeed.
> >>> My test scenario for bisecting was:
> >>> 1. run 'dmesg --follow' as root
> >>> 2. run 'echo t > /proc/sysrq-trigger'
> >>> 3. If trace appears in dmesg output -> good, otherwise, bad. If trace doesn't appear in output of 'dmesg --follow', re-running it will show the trace.
> >>>
> >>> I ran my tests on Debian 10.3 with configuration based directly on one from 4.19.0-8-amd64 (4.19.98-1) in Qemu.
> >>> I could reproduce the same issue on several boards with x86 and ARMv7 CPUs alike, with 100% reproducibility.
> >>>
> >>> I haven't yet digged into why exactly this commit breaks notifications for readers of /proc/kmsg and /dev/kmsg, but as reverting it fixed the issue, I'm pretty sure this is the one. It is possible that the same happened in 5.4 line, bu I hadn't had a chance to test this as well yet.
> >> I can revert this, but it feels like there is something else going wrong
> >> here. Can you try the 5.4 tree to see if that too has your same
> >> problem?
> > Yes, I'll check it in a short while.
> I checked v5.4.22 just now and didn't observe the issue. Maybe this commit wasn't destined for 4.19.y due to intricacies of locking inside printk() :-)
>
> From my side, there is no need to rush with the revert, as I can do this locally and drop the revert with next rebase-to-upstream, which we do very often.
> OTOH, the issue is likely to affect a lot of users, especially ones using distros tracking this branch like Debian 10 mentioned earlier,
> once they pick it up the change, as the kernel log content recorded by syslog will be affected, and 'dmesg --follow' behaviour will be quite surprising.
This is very-very odd... Hmm.
Just out of curiosity, what happens if you comment out that
printk() entirely?
printk_deferred() should not affect the PRINTK_PENDING_WAKEUP path.
Either we never queue wakeup irq_work(), e.g. because waitqueue_active()
never lets us to do so or because `(curr_log_seq != log_next_seq)' is
always zero:
void wake_up_klogd(void)
{
preempt_disable();
if (waitqueue_active(&log_wait)) {
this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
}
preempt_enable();
}
int vprintk_emit()
{
...
pending_output = (curr_log_seq != log_next_seq);
...
if (pending_output)
wake_up_klogd()
}
Or we do wakeup, but then either `syslog_seq != log_next_seq' or
`user->seq != log_next_seq' fail.
Lech, any chance you can trace what's happening in the system?
-ss
Powered by blists - more mailing lists