Message-ID: <CAGFReeOKwvsUb5hMMXLY3eTdqMDOae9mNjfp5BXx+BYfn5vMiw@mail.gmail.com>
Date: Thu, 28 Jan 2021 17:46:03 -0800
From: "J. Avila" <elavila@...gle.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Peter Zijlstra <peterz@...radead.org>,
Petr Mladek <pmladek@...e.com>,
Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Andrea Parri <parri.andrea@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Paul McKenney <paulmck@...nel.org>,
Saravana Kannan <saravanak@...gle.com>,
kexec@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: Issue in dmesg time with lockless ring buffer
Hello John,
I’ve done some additional digging on my end. I tested using a 5.10.11
kernel and observed the following:
1) With the default of CONFIG_LOG_BUF_SHIFT=17, I was not able to reproduce
the issue.
2) With CONFIG_LOG_BUF_SHIFT=20, I was able to reproduce the behavior
mentioned before.
3) With (2) + reverting up to and including 896fbe20b4e2 (printk: use the
lockless ringbuffer), I saw short dmesg times again.
It seems that this issue may only exist with a sufficiently large log
buffer. Although 1MB is a relatively uncommon size for Linux kernel log
buffers, this still indicates a potential issue in the code; do you think
it's worth investigating?
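
For anyone who wants to compare numbers, here is a rough sketch of how the
read time could be measured without going through dmesg itself. It is not
what dmesg does internally; it only assumes the documented /dev/kmsg
behaviour that a non-blocking read returns one record per call, fails with
EAGAIN once no records are left, and fails with EPIPE when records were
overwritten. The names log_buf_bytes and time_drain are made up for this
example.

```python
import os
import time
from typing import Tuple


def log_buf_bytes(shift: int) -> int:
    """Log buffer size in bytes for CONFIG_LOG_BUF_SHIFT=shift (2**shift)."""
    return 1 << shift


def time_drain(path: str = "/dev/kmsg", chunk: int = 8192) -> Tuple[int, float]:
    """Read everything currently available from path.

    Returns (bytes_read, elapsed_seconds). Hypothetical helper, not part
    of any existing tool.
    """
    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    total = 0
    start = time.monotonic()
    try:
        while True:
            try:
                data = os.read(fd, chunk)
            except BlockingIOError:
                # EAGAIN: no more records are available right now.
                break
            except BrokenPipeError:
                # EPIPE: records were overwritten under us; the next
                # read continues with the oldest record still present.
                continue
            if not data:
                # End of file, which only happens for regular files.
                break
            total += len(data)
    finally:
        os.close(fd)
    return total, time.monotonic() - start
```

log_buf_bytes(17) gives 131072 (128 KiB) and log_buf_bytes(20) gives
1048576 (1 MiB), which are the two configurations tested above. Calling
time_drain() in a loop while another shell writes to /dev/kmsg should show
whether the elapsed time grows as the buffer fills; it will likely need
root to open /dev/kmsg.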
Thanks,
Avila
On Mon, Jan 25, 2021 at 4:00 PM J. Avila <elavila@...gle.com> wrote:
>
> Hello,
>
> This dmesg uses /dev/kmsg; we've verified that we don't see this long
> dmesg time when reading from syslog (via dmesg -S).
>
> We've also tried testing this with logging daemons disabled as well as
> within initrd - both result in similar behavior.
>
> If it's relevant, this was done on a toybox shell.
>
> Thanks,
>
> Avila
>
> On Mon, Jan 25, 2021 at 5:32 AM John Ogness <john.ogness@...utronix.de> wrote:
> >
> > On 2021-01-22, "J. Avila" <elavila@...gle.com> wrote:
> > > When doing some internal testing on a 5.10.4 kernel, we found that the
> > > time taken for dmesg seemed to increase from the order of milliseconds
> > > to the order of seconds when the dmesg size approached the ~1.2MB
> > > limit. After doing some digging, we found that by reverting all of the
> > > patches in printk/ up to and including
> > > 896fbe20b4e2333fb55cc9b9b783ebcc49eee7c7 ("use the lockless
> > > ringbuffer"), we were able to once more see normal dmesg times.
> > >
> > > This kernel had no meaningful diffs in the printk/ dir when compared
> > > to Linus' tree. This behavior was consistently reproducible using the
> > > following steps:
> > >
> > > 1) In one shell, run "time dmesg > /dev/null"
> > > 2) In another, constantly write to /dev/kmsg
> > >
> > > Within ~5 minutes, we saw that dmesg times increased to 1 second, only
> > > increasing further from there. Is this a known issue?
> >
> > The last couple days I have tried to reproduce this issue with no
> > success.
> >
> > Is your dmesg using /dev/kmsg or syslog() to read the buffer?
> >
> > Are there any syslog daemons or systemd running? Perhaps you can run
> > your test within an initrd to see if this effect is still visible?
> >
> > John Ogness