linux-kernel - Re: [RFC][PATCHv6 00/12] printk: introduce printing kernel thread


Message-ID: <20171218210356.71a1f60e@vmware.local.home>
Date:   Mon, 18 Dec 2017 21:03:56 -0500
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Cc:     Petr Mladek <pmladek@...e.com>, Tejun Heo <tj@...nel.org>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Jan Kara <jack@...e.cz>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Rafael Wysocki <rjw@...ysocki.net>,
        Pavel Machek <pavel@....cz>,
        Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC][PATCHv6 00/12] printk: introduce printing kernel thread

On Tue, 19 Dec 2017 10:24:55 +0900
Sergey Senozhatsky <sergey.senozhatsky.work@...il.com> wrote:

> On (12/18/17 20:08), Steven Rostedt wrote:
> > > ... do you guys read my emails? which part of the traces I have provided
> > > suggests that there is any improvement?  
> > 
> > The traces I've seen from you were from non-realistic scenarios.
> > But I have hit issues with printk()s happening that cause one CPU to do all
> > the work, where my patch would fix that. Those are the scenarios I'm
> > talking about.  
> 
> any hints about what makes your scenario more realistic than mine?
> to begin with, what was the scenario?

It was a while ago when I hit it. I think it was an OOM issue. And it
wasn't contrived. It happened on a production system.

> 
> [..]
> 
> > But I have hit issues with printk()s happening that cause one CPU to do all
> > the work, where my patch would fix that. Those are the scenarios I'm
> > talking about.  
> 
> and this is exactly what I'm still observing. i_do_printks-1992 stops
> printing, while console_sem is owned by another task. Since log_store()
> is much faster than call_console_drivers() AND the console_sem owner is
> getting preempted for an unknown period of time, we end up having pending
> messages in logbuf... and it's kworker/0:1-135 that prints them all.
> 
>    systemd-udevd-671   [003] d..3    66.334866: offloading: set console_owner
>      kworker/0:1-135   [000] d..2    66.335999: offloading: vprintk_emit()->trylock FAIL  will spin? :1
>     i_do_printks-1992  [002] d..2    66.345474: offloading: vprintk_emit()->trylock FAIL  will spin? :0    x 1100
>    ...
>    systemd-udevd-671   [003] d..3    66.345917: offloading: clear console_owner  waiter != NULL :1

And kworker will still be bounded in what it can print. Yes, it may end
up being the entire buffer, but that should not take longer than a
watchdog timeout.

If that proves to be an issue in the real world, then we could simply
wake up an offloaded thread if the current owner does more than one
iteration (that is, prints more than what it wrote itself). Then when
the thread wakes up, it simply does a printk(), and it will take over
through the waiter logic.

But that is only if it still appears to be an issue.
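
To make the wake-up idea concrete, here is a rough user-space toy model
of it. This is only a sketch, not kernel code: simulate_flush(), struct
handoff_result and the wake_latency parameter are made-up names for
illustration, and it ignores the extra message the helper's own printk()
would add. It counts how many messages the original console owner ends
up printing before the woken thread shows up as the waiter and takes
over.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
struct handoff_result {
	size_t printed_by_owner;   /* messages flushed by the original owner */
	size_t printed_by_helper;  /* messages flushed after the handoff */
	bool   helper_woken;       /* did the owner wake the offload thread? */
};

/*
 * own_msgs:     messages the owner itself logged
 * pending:      total messages in the buffer when flushing starts
 * wake_latency: messages the owner still prints before the woken
 *               thread reaches printk() and spins as the waiter
 */
struct handoff_result simulate_flush(size_t own_msgs, size_t pending,
				     size_t wake_latency)
{
	struct handoff_result r = { 0, 0, false };
	size_t until_waiter = 0;

	while (pending > 0) {
		/* Owner has printed its own output and there is still more. */
		if (!r.helper_woken && r.printed_by_owner >= own_msgs) {
			r.helper_woken = true;
			until_waiter = wake_latency;
		}

		/* Helper is now the waiter: hand over, it prints the rest. */
		if (r.helper_woken && until_waiter == 0) {
			r.printed_by_helper = pending;
			pending = 0;
			break;
		}

		/* Owner prints one more message. */
		r.printed_by_owner++;
		pending--;
		if (r.helper_woken) {
			assert(until_waiter > 0);
			until_waiter--;
		}
	}
	return r;
}
```

In this model the owner prints at most own_msgs + wake_latency messages,
no matter how many are pending, and the helper is never woken when the
owner only has its own output to flush.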

-- Steve