linux-kernel - Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup


Date:   Wed, 10 Jan 2018 18:40:40 +0000 (UTC)
From:   Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:     Tejun Heo <tj@...nel.org>
Cc:     Petr Mladek <pmladek@...e.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        rostedt <rostedt@...dmis.org>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        linux-mm <linux-mm@...ck.org>,
        Cong Wang <xiyou.wangcong@...il.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Mel Gorman <mgorman@...e.de>, Michal Hocko <mhocko@...nel.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Peter Zijlstra <peterz@...radead.org>, Jan Kara <jack@...e.cz>,
        Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        rostedt@...e.goodmis.org, Byungchul Park <byungchul.park@....com>,
        Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        Pavel Machek <pavel@....cz>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

----- On Jan 10, 2018, at 12:02 PM, Tejun Heo tj@...nel.org wrote:

> Hello, Linus, Andrew.
> 
> On Wed, Jan 10, 2018 at 05:29:00PM +0100, Petr Mladek wrote:
>> Where is the acceptable compromise? I am not sure. So far, the most
>> forceful people (Linus) did not see softlockups as a big problem.
>> They rather wanted to see the messages.
> 
> Can you please chime in?  Would you be opposed to offloading to an
> independent context even if it were only for cases where we were
> already punting?  The thing with the current offloading is that we
> don't know who we're offloading to.  It might end up in faster or
> slower context, or more importantly a dangerous one.
> 
> The particular case that we've been seeing regularly in the fleet was
> the following scenario.
> 
> 1. Console is IPMI emulated serial console.  Super slow.  Also
>   netconsole is in use.
> 2. System runs out of memory, OOM triggers.
> 3. OOM handler is printing out OOM debug info.
> 4. While trying to emit the messages for netconsole, the network stack
>   / driver tries to allocate memory and then fails, which in turn
>   triggers allocation failure or other warning messages.  printk was
>   already flushing, so the messages are queued on the ring.
> 5. OOM handler keeps flushing but 4 repeats and the queue is never
>   shrinking.  Because OOM handler is trapped in printk flushing, it
>   never manages to free memory and no one else can enter OOM path
>   either, so the system is trapped in this state.

Hi Tejun,

There appear to be two problems at hand. One is making sure a console
buffer owner only flushes a bounded amount of data. The patches from
Steven & Co. seem to address this.
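
To make "bounded" concrete, here is a minimal userspace sketch in C. It
is not how the patch set under discussion works (per its subject line,
that series uses console owner and waiter logic); the fixed budget
below is only a stand-in to show a flusher that returns after a limited
amount of work. The names msg_queue and flush_bounded are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical queue of pending messages; not the kernel's log buffer. */
struct msg_queue {
    size_t pending;   /* messages waiting to be written to the console */
    size_t flushed;   /* messages already written */
};

/*
 * Flush at most `budget` messages, then return so that the caller is not
 * stuck flushing for as long as new messages keep arriving.
 * Returns the number of messages still pending.
 */
static size_t flush_bounded(struct msg_queue *q, size_t budget)
{
    assert(q != NULL);
    while (q->pending > 0 && budget > 0) {
        /* A real console driver would emit one record here. */
        q->pending--;
        q->flushed++;
        budget--;
    }
    return q->pending;
}
```

With ten pending messages and a budget of four, the call returns with
six messages still pending, and it is up to some other context to pick
up the remainder.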

The second problem you describe here appears to be related to the
side-effects of console drivers, namely netconsole in this scenario.
Its use of the network stack can allocate memory, which can fail, and
therefore trigger more printk calls. One possible solution that comes
to mind is having a way to detect that code is directly called from a
printk driver, and making sure that error handling in those situations
is _not_ done by pushing more printk messages to that printk driver.
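
As a rough illustration of that idea, the following userspace sketch in
C marks the region where a console driver is running and drops any
message logged from inside it. Everything here is an assumption made
for illustration: in_console_driver, log_message and console_write are
hypothetical names, and a real kernel would need per-CPU or per-context
state rather than a single global flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state; a real kernel would track this per CPU/context. */
static bool in_console_driver;  /* true while a console driver runs */
static size_t queued;           /* messages accepted into the queue */
static size_t suppressed;       /* messages dropped to avoid feedback */

/* Stand-in for printk: refuse to queue when called from a driver. */
static void log_message(const char *text)
{
    assert(text != NULL);
    if (in_console_driver) {
        suppressed++;           /* count it, but do not feed the queue */
        return;
    }
    queued++;
}

/* Stand-in for a console driver whose allocation may fail. */
static void console_write(bool alloc_fails)
{
    in_console_driver = true;
    if (alloc_fails)
        log_message("allocation failure");  /* dropped by the guard */
    in_console_driver = false;
}
```

A failing allocation inside console_write() then increments the
suppressed counter instead of adding to the queue that the same driver
is trying to drain.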

The problem you describe seems to be _another_ issue of the current
printk implementation which Steven's approach does not address, but
I don't think that Steven's changes prevent doing further improvements
on the netconsole driver front.

I also don't see what's wrong with the incremental approach proposed by
Steven. Even though it does not fix your console driver problem, his
patchset appears to address some real-world latency issues.

Thanks,

Mathieu

> 
> The system usually never recovers in time once this sort of condition
> hits and the following was the patch that I suggested which only punts
> when messages are already being punted and we can easily make it less
> punty by delaying the punting by N messages.
> 
> http://lkml.kernel.org/r/20171102135258.GO3252168@devbig577.frc2.facebook.com
> 
> We definitely can fix the above described case by e.g. preventing
> printk flushing task from queueing more messages or whatever, but it
> just seems really dumb for the system to die from things like this in
> general and it doesn't really take all that much to trigger the
> condition.
> 
> Thanks.
> 
> --
> tejun

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com