Date:   Wed, 20 Dec 2017 21:06:26 +0900
From:   Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To:     sergey.senozhatsky.work@...il.com, mhocko@...nel.org
Cc:     rostedt@...dmis.org, pmladek@...e.com, tj@...nel.org,
        sergey.senozhatsky@...il.com, jack@...e.cz,
        akpm@...ux-foundation.org, peterz@...radead.org, rjw@...ysocki.net,
        pavel@....cz, linux-kernel@...r.kernel.org
Subject: Re: [RFC][PATCHv6 00/12] printk: introduce printing kernel thread

Sergey Senozhatsky wrote:
> Steven said that this scenario is possible, but is not of any particular
> interest, because printk from IRQ or from any other atomic context is a
> bad thing, which should happen only when something wrong is going on in
> the system. but we are in OOM or has just returned from the OOM. which _is_
> "something bad going on", isn't it? can we instead say - OOM makes that
> printk from atomic context more likely? if it does happen, will there be
> non-atomic printk-s to take over printing from atomic CPUz? we can't tell.
> I don't know much about Tetsuo's test, but I assume that his VM does not
> have any networking activities during the test. I probably wouldn't be so
> surprised to see a bunch of printk-s from atomic contexts under OOM.

I'm using VMware Workstation Player, and my VM does not have any network
activity other than an ssh login session. Fortunately, VMware's serial console
(written to a file on the host) is reliable enough to allow a
console=ttyS0,115200n8 configuration. But there is virtualization software
whose serial console is so unreliable that I have to choose netconsole
instead. Also, there are enterprise servers where a very slow setting (e.g.
1200 or 9600 baud) has to be used for the serial console, because the serial
device is emulated using system management interrupts instead of real
hardware. Therefore, while it is true that any approach would survive in my
environment, it is dangerous to assume that any approach is safe for my
customer's enterprise servers.
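
To put the console speeds above in perspective, here is a rough userspace
sketch of the arithmetic. It assumes 8N1 framing (1 start + 8 data + 1 stop
bit, i.e. 10 bits on the wire per byte) and ignores any SMI emulation
overhead; the helper names are made up for illustration and are not kernel
functions.

```c
/* Hypothetical helpers: back-of-the-envelope serial console throughput. */

/* Bytes per second on an 8N1 line: 10 bits on the wire per byte. */
static unsigned long bytes_per_sec_8n1(unsigned long baud)
{
	return baud / 10;
}

/* Milliseconds (rounded down) needed to push nbytes out at the given baud. */
static unsigned long ms_to_send(unsigned long baud, unsigned long nbytes)
{
	return nbytes * 1000UL / bytes_per_sec_8n1(baud);
}
```

Under these assumptions a single 80-byte line costs about 7 ms at 115200
baud but about 0.67 seconds at 1200 baud, so a burst that is harmless in my
VM can keep a slow console busy for a long time.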

Thanks for summarizing the pointers. The safest way to avoid overflowing
printk() would be to use mutex_lock(&oom_lock) in __alloc_pages_may_oom() (and
yield the CPU to the thread flushing the logbuf), but so far we have not come
to an agreement. Fortunately, since warn_alloc() for reporting allocation
stalls was killed in 4.15-rc1, the risk of overflowing printk() under OOM has
been reduced a lot. But yes, since my VM has little network activity,
printk() flooding due to allocation failures might happen in other VMs.

Anyway, the rule "do not try to printk() faster than the kernel can write to
consoles" will remain no matter how printk() changes. I think that all
printk() users have to be careful not to waste CPU resources. MM's direct
reclaim + back off combination is a user that really loves to waste CPU
resources while someone is printk()ing.
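
As an illustration of what "not printing faster than the consoles can write"
can look like on the caller side, here is a minimal userspace sketch of a
fixed-window rate limiter. It is only loosely modeled on the interval/burst
idea behind printk_ratelimited(); the struct and function names are
hypothetical, the caller passes the current time explicitly, and this is not
the kernel's actual implementation.

```c
#include <stdbool.h>

/* Hypothetical state: allow at most 'burst' messages per 'interval_ms'. */
struct rl_state {
	unsigned long interval_ms;	/* length of one window */
	unsigned int burst;		/* messages allowed per window */
	unsigned int printed;		/* messages allowed in current window */
	unsigned int missed;		/* messages suppressed so far */
	unsigned long begin_ms;		/* start time of current window */
};

/* Return true if the caller may print at time now_ms, false to suppress. */
static bool rl_allow(struct rl_state *rs, unsigned long now_ms)
{
	/* Open a new window once the current one has expired. */
	if (now_ms - rs->begin_ms >= rs->interval_ms) {
		rs->begin_ms = now_ms;
		rs->printed = 0;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	/* Over budget: count the drop instead of printing. */
	rs->missed++;
	return false;
}
```

A caller would check rl_allow() before each message and could report the
missed count once a new window opens. Picking interval and burst so that the
allowed bytes stay below what the slowest attached console can drain is left
to the caller.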
