Date: Thu, 9 Nov 2017 21:07:15 +0900
From: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To: mhocko@...nel.org
Cc: rostedt@...dmis.org, linux-kernel@...r.kernel.org,
    akpm@...ux-foundation.org, linux-mm@...ck.org,
    xiyou.wangcong@...il.com, dave.hansen@...el.com, hannes@...xchg.org,
    mgorman@...e.de, pmladek@...e.com, sergey.senozhatsky@...il.com,
    vbabka@...e.cz, peterz@...radead.org, torvalds@...ux-foundation.org,
    jack@...e.cz, mathieu.desnoyers@...icios.com,
    rostedt@...e.goodmis.org
Subject: Re: [PATCH v4] printk: Add console owner and waiter logic to loadbalance console writes

Michal Hocko wrote:
> On Thu 09-11-17 20:03:30, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> > > On Thu 09-11-17 19:22:58, Tetsuo Handa wrote:
> > > > Michal Hocko wrote:
> > > > > Hi,
> > > > > assuming that this passes warn stall torturing by Tetsuo, do you think
> > > > > we can drop http://lkml.kernel.org/r/1509017339-4802-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
> > > > > from the mmotm tree?
> > > >
> > > > I don't think so.
> > > >
> > > > The rule that "do not try to printk() faster than the kernel can write to
> > > > consoles" will remain no matter how printk() changes. Unless asynchronous
> > > > approach like https://lwn.net/Articles/723447/ is used, I think we can't
> > > > obtain useful information.
> > >
> > > Does that mean that the patch doesn't pass your test?
> > >
> >
> > Test is irrelevant. See the changelog.
> >
> > Synchronous approach is prone to unexpected results (e.g. too late [1], too
> > frequent [2], overlooked [3]). As far as I know, warn_alloc() never helped
> > with providing information other than "something is going wrong".
> > I want to consider asynchronous approach which can obtain information
> > during stalls with possibly relevant threads (e.g. the owner of oom_lock
> > and kswapd-like threads) and serve as a trigger for actions (e.g. turn
> > on/off tracepoints, ask libvirt daemon to take a memory dump of stalling
> > KVM guest for diagnostic purpose).
> >
> > [1] https://bugzilla.kernel.org/show_bug.cgi?id=192981
> > [2] http://lkml.kernel.org/r/CAM_iQpWuPVGc2ky8M-9yukECtS+zKjiDasNymX7rMcBjBFyM_A@mail.gmail.com
> > [3] commit db73ee0d46379922 ("mm, vmscan: do not loop on too_many_isolated for ever")
>
> So you want to keep the warning out of the kernel even though the
> problems you are seeing are gone just to allow for an async approach
> nobody is very fond of? That is a very dubious approach.

You are assuming that there are no more bugs which will be caught by an
async approach. That is seriously wrong. [3] is just an example.

http://lkml.kernel.org/r/CABXGCsOzaorL0wKZFYRFKR7RSnUL+7=vspE36sFTENoimsJGSw@mail.gmail.com
is an example where async approach will help. For example, turn various
tracepoints on if stall lasted for 5 seconds and then turn them off when
stall disappeared. It is very unfortunate that we still do not have such
trigger.
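
The trigger described in the last paragraph (enable tracepoints once a stall has lasted 5 seconds, disable them when the stall disappears) could be sketched roughly as below. This is only a userspace illustration of the state machine, not kernel code and not anything from the proposed patches: the StallTrigger class, its poll() method and the two callbacks are hypothetical names, and how "stalled" would actually be detected is left out entirely.

```python
class StallTrigger:
    """Hypothetical helper: run callbacks when a stall crosses a threshold."""

    def __init__(self, on_enable, on_disable, threshold=5.0):
        self.on_enable = on_enable      # e.g. would turn tracepoints on
        self.on_disable = on_disable    # e.g. would turn tracepoints off
        self.threshold = threshold      # seconds a stall must last (5 in the mail)
        self.stall_start = None         # time the current stall began, if any
        self.active = False             # True while the "on" action is in effect

    def poll(self, stalled, now):
        """Feed one observation: is something stalled at time `now`?"""
        if stalled:
            if self.stall_start is None:
                self.stall_start = now
            if not self.active and now - self.stall_start >= self.threshold:
                self.active = True
                self.on_enable()
        else:
            self.stall_start = None
            if self.active:
                self.active = False
                self.on_disable()


# Simulated observations as (time in seconds, stalled?) pairs.
events = []
trigger = StallTrigger(lambda: events.append("on"),
                       lambda: events.append("off"))
for t, stalled in [(0, True), (3, True), (5, True), (6, True),
                   (7, False), (8, False)]:
    trigger.poll(stalled, now=t)
```

In this sketch a stall shorter than the threshold never fires either callback, and each callback fires at most once per stall, so a long stall does not repeatedly re-enable tracing.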