linux-kernel - Re: [RFC PATCH] memcg, oom: throttle dump_header for memcg ooms without eligible tasks

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20181015133524.GM18839@dhcp22.suse.cz>
Date:   Mon, 15 Oct 2018 15:35:24 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc:     Johannes Weiner <hannes@...xchg.org>, linux-mm@...ck.org,
        syzkaller-bugs@...glegroups.com, guro@...com,
        kirill.shutemov@...ux.intel.com, linux-kernel@...r.kernel.org,
        rientjes@...gle.com, yang.s@...baba-inc.com,
        Andrew Morton <akpm@...ux-foundation.org>,
        Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        Petr Mladek <pmladek@...e.com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [RFC PATCH] memcg, oom: throttle dump_header for memcg ooms
 without eligible tasks

On Mon 15-10-18 21:47:08, Tetsuo Handa wrote:
> On 2018/10/15 20:24, Michal Hocko wrote:
> > On Mon 15-10-18 19:57:35, Tetsuo Handa wrote:
> >> On 2018/10/15 17:19, Michal Hocko wrote:
> >>> As so many dozens of times before, I will point you to an incremental
> >>> nature of changes we really prefer in the mm land. We are also after a
> >>> simplicity which your proposal lacks in many aspects. You seem to ignore
> >>> that general approach and I have hard time to consider your NAK as a
> >>> relevant feedback. Going to an extreme and basing a complex solution on
> >>> it is not going to fly. No killable process should be a rare event which
> >>> requires a seriously misconfigured memcg to happen so wildly. If you can
> >>> trigger it with a normal user privileges then it would be a clear bug to
> >>> address rather than work around with printk throttling.
> >>>
> >>
> >> I can trigger 200+ times / 900+ lines / 69KB+ of needless OOM messages
> >> with a normal user privileges. This is a lot of needless noise/delay.
> > 
> > I am pretty sure you have understood the part of my message you have
> > chosen to not quote where I have said that the specific rate limitting
> > decisions can be changed based on reasonable configurations. There is
> > absolutely zero reason to NAK a natural decision to unify the throttling
> > and cook a per-memcg way for a very specific path instead.
> > 
> >> No killable process is not a rare event, even without root privileges.
> >>
> >> [root@...ecurity kumaneko]# time ./a.out
> >> Killed
> >>
> >> real    0m2.396s
> >> user    0m0.000s
> >> sys     0m2.970s
> >> [root@...ecurity ~]# dmesg | grep 'no killable' | wc -l
> >> 202
> >> [root@...ecurity ~]# dmesg | wc
> >>     942    7335   70716
> > 
> > OK, so this is 70kB worth of data pushed throug the console. Is this
> > really killing any machine?
> > 
> 
> Nobody can prove that it never kills some machine. This is just one example result of
> one example stress tried in my environment. Since I am secure programming man from security
> subsystem, I really hate your "Can you trigger it?" resistance. Since this is OOM path
> where nobody tests, starting from being prepared for the worst case keeps things simple.

There is simply no way to be generally safe this kind of situation. As
soon as your console is so slow that you cannot push the oom report
through there is only one single option left and that is to disable the
oom report altogether. And that might be a viable option. But fiddling
with per memcg limit is not going to fly. Just realize what will happen
if you have hundreds of different memcgs triggering this path around the
same time.

So can you start being reasonable and try to look at a wider picture
finally please?

-- 
Michal Hocko
SUSE Labs