linux-kernel - Re: [PATCH] fs, mm: account filp and names caches to kmemcg

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20171025172924.i7du5wnkeihx2fgl@dhcp22.suse.cz>
Date:   Wed, 25 Oct 2017 19:29:24 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Johannes Weiner <hannes@...xchg.org>
Cc:     Greg Thelen <gthelen@...gle.com>,
        Shakeel Butt <shakeelb@...gle.com>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux MM <linux-mm@...ck.org>, linux-fsdevel@...r.kernel.org,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] fs, mm: account filp and names caches to kmemcg

On Wed 25-10-17 12:44:02, Johannes Weiner wrote:
> On Wed, Oct 25, 2017 at 04:12:21PM +0200, Michal Hocko wrote:
[...]

I yet have to digest the first path of the email but the remaining
just sounds we are not on the same page.

> > So how about we start with a BIG FAT WARNING for the failure case?
> > Something resembling warn_alloc for the failure case.
> >
> > ---
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 5d9323028870..3ba62c73eee5 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -1547,9 +1547,14 @@ static bool mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int order)
> >  	 * victim and then we have rely on mem_cgroup_oom_synchronize otherwise
> >  	 * we would fall back to the global oom killer in pagefault_out_of_memory
> >  	 */
> > -	if (!memcg->oom_kill_disable &&
> > -			mem_cgroup_out_of_memory(memcg, mask, order))
> > -		return true;
> > +	if (!memcg->oom_kill_disable) {
> > +		if (mem_cgroup_out_of_memory(memcg, mask, order))
> > +			return true;
> > +
> > +		WARN(!current->memcg_may_oom,
> > +				"Memory cgroup charge failed because of no reclaimable memory! "
> > +				"This looks like a misconfiguration or a kernel bug.");
> > +	}
> 
> That's crazy!
> 
> We shouldn't create interfaces that make it possible to accidentally
> livelock the kernel. Then warn about it and let it crash. That is a
> DOS-level lack of OS abstraction.
> 
> In such a situation, we should ignore oom_score_adj or ignore the hard
> limit. Even panic() would be better from a machine management point of
> view than leaving random tasks inside infinite loops.
> 
> Why is OOM-disabling a thing? Why isn't this simply a "kill everything
> else before you kill me"? It's crashing the kernel in trying to
> protect a userspace application. How is that not insane?

I really do not follow. What kind of livelock or crash are you talking
about. All this code does is that the charge request (which is not
explicitly GFP_NOFAIL) fails with ENOMEM if the oom killer is not able
to make a forward progress. That sounds like a safer option than failing
with ENOMEM unconditionally which is what we do currently. So the only
change I am really proposing is to keep retrying as long as the oom
killer makes a forward progress and ENOMEM otherwise.

I am also not trying to protect an userspace application. Quite
contrary, I would like the application gets ENOMEM when it should run
away from the constraint it runs within. I am protecting everything
outside of the hard limited memcg actually.

So what is that I am missing?
-- 
Michal Hocko
SUSE Labs