lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 5 May 2020 11:27:12 -0400
From:   Johannes Weiner <hannes@...xchg.org>
To:     Shakeel Butt <shakeelb@...gle.com>
Cc:     Michal Hocko <mhocko@...nel.org>, Roman Gushchin <guro@...com>,
        Greg Thelen <gthelen@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux MM <linux-mm@...ck.org>,
        Cgroups <cgroups@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

On Mon, May 04, 2020 at 12:23:51PM -0700, Shakeel Butt wrote:
> On Mon, May 4, 2020 at 9:06 AM Michal Hocko <mhocko@...nel.org> wrote:
> > I really hate to repeat myself but this is no different from a regular
> > oom situation.
> 
> Conceptually yes there is no difference but there is no *divine
> restriction* to not make a difference if there is a real world
> use-case which would benefit from it.

I would wholeheartedly agree with this in general.

However, we're talking about the very semantics that set memory.max
apart from memory.high: triggering OOM kills to enforce the limit.

> > when the kernel cannot act and mentions that along with the
> > oom report so that whoever consumes that information can debug or act on
> > that fact.
> >
> > Silencing the oom report is simply removing a potentially useful
> > aid to debug further a potential problem.
> 
> *Potentially* useful for debugging versus actually beneficial for
> "sweep before tear down" use-case. Also I am not saying to make "no
> dumps for memory.max when no eligible tasks" a set in stone rule. We
> can always reevaluate when such information will actually be useful.
> 
> Johannes/Andrew, what's your opinion?

I still think that if you want to sweep without triggering OOMs,
memory.high has the matching semantics.

As you pointed out, it doesn't work well for foreign charges, but that
is more of a limitation in the implementation than in the semantics:

	/*
	 * If the hierarchy is above the normal consumption range, schedule
	 * reclaim on returning to userland.  We can perform reclaim here
	 * if __GFP_RECLAIM but let's always punt for simplicity and so that
	 * GFP_KERNEL can consistently be used during reclaim.  @memcg is
	 * not recorded as it most likely matches current's and won't
	 * change in the meantime.  As high limit is checked again before
	 * reclaim, the cost of mismatch is negligible.
	 */

Wouldn't it be more useful to fix that instead? It shouldn't be much
of a code change to do sync reclaim in try_charge().

Then you could express all things that you asked for without changing
any user-visible semantics: sweep an empty cgroup as well as possible,
do not oom on remaining charges that continue to be used by processes
outside the cgroup, do trigger oom on new foreign charges appearing
due to a misconfiguration.

	echo 0 > memory.high
	cat memory.current > memory.max

Would this work for you?

Powered by blists - more mailing lists