linux-kernel - Re: [PATCH v2 3/3] mm: memcontrol: recursive memory.low protection


Message-ID: <20200213175813.GA216470@cmpxchg.org>
Date:   Thu, 13 Feb 2020 12:58:13 -0500
From:   Johannes Weiner <hannes@...xchg.org>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Roman Gushchin <guro@...com>, Tejun Heo <tj@...nel.org>,
        linux-mm@...ck.org, cgroups@...r.kernel.org,
        linux-kernel@...r.kernel.org, kernel-team@...com
Subject: Re: [PATCH v2 3/3] mm: memcontrol: recursive memory.low protection

On Thu, Feb 13, 2020 at 12:41:36PM -0500, Johannes Weiner wrote:
> On Thu, Feb 13, 2020 at 04:46:27PM +0100, Michal Hocko wrote:
> > On Thu 13-02-20 08:23:17, Johannes Weiner wrote:
> > > On Thu, Feb 13, 2020 at 08:40:49AM +0100, Michal Hocko wrote:
> > > > On Wed 12-02-20 12:08:26, Johannes Weiner wrote:
> > > > > On Tue, Feb 11, 2020 at 05:47:53PM +0100, Michal Hocko wrote:
> > > > > > Unless I am missing something then I am afraid it doesn't. Say you have a
> > > > > > default systemd cgroup deployment (aka deeper cgroup hierarchy with
> > > > > > slices and scopes) and now you want to grant a reclaim protection on a
> > > > > > leaf cgroup (or even a whole slice that is not really important). All the
> > > > > > hierarchy up the tree has the protection set to 0 by default, right? You
> > > > > > simply cannot get that protection. You would need to configure the
> > > > > > protection up the hierarchy and that is really cumbersome.
> > > > > 
> > > > > Okay, I think I know what you mean. Let's say you have a tree like
> > > > > this:
> > > > > 
> > > > >                           A
> > > > >                          / \
> > > > >                         B1  B2
> > > > >                        / \   \
> > > > >                       C1 C2   C3

> > > > So let's see how that works in practice, say a multi workload setup
> > > > with a complex/deep cgroup hierarchy (e.g. your above example). No
> > > > delegation point this time.
> > > > 
> > > > C1 asks for low=1G while using 500M, C3 low=100M using 80M.  B1 and
> > > > B2 are completely independent workloads and the same applies to C2 which
> > > > doesn't ask for any protection at all. C2 uses 100M. Now the admin has
> > > > to propagate protection upwards so B1 low=1G, B2 low=100M and A low=1G,
> > > > right? Let's say we have a global reclaim due to external pressure that
> > > > originates from outside of the A hierarchy (it is not overcommitted on the
> > > > protection).
> > > > 
> > > > Unless I miss something C2 would get a protection even though nobody
> > > > asked for it.
> > > 
> > > Good observation, but I think you spotted an unintentional side effect
> > > of how I implemented the "floating protection" calculation rather than
> > > a design problem.
> > > 
> > > My patch still allows explicit downward propagation. So if B1 sets up
> > > 1G, and C1 explicitly claims those 1G (low>=1G, usage>=1G), C2 does
> > > NOT get any protection. There is no "floating" protection left in B1
> > > that could get to C2.
> > 
> > Yeah, the saturated protection works reasonably AFAICS.
> 
> Hm, Tejun raises a good point though: even if you could direct memory
> protection down to one targeted leaf, you can't do the same with IO or
> CPU. Those follow non-conserving weight distribution, and whatever you

                    "work-conserving", obviously.

> allocate to a certain level is available at that level - if one child
> doesn't consume it, the other children can.
> 
> And we know that controlling memory without controlling IO doesn't
> really work in practice. The sibling with less memory allowance will
> just page more.
> 
> So the question becomes: is this even a legit use case? If every other
> resource is already distributed level by level, does it
> buy us anything to make memory work differently?
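
For illustration only, here is a toy model of the B1 subtree from the
example quoted above (C1 low=1G, C2 with no protection, B1 low=1G). It is
a sketch under a stated assumption, not code from the patch: protection
that the children do not claim themselves is assumed to be split in
proportion to each child's unprotected usage. The function name
effective_low and the dictionary layout are hypothetical, and overcommit
(claims exceeding the parent's protection) is not modelled.

```python
# Toy model of "floating" memory.low protection among sibling cgroups.
# Assumption (not taken from the patch): protection the siblings do not
# claim is split in proportion to each sibling's unprotected usage.

M = 1 << 20
G = 1 << 30


def effective_low(parent_elow, children):
    """children maps name -> (low, usage) in bytes; returns name -> elow."""
    # What each child claims explicitly: its low setting, capped by usage.
    claimed = {n: min(low, usage) for n, (low, usage) in children.items()}
    total_claimed = sum(claimed.values())
    total_usage = sum(usage for _, usage in children.values())

    # Parent protection left over after explicit claims "floats".
    floating = max(parent_elow - total_claimed, 0)
    unprotected = total_usage - total_claimed

    result = {}
    for name, (low, usage) in children.items():
        share = 0
        if floating and unprotected > 0:
            share = floating * (usage - claimed[name]) // unprotected
        # Protection beyond current usage has no effect, so cap it.
        result[name] = min(claimed[name] + share, usage)
    return result


# Case 1: C1 asks for 1G but uses only 500M, so 524M of B1's 1G floats.
unsaturated = effective_low(1 * G, {"C1": (1 * G, 500 * M), "C2": (0, 100 * M)})

# Case 2: C1 claims all of B1's 1G (low>=1G, usage>=1G); nothing floats.
saturated = effective_low(1 * G, {"C1": (1 * G, 1 * G), "C2": (0, 100 * M)})

print(unsaturated["C2"] // M, saturated["C2"] // M)
```

Under this assumption C2 ends up with all 100M of its usage protected in
the first case although it never asked for any, and with no protection at
all in the saturated case.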