Message-ID: <Ynkum8DeJIAtGi9y@cmpxchg.org>
Date:   Mon, 9 May 2022 11:09:15 -0400
From:   Johannes Weiner <hannes@...xchg.org>
To:     David Vernet <void@...ifault.com>
Cc:     Michal Koutný <mkoutny@...e.com>,
        akpm@...ux-foundation.org, tj@...nel.org, roman.gushchin@...ux.dev,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        cgroups@...r.kernel.org, mhocko@...nel.org, shakeelb@...gle.com,
        kernel-team@...com, Richard Palethorpe <rpalethorpe@...e.com>
Subject: Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in
 test_memcg_low()

On Fri, May 06, 2022 at 09:40:15AM -0700, David Vernet wrote:
> Sorry for the delayed reply, Michal. I've been at LSFMM this week.
> 
> On Fri, Apr 29, 2022 at 11:26:20AM +0200, Michal Koutný wrote:
> > I still think that the behavior when there's no protection left for the
> > memory.low == 0 child, there should be no memory.low events (not just
> > uncounted but not happening) and test should not accept this (even
> > though it's the current behavior).
>
> That's fair. I think part of the problem here is that in general, the
> memcontroller itself is quite heuristic, so it's tough to write tests that
> provide useful coverage while also being sufficiently flexible to avoid
> flakiness and over-prescribing expected behavior. In this case I think it's
> probably correct that the memory.low == 0 child shouldn't inherit
> protection from its parent under any circumstances due to its siblings
> overcommitting the parent's protection, but I also wonder if it's really
> necessary to enforce that. If you look at how much memory A/B/E gets at the
> end of the reclaim, it's still far less than 1MB (though should it be 0?).
> I'd be curious to hear what Johannes thinks.

We need to distinguish between what the siblings declare and what they
consume.

My understanding of the issue you're raising, Michal, is that
protected siblings start with current > low, then get reclaimed
slightly too much and end up with current < low. This results in a
tiny bit of float that then gets assigned to the low=0 sibling; when
that sibling gets reclaimed regardless, it sees a low event. Correct
me if I missed a detail or nuance here.

But unused float going to siblings is intentional. This is documented
in point 3 in the comment above effective_protection(): if you use
less than you're legitimately claiming, the float goes to your
siblings. So the problem doesn't seem to be with low accounting and
event generation, but rather it's simply overreclaim.
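To make the float distribution concrete, here is a rough sketch in Python. It is a simplified model of what effective_protection() does as I understand it, assuming memory_recursiveprot is enabled; it is not the kernel source. The cgroup names follow the test's A/B/{C,D,E} layout, but the parent protection and the "after overreclaim" usages (32, 16 and 1 MB) are made-up numbers for illustration only.

```python
def effective_protection(usage, parent_usage, setting,
                         parent_effective, siblings_protected):
    """Simplified model of effective protection; all values in MB.

    Assumes memory_recursiveprot is enabled. Not the kernel implementation.
    """
    # What this child can legitimately claim: no more than it uses.
    protected = min(usage, setting)

    # Overcommitted parent: scale each claim down proportionally.
    if siblings_protected > parent_effective:
        return protected * parent_effective // siblings_protected

    ep = protected

    # Unclaimed parent protection ("float") is handed out in proportion
    # to each child's usage above its own claim.
    if (parent_effective > siblings_protected
            and parent_usage > siblings_protected):
        unclaimed = parent_effective - siblings_protected
        ep += unclaimed * (usage - protected) // (parent_usage - siblings_protected)

    return ep


PARENT_ELOW = 50  # hypothetical effective protection of the parent, in MB


def distribute(children):
    """children maps name -> (usage, memory.low); returns name -> elow."""
    parent_usage = sum(usage for usage, _ in children.values())
    siblings_protected = sum(min(usage, low) for usage, low in children.values())
    return {
        name: effective_protection(usage, parent_usage, low,
                                   PARENT_ELOW, siblings_protected)
        for name, (usage, low) in children.items()
    }


# Before reclaim: siblings claim 50 + 25 = 75 MB against 50 MB, so the
# parent is overcommitted and the low=0 child gets nothing.
before = {"C": (50, 75), "D": (50, 25), "E": (50, 0)}
print(distribute(before))  # {'C': 33, 'D': 16, 'E': 0}

# After hypothetical overreclaim: C and D now sit below their share, so
# they claim only 32 + 16 = 48 MB and 2 MB of float goes to E.
after = {"C": (32, 75), "D": (16, 25), "E": (1, 0)}
print(distribute(after))   # {'C': 32, 'D': 16, 'E': 2}
```

In the second case E ends up with a nonzero effective protection even though its memory.low is 0, which is why reclaiming it anyway registers as a low event.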

It's conceivable to make reclaim more precise and then tighten up the
test. But right now, David's patch looks correct to me.