linux-kernel - Re: [PATCH] mm: vmpressure: don't count userspace-induced reclaim as memory pressure

Message-ID: <CAJD7tkYemNQqu_O2nYG3cqxPWGELvc6Lh5i+KKNCtv6cgSPmdA@mail.gmail.com>
Date:   Mon, 27 Jun 2022 10:03:53 -0700
From:   Yosry Ahmed <yosryahmed@...gle.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Shakeel Butt <shakeelb@...gle.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Muchun Song <songmuchun@...edance.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Matthew Wilcox <willy@...radead.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        David Hildenbrand <david@...hat.com>,
        Miaohe Lin <linmiaohe@...wei.com>, NeilBrown <neilb@...e.de>,
        Alistair Popple <apopple@...dia.com>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Peter Xu <peterx@...hat.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Cgroups <cgroups@...r.kernel.org>, Linux-MM <linux-mm@...ck.org>
Subject: Re: [PATCH] mm: vmpressure: don't count userspace-induced reclaim as
 memory pressure

On Mon, Jun 27, 2022 at 5:31 AM Michal Hocko <mhocko@...e.com> wrote:
>
> On Mon 27-06-22 02:39:49, Yosry Ahmed wrote:
> [...]
> > (a) Do not count vmpressure for mem_cgroup_resize_max() and
> > mem_cgroup_force_empty() in v1.
>
> yes, unless you have a very good reason to change that. E.g. this has
> been buggy and we have finally understood that. But I do not see any
> indications so far.

I don't have any bug reports. It makes sense that users would not expect
vmpressure notifications when they resize the limits below the current
usage: reclaim is expected in that case, so the notifications are
redundant, and a different userspace thread may incorrectly interpret
them as the cgroup being under memory pressure. But I get your point
that what the user considers memory pressure could differ, and is
probably already defined by the current behavior anyway, whether it
makes sense or not.
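
For context, here is a minimal sketch of how a userspace listener
might register for these cgroup v1 notifications. It assumes the
documented v1 interface: an eventfd is armed by writing
"<event_fd> <fd of memory.pressure_level> <level>" to
cgroup.event_control. The struct and helper names are hypothetical,
and error handling is reduced to the bare minimum.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical holder for the descriptors a listener keeps open. */
struct pressure_listener {
	int event_fd;     /* becomes readable when a notification fires */
	int pressure_fd;  /* open fd of memory.pressure_level */
};

/* Build the "<event_fd> <pressure_fd> <level>" registration line. */
static int format_event_control(char *buf, size_t len, int event_fd,
				int pressure_fd, const char *level)
{
	int n = snprintf(buf, len, "%d %d %s", event_fd, pressure_fd, level);

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Returns 0 on success, -1 on failure. level: "low", "medium", "critical". */
static int register_pressure_listener(const char *cgroup_dir, const char *level,
				      struct pressure_listener *out)
{
	char path[4096], line[64];
	int n, cfd;

	n = snprintf(path, sizeof(path), "%s/memory.pressure_level", cgroup_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	out->pressure_fd = open(path, O_RDONLY);
	if (out->pressure_fd < 0)
		return -1;

	out->event_fd = eventfd(0, EFD_CLOEXEC);
	if (out->event_fd < 0)
		goto err_pressure;

	n = format_event_control(line, sizeof(line), out->event_fd,
				 out->pressure_fd, level);
	if (n < 0)
		goto err_event;

	n = snprintf(path, sizeof(path), "%s/cgroup.event_control", cgroup_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		goto err_event;
	cfd = open(path, O_WRONLY);
	if (cfd < 0)
		goto err_event;
	n = (int)write(cfd, line, (size_t)format_event_control(line, sizeof(line),
			out->event_fd, out->pressure_fd, level));
	close(cfd);
	if (n < 0)
		goto err_event;
	return 0;

err_event:
	close(out->event_fd);
err_pressure:
	close(out->pressure_fd);
	return -1;
}
```

A listener would then block in read() on event_fd and treat each
wakeup as one notification. It cannot tell from the event alone
whether the reclaim behind it was triggered by a limit resize or by
an allocation, which is the ambiguity discussed above.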

I can also see some userspace applications depending on this behavior
in some way, either by handling that limit resize notification in a
certain way or deliberately dropping it. Either way, making this
change could throw them off. I don't expect any userspace applications
to crash of course (because there are cases where they won't receive
notifications, e.g. scanned < vmpressure_win), but perhaps it's not
worth even the risk of misguiding them.
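
To make the "scanned < vmpressure_win" case concrete, here is a small
userspace model of the accounting, written as a sketch rather than a
copy of the kernel code. The constants (a window of 512 scanned pages,
thresholds of 60 and 95) and the ratio formula follow mm/vmpressure.c
as I recall it and should be treated as assumptions; the type and
function names are hypothetical.

```c
#include <stdbool.h>

/* Assumed to match the kernel: SWAP_CLUSTER_MAX (32) * 16. */
#define VMPRESSURE_WIN		512UL
#define LEVEL_MED_PCT		60UL
#define LEVEL_CRITICAL_PCT	95UL

enum pressure_level { LEVEL_LOW, LEVEL_MEDIUM, LEVEL_CRITICAL };

/* Hypothetical per-cgroup accumulator for one sampling window. */
struct pressure_window {
	unsigned long scanned;
	unsigned long reclaimed;
};

/* Pressure as a percentage: roughly, the share of scanned pages not reclaimed. */
static unsigned long pressure_pct(unsigned long scanned, unsigned long reclaimed)
{
	unsigned long scale = scanned + reclaimed;
	unsigned long pressure;

	if (reclaimed >= scanned)
		return 0;
	pressure = scale - (reclaimed * scale / scanned);
	return pressure * 100 / scale;
}

static enum pressure_level to_level(unsigned long pct)
{
	if (pct >= LEVEL_CRITICAL_PCT)
		return LEVEL_CRITICAL;
	if (pct >= LEVEL_MED_PCT)
		return LEVEL_MEDIUM;
	return LEVEL_LOW;
}

/*
 * Accumulate one reclaim pass. Returns true and stores a level only
 * once the window is full; smaller amounts of scanning produce no
 * notification at all.
 */
static bool account_reclaim(struct pressure_window *w, unsigned long scanned,
			    unsigned long reclaimed, enum pressure_level *level)
{
	w->scanned += scanned;
	w->reclaimed += reclaimed;
	if (w->scanned < VMPRESSURE_WIN)
		return false;

	*level = to_level(pressure_pct(w->scanned, w->reclaimed));
	w->scanned = 0;
	w->reclaimed = 0;
	return true;
}
```

In this model a limit resize that scans fewer than 512 pages reports
nothing, so a listener already has to tolerate missing notifications.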

So I agree: just because the current behavior doesn't make sense, or is
inconsistent with other definitions of behavior, doesn't mean we can make
a visible change for userspace. I will drop the v1 changes in the next
version.

Thanks!

>
> > (b) Do not count vmpressure (consequently,
> > mem_cgroup_under_socket_pressure()) in v2 where psi is not counted
> > (writing to memory.max, memory.high, and memory.reclaim).
>
> I can see clear arguments for memory.reclaim opt out for vmpressure
> because we have established that this is not a measure to express a
> memory pressure on the cgroup.
>
> Max/High are less clear to me, TBH. I do understand reasoning for PSI
> exclusion because considering the calling process to be stalled and
> non-productive is misleading. It just does its work so in a way it is
> a productive time in the end. For the vmpressure, which measures how
> hard/easy it is to reclaim memory, why should this be special for this
> particular reclaim?
>
> Again, an explanation of the effect on the socket pressure could give a
> better picture. Say that somebody reduces the limit (hard/high) and it
> takes quite some effort to shrink the consumption down. Should the
> networking layer react to that in any way or should it wait for the
> active allocation during that process to find that out?

I am out of my depth here. Any answer on my side would be pure
speculation at this point. Shakeel, can you help us here or tag some
networking people?
Thanks!

> --
> Michal Hocko
> SUSE Labs