Date:   Mon, 4 Apr 2022 11:36:33 +0200
From:   Michal Hocko <mhocko@...e.com>
To:     Zhaoyang Huang <huangzhaoyang@...il.com>
Cc:     Suren Baghdasaryan <surenb@...gle.com>,
        "zhaoyang.huang" <zhaoyang.huang@...soc.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        "open list:MEMORY MANAGEMENT" <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        cgroups mailinglist <cgroups@...r.kernel.org>,
        Ke Wang <ke.wang@...soc.com>
Subject: Re: [RFC PATCH] cgroup: introduce dynamic protection for memcg

On Mon 04-04-22 11:32:28, Michal Hocko wrote:
> On Mon 04-04-22 17:23:43, Zhaoyang Huang wrote:
> > On Mon, Apr 4, 2022 at 5:07 PM Zhaoyang Huang <huangzhaoyang@...il.com> wrote:
> > >
> > > On Mon, Apr 4, 2022 at 4:51 PM Michal Hocko <mhocko@...e.com> wrote:
> > > >
> > > > On Mon 04-04-22 10:33:58, Zhaoyang Huang wrote:
> > > > [...]
> > > > > > One thing that I don't understand in this approach is: why memory.low
> > > > > > should depend on the system's memory pressure. It seems you want to
> > > > > > allow a process to allocate more when memory pressure is high. That is
> > > > > > very counter-intuitive to me. Could you please explain the underlying
> > > > > > logic of why this is the right thing to do, without going into
> > > > > > technical details?
> > > > > What I want to achieve is to make memory.low positively correlated
> > > > > with time and negatively correlated with memory pressure, which means
> > > > > the protected memcg should lower its protection (via a lower memory.low)
> > > > > to help relieve the system's memory pressure when it is high (see the
> > > > > rough sketch after the quoted thread below).
> > > >
> > > > I have to say this is still very confusing to me. The low limit is a
> > > > protection against external (e.g. global) memory pressure. Decreasing
> > > > the protection based on the external pressure sounds like it goes right
> > > > against the purpose of the knob. I can see reasons to update protection
> > > > based on refaults or other metrics from userspace, but I still do not
> > > > see how this is good auto-magic tuning done by the kernel.
> > > >
> > > > > The concept behind this is that the protected memcg faulting its
> > > > > dropped memory back in is less important than the system's latency
> > > > > under high memory pressure.
> > > >
> > > > Can you give some specific examples?
> > > For both of the above comments, please refer to the latest test
> > > results in the v2 patch I have sent. I would describe my change as a
> > > transfer of focus under pressure: the protected memcg is the focus
> > > while the system's memory pressure is low, and reclaim then comes
> > > from the root, which is not against the current design. However,
> > > when global memory pressure is high, the focus has to shift to the
> > > whole system, because it does not make sense to exempt the protected
> > > memcg from what everybody else has to do; it cannot accomplish
> > > anything anyway while the system is stuck in the kernel doing
> > > reclaim work.
> > Does it make more sense if I describe the change as: the memcg will be
> > protected as long as system pressure is under the threshold (partially
> > coherent with the current design), and the memcg will be sacrificed if
> > pressure is over the threshold (the added change)?
> 
> No, not really. For one, it is still really unclear why there should be
> any difference in semantics between global and external memory pressure
> in general. The low limit is always a protection from external pressure.
> And what should the actual threshold be? The amount of reclaim performed,
> the effectiveness of the reclaim, or something else?
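
For illustration only (this is not code from the RFC patch): the relationship
described in the quoted thread, protection that shrinks as PSI memory pressure
rises above some threshold, could be sketched roughly as below. The function
name, the use of the "some" avg10 figure and the linear scaling are all
assumptions; the time-based recovery mentioned above is not modeled here.

/*
 * Rough sketch of the idea under discussion, not the actual patch:
 * the effective low protection stays at the configured value while
 * PSI memory pressure is at or below a threshold, and is scaled down
 * linearly as pressure exceeds it.
 */
static unsigned long dynamic_low(unsigned long configured_low,
				 unsigned int psi_some_avg10,
				 unsigned int psi_threshold)
{
	if (psi_some_avg10 <= psi_threshold)
		return configured_low;		/* full protection */

	/* shrink protection as pressure grows past the threshold */
	return configured_low * psi_threshold / psi_some_avg10;
}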

Btw. you might want to have a look at http://lkml.kernel.org/r/20220331084151.2600229-1-yosryahmed@google.com
where a new interface to allow pro-active memory reclaim is discussed.
I think that this might turn out to be a better fit than an automagic
kernel manipulation of the low limit. It will require a user agent to
drive the reclaim, though.
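
Purely as an illustration of such a user agent (assuming the proposed
per-memcg memory.reclaim file from the linked RFC gets merged; the cgroup
path, the 10% "some" pressure threshold and the 64M reclaim step are
made-up values):

/*
 * Minimal sketch of a userspace agent driving pro-active reclaim:
 * watch a memcg's PSI memory pressure and, when the "some" avg10
 * figure crosses a threshold, ask the kernel to reclaim a fixed
 * amount from that memcg via the (proposed) memory.reclaim file.
 */
#include <stdio.h>
#include <unistd.h>

#define PRESSURE "/sys/fs/cgroup/workload/memory.pressure"
#define RECLAIM  "/sys/fs/cgroup/workload/memory.reclaim"

static double some_avg10(void)
{
	char line[256];
	double avg10 = 0.0;
	FILE *f = fopen(PRESSURE, "r");

	if (!f)
		return 0.0;
	/* first line: "some avg10=X.XX avg60=... avg300=... total=..." */
	if (fgets(line, sizeof(line), f))
		sscanf(line, "some avg10=%lf", &avg10);
	fclose(f);
	return avg10;
}

int main(void)
{
	for (;;) {
		if (some_avg10() > 10.0) {
			FILE *f = fopen(RECLAIM, "w");

			if (f) {
				/* request reclaim of 64M from this memcg */
				fprintf(f, "%d", 64 * 1024 * 1024);
				fclose(f);
			}
		}
		sleep(2);
	}
	return 0;
}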
-- 
Michal Hocko
SUSE Labs
