linux-kernel - Re: [PATCH] mm: skip current when memcg reclaim

Message-ID: <YXAxhp9tQPv9g0XJ@dhcp22.suse.cz>
Date:   Wed, 20 Oct 2021 17:11:02 +0200
From:   Michal Hocko <mhocko@...e.com>
To:     Zhaoyang Huang <huangzhaoyang@...il.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Zhaoyang Huang <zhaoyang.huang@...soc.com>,
        "open list:MEMORY MANAGEMENT" <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm: skip current when memcg reclaim

On Wed 20-10-21 19:45:33, Zhaoyang Huang wrote:
> On Wed, Oct 20, 2021 at 4:55 PM Michal Hocko <mhocko@...e.com> wrote:
> >
> > On Wed 20-10-21 15:33:39, Zhaoyang Huang wrote:
> > [...]
> > > Do you mean that direct reclaim should succeed in the first round of
> > > reclaim, during which memcgs are protected by memory.low, and would NOT
> > > retry with memcg_low_reclaim set to true?
> >
> > Yes, these are the semantics of low limit protection in the upstream
> > kernel. Have a look at do_try_to_free_pages and how it sets
> > memcg_low_reclaim only if no pages were reclaimed.
> >
> > > That is not true on Android-like systems, where reclaim always
> > > fails and triggers lmk and even OOM.
> >
> > I am not familiar with Android-specific changes to the upstream reclaim
> > logic. You should be investigating why the reclaim couldn't make
> > forward progress (i.e. reclaim pages) from non-protected memcgs. There
> > are tracepoints you can use (generally with the vmscan prefix).
> OK, I see why you are confused now. I think you are analysing cgroup
> behaviour against a pre-defined workload and memory pattern that works
> as designed, e.g. processes in the root provide memory before
> protected memcgs get reclaimed. You can refer to [1] for the
> hierarchy, where the effective userspace workloads are located in
> protected groups and the rest of the processes are non-grouped. In
> practice, the non-grouped ones cannot provide enough memory, as they
> are kernel threads and processes with few pages on the LRU (control
> logic). The practical scenario is that groupA issues a high-order
> kmalloc and triggers reclaim (kswapd and direct reclaim). As I said,
> the non-grouped ones cannot provide enough contiguous memory blocks,
> which makes direct reclaim fail quickly in the first round of reclaim.
> What I am trying to do is let kswapd try harder for the target. It is
> also fair if groupA, B and C are stuck in the slow path concurrently.
> 
> [1]
> root
> |-- non-grouped processes
> |-- groupA
> |-- groupB
> `-- groupC

I am sorry but I still do not understand your setup. I have asked
several times for more specifics. Without that I cannot really wrap my
head around your (ever-changing) statements. This is not really a
productive use of time. I am sorry but I cannot really help you much
without understanding the actual problem.
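
For reference, the low limit retry behaviour mentioned earlier in the
thread (memcg_low_reclaim is set only if the first pass reclaimed
nothing) can be illustrated with a small userspace sketch. This is a toy
model, not the kernel code: every toy_* name is made up for
illustration, only the memcg_low_reclaim and memcg_low_skipped flag
names mirror fields of the kernel's struct scan_control, and the real
priority loop is collapsed into a single pass per round.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins (hypothetical); these are not the kernel's real types. */
struct toy_memcg {
	unsigned long reclaimable;	/* pages this group could give up */
	bool below_low;			/* usage is covered by memory.low */
};

struct toy_scan_control {
	unsigned long nr_reclaimed;	/* pages reclaimed so far */
	bool memcg_low_reclaim;		/* retry round: ignore memory.low */
	bool memcg_low_skipped;		/* a protected group was skipped */
};

/* One reclaim pass over all groups (assumption: no priority levels). */
static void toy_shrink(struct toy_memcg *groups, size_t n,
		       struct toy_scan_control *sc)
{
	for (size_t i = 0; i < n; i++) {
		if (groups[i].below_low && !sc->memcg_low_reclaim) {
			/* Protected group: skip it, but remember that we did. */
			sc->memcg_low_skipped = true;
			continue;
		}
		sc->nr_reclaimed += groups[i].reclaimable;
		groups[i].reclaimable = 0;
	}
}

/* Returns the number of pages reclaimed, 0 if nothing could be freed. */
static unsigned long toy_try_to_free_pages(struct toy_memcg *groups, size_t n)
{
	struct toy_scan_control sc = { 0 };

retry:
	toy_shrink(groups, n, &sc);

	/* Any progress at all ends reclaim; protected groups stay untouched. */
	if (sc.nr_reclaimed)
		return sc.nr_reclaimed;

	/* No progress, but protected groups were skipped: retry without protection. */
	if (sc.memcg_low_skipped) {
		sc.memcg_low_reclaim = true;
		sc.memcg_low_skipped = false;
		goto retry;
	}

	return 0;
}
```

In this model a protected group is touched only when the unprotected
ones yield nothing at all, which is why the interesting question is what
the first round actually managed to reclaim.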
-- 
Michal Hocko
SUSE Labs