linux-kernel - Re: [PATCH v3 2/2] mm: memcg: introduce new event to trace shrink

Message-ID: <20231129165752.7r4o3jylbxrj7inb@CAB-WSD-L081021>
Date:   Wed, 29 Nov 2023 19:57:52 +0300
From:   Dmitry Rokosov <ddrokosov@...utedevices.com>
To:     Michal Hocko <mhocko@...e.com>, <akpm@...ux-foundation.org>
CC:     <rostedt@...dmis.org>, <mhiramat@...nel.org>, <hannes@...xchg.org>,
        <roman.gushchin@...ux.dev>, <shakeelb@...gle.com>,
        <muchun.song@...ux.dev>, <akpm@...ux-foundation.org>,
        <kernel@...rdevices.ru>, <rockosov@...il.com>,
        <cgroups@...r.kernel.org>, <linux-mm@...ck.org>,
        <linux-kernel@...r.kernel.org>, <bpf@...r.kernel.org>
Subject: Re: [PATCH v3 2/2] mm: memcg: introduce new event to trace
 shrink_memcg

On Wed, Nov 29, 2023 at 05:06:37PM +0100, Michal Hocko wrote:
> On Wed 29-11-23 18:20:57, Dmitry Rokosov wrote:
> > On Tue, Nov 28, 2023 at 10:32:50AM +0100, Michal Hocko wrote:
> > > On Mon 27-11-23 19:16:37, Dmitry Rokosov wrote:
> [...]
> > > > 2) With this approach, we will not have the ability to trace a situation
> > > > where the kernel is requesting reclaim for a specific memcg, but due to
> > > > limits issues, we are unable to run it.
> > > 
> > > I do not follow. Could you be more specific please?
> > > 
> > 
> > I'm referring to a situation where kswapd() or other kernel mm code
> > requests reclaim of some pages from a memcg, but the memcg rejects it
> > due to the limit checks. This occurs in the shrink_node_memcgs() function.
> 
> Ohh, you mean reclaim protection
> 
> > ===
> > 		mem_cgroup_calculate_protection(target_memcg, memcg);
> > 
> > 		if (mem_cgroup_below_min(target_memcg, memcg)) {
> > 			/*
> > 			 * Hard protection.
> > 			 * If there is no reclaimable memory, OOM.
> > 			 */
> > 			continue;
> > 		} else if (mem_cgroup_below_low(target_memcg, memcg)) {
> > 			/*
> > 			 * Soft protection.
> > 			 * Respect the protection only as long as
> > 			 * there is an unprotected supply
> > 			 * of reclaimable memory from other cgroups.
> > 			 */
> > 			if (!sc->memcg_low_reclaim) {
> > 				sc->memcg_low_skipped = 1;
> > 				continue;
> > 			}
> > 			memcg_memory_event(memcg, MEMCG_LOW);
> > 		}
> > ===
> > 
> > With separate shrink begin()/end() tracepoints we can detect such a
> > problem.
> 
> How? You are only reporting the number of reclaimed pages, and zero
> reclaimed pages can result not just from low/min limits but from
> other causes in general. You would also need to report the number of
> scanned/isolated pages.
>  

From my perspective, if memory control group (memcg) protection
restrictions apply, we can identify them by a begin() event that has no
matching end(). In all other cases both tracepoints are raised.
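
A minimal sketch of the idea, written as a userspace post-processing step
over a captured trace rather than as kernel code. The event struct, the
enum and count_unmatched_begins() are hypothetical names used only for
illustration and are not part of the patch. It also assumes that begin()
is emitted before the protection checks and that events for one memcg do
not nest:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds, modeled on the begin()/end() pair above. */
enum shrink_event_kind { SHRINK_BEGIN, SHRINK_END };

/* Hypothetical record parsed from the trace buffer. */
struct shrink_event {
	enum shrink_event_kind kind;
	unsigned int memcg_id;	/* illustrative cgroup identifier */
};

/*
 * Count begin() events for memcg_id that are not closed by an end()
 * for the same memcg before its next begin() or the end of the trace.
 */
static size_t count_unmatched_begins(const struct shrink_event *events,
				     size_t n, unsigned int memcg_id)
{
	size_t unmatched = 0;
	int open = 0;	/* 1 while a begin() is waiting for its end() */

	assert(events != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (events[i].memcg_id != memcg_id)
			continue;	/* event belongs to another memcg */
		if (events[i].kind == SHRINK_BEGIN) {
			if (open)
				unmatched++;	/* previous begin() never ended */
			open = 1;
		} else {
			open = 0;	/* end() closes the pending begin() */
		}
	}
	return unmatched + (open ? 1 : 0);
}
```

A non-zero result for a given memcg would then point at reclaim requests
that were skipped, under the assumptions stated above.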

> > > > 3) LRU and SLAB shrinkers are too common places to handle memcg-related
> > > > tasks. Additionally, memcg can be disabled in the kernel configuration.
> > > 
> > > Right. This could be all hidden in the tracing code. You simply do not
> > > print memcg id when the controller is disabled. Or just simply print 0.
> > > I do not really see any major problems with that.
> > > 
> > > I would really prefer to focus on that direction rather than adding
> > > another begin/end tracepoint which overlaps with existing begin/end
> > > traces and provides much more limited information because I would bet we
> > > will have somebody complaining that mere nr_reclaimed is not sufficient.
> > 
> > Okay, I will try to prepare a new patch version with memcg printing from
> > lruvec and slab tracepoints.
> > 
> > Then Andrew should drop the previous patchsets, I suppose. Please advise
> > on the correct workflow steps here.
> 
> Andrew usually just drops the patch from his tree and it will disappear
> from linux-next as well.

Okay, I understand, thank you!

Andrew, could you please take a look? I am planning to prepare a new
patch version based on Michal's suggestion, so the previous one should be
dropped.

-- 
Thank you,
Dmitry