Message-ID: <a6cd4eb712f3b9f8898e9a2e511b397e8dc397fc@linux.dev>
Date: Tue, 14 Oct 2025 12:56:06 +0000
From: "Jiayuan Chen" <jiayuan.chen@...ux.dev>
To: "Michal Hocko" <mhocko@...e.com>
Cc: linux-mm@...ck.org, "Andrew Morton" <akpm@...ux-foundation.org>, "Axel
 Rasmussen" <axelrasmussen@...gle.com>, "Yuanchu Xie"
 <yuanchu@...gle.com>, "Wei Xu" <weixugc@...gle.com>, "Johannes Weiner"
 <hannes@...xchg.org>, "David Hildenbrand" <david@...hat.com>, "Qi Zheng"
 <zhengqi.arch@...edance.com>, "Shakeel Butt" <shakeel.butt@...ux.dev>,
 "Lorenzo Stoakes" <lorenzo.stoakes@...cle.com>,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1] mm/vmscan: Add retry logic for cgroups with
 memory.low in kswapd

October 14, 2025 at 17:33, "Michal Hocko" <mhocko@...e.com> wrote:


> 
> On Tue 14-10-25 16:18:49, Jiayuan Chen wrote:
> 
> > 
> > We can set memory.low for cgroups as a soft protection limit. When the
> >  kernel cannot reclaim any pages from other cgroups, it retries reclaim
> >  while ignoring the memory.low protection of the skipped cgroups.
> >  
> >  Currently, this retry logic only works in direct reclaim path, but is
> >  missing in the kswapd asynchronous reclaim. Typically, a cgroup may
> >  contain some cold pages that could be reclaimed even when memory.low is
> >  set.
> >  
> >  This change adds retry logic to kswapd: if the first reclaim attempt fails
> >  to reclaim any pages and some cgroups were skipped due to memory.low
> >  protection, kswapd will perform a second reclaim pass ignoring memory.low
> >  restrictions.
> >  
> >  This ensures more consistent reclaim behavior between direct reclaim and
> >  kswapd. By allowing kswapd to reclaim more proactively from protected
> >  cgroups under global memory pressure, this optimization can help reduce
> >  the occurrence of direct reclaim, which is more disruptive to application
> >  performance.
> > 
> Could you describe the problem you are trying to address in more detail
> please? Because your patch is significantly changing the behavior of the
> low limit. I would even go as far as to say it breaks its expectations
> because low limit should provide a certain level of protection and
> your patch would allow kswapd to reclaim from those cgroups much sooner
> now. If this is really needed then we need much more detailed
> justification and also evaluation of how that influences existing users.
> 


Thanks Michal, let me explain the issue I encountered:

1. When kswapd is woken and reclaims nothing (sc.nr_reclaimed == 0), the
kswapd_failures counter keeps accumulating until it reaches
MAX_RECLAIM_RETRIES. At that point the kswapd thread stops running until a
direct memory reclaim is triggered.

2. We observed kswapd being woken by watermark_boost rather than by memory
actually falling below the watermarks. For boost-triggered reclaim, the most
aggressive priority that can be reached is DEF_PRIORITY - 2, which makes
reclaim harder than when the priority is 1.

3. When kswapd finds nothing reclaimable, I think it could try to reclaim
some memory from pods protected by memory.low, in the same way that direct
reclaim already falls back to reclaiming memory protected by memory.low.
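
To make points 1 and 3 concrete, here is a minimal user-space sketch of the behavior described above. It is a toy model, not kernel code and not the patch itself: every toy_* name is hypothetical, the value 16 for the retry limit is an assumption about mainline, and the model assumes each cgroup's cold page count never exceeds its usage. It only shows how a pass that skips every cgroup at or below memory.low counts as a failure, and how an optional second pass that ignores memory.low changes that.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed value; the name mirrors MAX_RECLAIM_RETRIES from the thread. */
#define TOY_MAX_RECLAIM_RETRIES 16

/* Hypothetical stand-in for a cgroup: all sizes are in pages. */
struct toy_cgroup {
	unsigned long usage; /* pages currently charged */
	unsigned long low;   /* memory.low protection */
	unsigned long cold;  /* cold pages that could be reclaimed */
};

/* Hypothetical stand-in for the scan control state used in one reclaim. */
struct toy_scan_control {
	bool memcg_low_reclaim;     /* true: ignore memory.low on this pass */
	bool memcg_low_skipped;     /* set when a protected cgroup was skipped */
	unsigned long nr_reclaimed; /* pages reclaimed so far */
};

/* One reclaim pass over all cgroups. */
static void toy_shrink(struct toy_cgroup *cgs, size_t n,
		       struct toy_scan_control *sc)
{
	for (size_t i = 0; i < n; i++) {
		/* Usage at or below memory.low: protected unless overridden. */
		if (cgs[i].usage <= cgs[i].low && !sc->memcg_low_reclaim) {
			sc->memcg_low_skipped = true;
			continue;
		}
		sc->nr_reclaimed += cgs[i].cold;
		cgs[i].usage -= cgs[i].cold;
		cgs[i].cold = 0;
	}
}

/*
 * One reclaim attempt. With retry_low set, a pass that reclaimed nothing
 * but skipped protected cgroups is repeated while ignoring memory.low.
 */
unsigned long toy_reclaim(struct toy_cgroup *cgs, size_t n, bool retry_low)
{
	struct toy_scan_control sc = { 0 };

	toy_shrink(cgs, n, &sc);
	if (retry_low && sc.nr_reclaimed == 0 && sc.memcg_low_skipped) {
		sc.memcg_low_reclaim = true;
		sc.memcg_low_skipped = false;
		toy_shrink(cgs, n, &sc);
	}
	return sc.nr_reclaimed;
}

/*
 * Simulate a number of kswapd wakeups. Returns the 1-based wakeup at which
 * the failure counter reaches the limit, or 0 if that never happens.
 */
int toy_kswapd_run(struct toy_cgroup *cgs, size_t n, bool retry_low,
		   int wakeups)
{
	int failures = 0;

	for (int i = 0; i < wakeups; i++) {
		if (toy_reclaim(cgs, n, retry_low) > 0)
			failures = 0; /* progress resets the counter */
		else if (++failures >= TOY_MAX_RECLAIM_RETRIES)
			return i + 1; /* kswapd gives up here */
	}
	return 0;
}
```

With a single cgroup that sits below its memory.low but still holds cold pages, toy_kswapd_run() without the retry gives up after TOY_MAX_RECLAIM_RETRIES wakeups, while the retry variant makes progress on the first wakeup. The sketch deliberately leaves out priorities and watermark boosting (point 2), and it does not address how much protection memory.low users should still be able to expect.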
