Message-ID: <aPEGDwiA_LhuLZmX@tiehlicka>
Date: Thu, 16 Oct 2025 16:49:51 +0200
From: Michal Hocko <mhocko@...e.com>
To: Jiayuan Chen <jiayuan.chen@...ux.dev>
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
	Axel Rasmussen <axelrasmussen@...gle.com>,
	Yuanchu Xie <yuanchu@...gle.com>, Wei Xu <weixugc@...gle.com>,
	Johannes Weiner <hannes@...xchg.org>,
	David Hildenbrand <david@...hat.com>,
	Qi Zheng <zhengqi.arch@...edance.com>,
	Shakeel Butt <shakeel.butt@...ux.dev>,
	Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1] mm/vmscan: Add retry logic for cgroups with
 memory.low in kswapd

On Tue 14-10-25 12:56:06, Jiayuan Chen wrote:
> October 14, 2025 at 17:33, "Michal Hocko" <mhocko@...e.com> wrote:
> 
> 
> > 
> > On Tue 14-10-25 16:18:49, Jiayuan Chen wrote:
> > 
> > > 
> > > We can set memory.low for cgroups as a soft protection limit. When the
> > >  kernel cannot reclaim any pages from other cgroups, it retries reclaim
> > >  while ignoring the memory.low protection of the skipped cgroups.
> > >  
> > >  Currently, this retry logic only works in direct reclaim path, but is
> > >  missing in the kswapd asynchronous reclaim. Typically, a cgroup may
> > >  contain some cold pages that could be reclaimed even when memory.low is
> > >  set.
> > >  
> > >  This change adds retry logic to kswapd: if the first reclaim attempt fails
> > >  to reclaim any pages and some cgroups were skipped due to memory.low
> > >  protection, kswapd will perform a second reclaim pass ignoring memory.low
> > >  restrictions.
> > >  
> > >  This ensures more consistent reclaim behavior between direct reclaim and
> > >  kswapd. By allowing kswapd to reclaim more proactively from protected
> > >  cgroups under global memory pressure, this optimization can help reduce
> > >  the occurrence of direct reclaim, which is more disruptive to application
> > >  performance.
> > > 
> > Could you describe the problem you are trying to address in more detail
> > please? Because your patch is significantly changing the behavior of the
> > low limit. I would even go as far as to say it breaks its expectations
> > because the low limit should provide a certain level of protection and
> > your patch would allow kswapd to reclaim from those cgroups much sooner
> > now. If this is really needed then we need a much more detailed
> > justification and also an evaluation of how that influences existing users.
> > 
> 
> 
> Thanks Michal, let me explain the issue I encountered:
> 
> 1. When kswapd is triggered and there's no reclaimable memory (sc.nr_reclaimed == 0),
> this causes kswapd_failures counter to continuously accumulate until it reaches
> MAX_RECLAIM_RETRIES. This makes the kswapd thread stop running until a direct memory
> reclaim is triggered.

The definition of the low limit is rather vague:
        Best-effort memory protection.  If the memory usage of a
        cgroup is within its effective low boundary, the cgroup's
        memory won't be reclaimed unless there is no reclaimable
        memory available in unprotected cgroups.
        Above the effective low boundary (or
        effective min boundary if it is higher), pages are reclaimed
        proportionally to the overage, reducing reclaim pressure for
        smaller overages.
This doesn't explicitly rule out reclaim from the kswapd context, but
historically we have relied on direct reclaim to detect the "no
reclaimable memory" situation, as that is much easier to do in that
context. Also, you do not really explain why backing off kswapd when all
the reclaimable memory is low limit protected is bad.
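
To make the semantics under discussion concrete, here is a minimal
userspace sketch of the two-pass behavior: a first pass that skips
cgroups within their low boundary, and an optional retry that ignores
the protection when nothing was reclaimed. This is a toy model, not
kernel code. All toy_* names and fields are hypothetical, and the
proportional reclaim above the low boundary is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy cgroup, not a kernel structure. */
struct toy_cgroup {
	unsigned long usage;       /* pages charged to the group */
	unsigned long low;         /* effective memory.low, in pages */
	unsigned long reclaimable; /* cold pages, assumed <= usage */
};

/* Hypothetical per-attempt state, loosely modeled on a scan control. */
struct toy_scan {
	bool low_reclaim;           /* retry pass: ignore low protection */
	bool low_skipped;           /* a group was skipped because of low */
	unsigned long nr_reclaimed; /* pages reclaimed so far */
};

/* One pass over all groups; protected groups are skipped unless retrying. */
static void toy_shrink(struct toy_cgroup *cgs, size_t n, struct toy_scan *sc)
{
	for (size_t i = 0; i < n; i++) {
		if (cgs[i].usage <= cgs[i].low && !sc->low_reclaim) {
			sc->low_skipped = true;
			continue;
		}
		/* Simplification: take every cold page in one go. */
		sc->nr_reclaimed += cgs[i].reclaimable;
		cgs[i].usage -= cgs[i].reclaimable;
		cgs[i].reclaimable = 0;
	}
}

/*
 * allow_retry == true models the retry that direct reclaim has today,
 * allow_retry == false models a context that backs off instead.
 */
unsigned long toy_reclaim(struct toy_cgroup *cgs, size_t n, bool allow_retry)
{
	struct toy_scan sc = { 0 };

	toy_shrink(cgs, n, &sc);
	if (!sc.nr_reclaimed && sc.low_skipped && allow_retry) {
		sc.low_reclaim = true;
		sc.low_skipped = false;
		toy_shrink(cgs, n, &sc);
	}
	return sc.nr_reclaimed;
}
```

In this model the retry only fires when every group with reclaimable
memory is protected, which is exactly the case where the choice between
backing off and breaking the protection matters.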

> 2. We observed a phenomenon where kswapd is triggered by watermark_boost rather
> than by actual memory watermarks being insufficient. For boost-triggered
> reclamation, the maximum priority can only be DEF_PRIORITY - 2, making memory
> reclamation more difficult compared to when priority is 1.

Do I get it right that you would like to break low limits on
watermark_boost reclaim? I am not sure I follow your priority argument.
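
For reference, the priority remark presumably refers to the scan-size
arithmetic, where the number of pages scanned per LRU is roughly the
LRU size shifted right by the current priority. The sketch below is
only that arithmetic under stated assumptions: TOY_DEF_PRIORITY is
assumed to mirror the kernel's DEF_PRIORITY of 12, and
toy_scan_target() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>

/* Assumption: mirrors the kernel's DEF_PRIORITY value. */
#define TOY_DEF_PRIORITY 12

/* Hypothetical helper: pages to scan from an LRU at a given priority. */
unsigned long toy_scan_target(unsigned long lru_pages, int priority)
{
	return lru_pages >> priority;
}
```

With an LRU of 1048576 pages this gives 1024 pages at priority
TOY_DEF_PRIORITY - 2 and 524288 pages at priority 1, so a
boost-triggered pass capped at the former scans far less per attempt.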

-- 
Michal Hocko
SUSE Labs
