linux-kernel - Re: [PATCH v2] mm/vmscan: skip increasing kswapd

Message-ID: <42fca12aec282a64d3b5bd471124a1e94048afc4@linux.dev>
Date: Fri, 14 Nov 2025 02:23:50 +0000
From: "Jiayuan Chen" <jiayuan.chen@...ux.dev>
To: "Shakeel Butt" <shakeel.butt@...ux.dev>, "Michal Hocko" <mhocko@...e.com>
Cc: linux-mm@...ck.org, "Andrew Morton" <akpm@...ux-foundation.org>,
 "Johannes Weiner" <hannes@...xchg.org>, "David Hildenbrand"
 <david@...hat.com>, "Qi Zheng" <zhengqi.arch@...edance.com>, "Lorenzo
 Stoakes" <lorenzo.stoakes@...cle.com>, "Axel Rasmussen"
 <axelrasmussen@...gle.com>, "Yuanchu Xie" <yuanchu@...gle.com>, "Wei Xu"
 <weixugc@...gle.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] mm/vmscan: skip increasing kswapd_failures when
 reclaim was boosted

2025/11/14 03:28, "Shakeel Butt" <shakeel.butt@...ux.dev> wrote:


> 
> On Thu, Nov 13, 2025 at 11:02:41AM +0100, Michal Hocko wrote:
> 
> > 
> > In general I think not incrementing the failure for boosted kswapd
> >  iteration is right. If this issue (high protection causing kswapd
> >  failures) happens in the non-boosted case, I am not sure what the right
> >  behavior should be, i.e. allocators doing direct reclaim potentially below low
> >  protection or allowing kswapd to reclaim below low. For min, it is very
> >  clear that direct reclaimers have to reclaim as they may have to trigger
> >  oom-kill. For low protection, I am not sure.
> >  
> >  Our current documentation gives us some room for interpretation. I am
> >  wondering whether we need to change the existing implementation though.
> >  If kswapd is not able to make progress then we surely have direct
> >  reclaim happening. So I would only change this if we had examples of
> >  properly/sensibly configured systems where kswapd low limit breach could
> >  help to reduce stalls (improve performance) while the end result from
> >  the amount of reclaimed memory would be same/very similar.
> > 
> Yes, I think any change here will need much more brainstorming and
> experimentation. There are definitely corner cases for which the right
> solution might not be in the kernel. One such case I was thinking about is
> an unbalanced (memory) NUMA node, where I don't think kswapd of that node
> should do anything because of the disconnect between NUMA memory usage
> and memcg limits. In such cases either NUMA balancing or the
> promotion/demotion systems under discussion would be more appropriate.
> Anyway, this is orthogonal.

Can I ask for a link, or some keywords to search the mailing list with, regarding
the NUMA imbalance you mentioned?

I'm not sure if it's similar to a problem I encountered before. We have a system
with 2 nodes and swap disabled. After running for a while, we found that anonymous
pages occupied over 99% of one node. When kswapd runs on that node, it continuously
tries to reclaim the remaining 1% of file pages. However, these file pages are mostly
code pages and are hot, leading to frenzied refaults, which eventually cause a
sustained high read I/O load on the disk.
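
For anyone who wants to check whether a node is in a similar state, here is a
minimal sketch that reads the per-node meminfo files under
/sys/devices/system/node and reports how much of each node's LRU memory is
anonymous. It assumes that "anonymous pages occupied over 99%" can be
approximated by the Active(anon)/Inactive(anon) versus Active(file)/Inactive(file)
counters; the numbers above may have been measured differently. The helper names
(parse_node_meminfo, anon_share_of_lru, report) are hypothetical.

```python
import re
from pathlib import Path

# Per-node meminfo lines look like: "Node 0 Active(anon):   123456 kB"
_LINE = re.compile(r"^Node[ \t]+\d+[ \t]+([^:\n]+):[ \t]+(\d+)[ \t]*kB", re.MULTILINE)


def parse_node_meminfo(text):
    """Return {field name: value in kB} for one node's meminfo text."""
    return {name.strip(): int(kb) for name, kb in _LINE.findall(text)}


def anon_share_of_lru(fields):
    """Fraction of the node's LRU (anon + file) memory that is anonymous."""
    anon = fields.get("Active(anon)", 0) + fields.get("Inactive(anon)", 0)
    file = fields.get("Active(file)", 0) + fields.get("Inactive(file)", 0)
    total = anon + file
    return anon / total if total else 0.0


def report(root="/sys/devices/system/node"):
    """Print the anon share for every NUMA node found under root."""
    for path in sorted(Path(root).glob("node[0-9]*/meminfo")):
        fields = parse_node_meminfo(path.read_text())
        share = anon_share_of_lru(fields)
        print(f"{path.parent.name}: anon share of LRU pages = {share:.1%}")


if __name__ == "__main__":
    report()
```

A node that reports a share close to 100% with swap disabled leaves kswapd only
the small file-backed remainder to scan, which matches the situation described
above.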