Message-ID: <d9ed02ebcf71057e42c808beb68f8eb489394750@linux.dev>
Date: Tue, 06 Jan 2026 03:14:29 +0000
From: "Jiayuan Chen" <jiayuan.chen@...ux.dev>
To: "Shakeel Butt" <shakeel.butt@...ux.dev>
Cc: linux-mm@...ck.org, "Jiayuan Chen" <jiayuan.chen@...pee.com>, "Johannes
 Weiner" <hannes@...xchg.org>, "Michal Hocko" <mhocko@...nel.org>, "Roman
 Gushchin" <roman.gushchin@...ux.dev>, "Muchun Song"
 <muchun.song@...ux.dev>, "Andrew Morton" <akpm@...ux-foundation.org>,
 "David Hildenbrand" <david@...nel.org>, "Qi Zheng"
 <zhengqi.arch@...edance.com>, "Lorenzo Stoakes"
 <lorenzo.stoakes@...cle.com>, "Axel Rasmussen"
 <axelrasmussen@...gle.com>, "Yuanchu Xie" <yuanchu@...gle.com>, "Wei Xu"
 <weixugc@...gle.com>, cgroups@...r.kernel.org,
 linux-kernel@...r.kernel.org, "Hui Zhu" <hui.zhu@...ux.dev>
Subject: Re: [PATCH v2] mm/memcg: scale memory.high penalty based on refault
 recency

January 6, 2026 at 01:08, "Shakeel Butt" <shakeel.butt@...ux.dev> wrote:


> 
> +Hui Zhu
> 
> Hi Jiayuan,
> 
> On Mon, Dec 29, 2025 at 11:39:55AM +0800, Jiayuan Chen wrote:
> 
> > 
> > From: Jiayuan Chen <jiayuan.chen@...pee.com>
> >  
> >  Problem
> >  -------
> >  We observed an issue in production where a workload continuously
> >  triggering memory.high also generates massive disk IO READ, causing
> >  system-wide performance degradation.
> >  
> >  This happens because memory.high penalty is currently based solely on
> >  the overage amount, not the actual impact of that overage:
> >  
> >  1. A memcg over memory.high reclaiming cold/unused pages
> >  → minimal system impact, light penalty is appropriate
> >  
> >  2. A memcg over memory.high with hot pages being continuously
> >  reclaimed and refaulted → severe IO pressure, needs heavy penalty
> >  
> >  Both cases receive identical penalties today. Users are forced to
> >  combine memory.high with io.max as a workaround, but this is:
> >  - The wrong abstraction level (memory policy shouldn't require IO tuning)
> >  - Hard to configure correctly across different storage devices
> >  - Unintuitive for users who only want memory control
> > 
> Thanks for raising and reporting this use-case. Overall I am supportive
> of making memory.high more useful, but instead of adding more
> heuristics in the kernel, I would prefer to make the enforcement of
> memory.high more flexible with BPF.
> 
> At the moment, Hui Zhu is working on adding BPF support for memcg but it
> is very generic and I would prefer to start with specific and real
> use-case. I think your use-case is real and will be beneficial to many
> other users. Can you please follow up on Hui's RFC to present your
> use-case? I will also try to push the effort from the review side.
> 
> thanks,
> Shakeel
>

Hi Shakeel,

Thanks for the feedback and for pointing to Hui's RFC.

I noticed Michal has already forwarded my patch to that thread, and
Hui has responded. I'll wait to see how that discussion evolves and
whether there's an opportunity to integrate my use-case into his
BPF framework.

You're right that my timestamp-based approach is a heuristic. It was
designed as a simple, low-overhead approximation to detect active
thrashing without the cost of flushing refault counters on every
charge. But I agree that a more flexible BPF-based solution could
be cleaner in the long term.
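
To make the idea concrete, here is a rough userspace model of scaling
the penalty by refault recency. It is only a sketch and not the code
from the patch: MemcgModel, note_refault(), the quadratic base penalty,
the 1-second window and the 4x boost are all made-up names and values
chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class MemcgModel:
    """Toy stand-in for a memcg; every field name here is hypothetical."""
    high: int                            # memory.high limit, in pages
    usage: int                           # current charge, in pages
    last_refault: float = float("-inf")  # time of last refault, in seconds

    def note_refault(self, now: float) -> None:
        # A single timestamp store per refault; nothing to flush at charge time.
        self.last_refault = now


def base_penalty_ms(cg: MemcgModel, max_ms: float = 2000.0) -> float:
    """Overage-only penalty: quadratic in relative overage, capped (assumed shape)."""
    overage = max(cg.usage - cg.high, 0) / cg.high
    return min(overage * overage * max_ms, max_ms)


def recency_factor(cg: MemcgModel, now: float,
                   window: float = 1.0, max_boost: float = 4.0) -> float:
    """Multiplier that is max_boost right after a refault and decays
    linearly to 1.0 once the last refault is at least `window` seconds old."""
    age = max(now - cg.last_refault, 0.0)
    if age >= window:
        return 1.0
    return 1.0 + (max_boost - 1.0) * (1.0 - age / window)


def scaled_penalty_ms(cg: MemcgModel, now: float) -> float:
    """Base penalty scaled by how recently the group refaulted."""
    return base_penalty_ms(cg) * recency_factor(cg, now)


if __name__ == "__main__":
    now = 10.0
    cold = MemcgModel(high=1000, usage=1100)   # over the limit, never refaulted
    hot = MemcgModel(high=1000, usage=1100)    # same overage, actively refaulting
    hot.note_refault(now)
    print("cold:", scaled_penalty_ms(cold, now))
    print("hot: ", scaled_penalty_ms(hot, now))
```

With these illustrative defaults, two groups with the same overage are
no longer treated identically: the one that refaulted just now gets
four times the penalty of the one reclaiming cold pages.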

I'll follow up on Hui's thread once there's more progress.

Thanks,
Jiayuan
