linux-kernel - Re: [PATCH V2] mm: vmscan: skip the file folios in proactive reclaim if swappiness is MAX

Message-ID: <20250314165739.GB1316033@cmpxchg.org>
Date: Fri, 14 Mar 2025 12:57:39 -0400
From: Johannes Weiner <hannes@...xchg.org>
To: Michal Hocko <mhocko@...e.com>
Cc: Zhongkun He <hezhongkun.hzk@...edance.com>, akpm@...ux-foundation.org,
	muchun.song@...ux.dev, yosry.ahmed@...ux.dev, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH V2] mm: vmscan: skip the file folios in proactive reclaim
 if swappiness is MAX

On Fri, Mar 14, 2025 at 03:49:30PM +0100, Michal Hocko wrote:
> On Fri 14-03-25 10:18:33, Johannes Weiner wrote:
> > On Fri, Mar 14, 2025 at 10:27:57AM +0100, Michal Hocko wrote:
> [...]
> > > I have just noticed that you have followed up [1] with a concern that
> > > using swappiness in the whole min-max range without any heuristics turns
> > > out to be harder than just relying on the min and max as extremes.
> > > What seems to be still missing (or maybe it is just me not seeing that)
> > > is why should we only enforce those extreme ends of the range and still
> > > preserve under-defined semantic for all other swappiness values in the
> > > pro-active reclaim.
> > 
> > I guess I'm not seeing the "under-defined" part.
> 
> What I meant here is that any other value than both ends of swappiness
> doesn't have generally predictable behavior unless you know specific
> details of the current memory reclaim heuristics in get_scan_count.
> 
> > cache_trim_mode is
> > there to make sure a streaming file access pattern doesn't cause
> > swapping.
> 
> Yes, I am aware of the purpose.
> 
> > He has a special usecase to override cache_trim_mode when he
> > knows a large amount of anon is going cold. There is no way we can
> > generally remove it from proactive reclaim.
> 
> I believe I do understand the requirement here. The patch offers a
> counterpart to noswap pro-active reclaim and I do not have objections to
> that.
> 
> The reason I brought this up is that everything in between 0..200 is
> kinda gray area. We've had several queries why swappiness=N doesn't work
> as expected and the usual answer was because of heuristics. Most people
> just learned to live with that and stopped fine tuning vm_swappiness.
> Which is good I guess.

You're still oversimplifying and then dismissing. The heuristics don't
make swappiness meaningless, they make it useful in the first place.

  This control is used to define the rough relative IO cost of swapping
  and filesystem paging, as a value between 0 and 200.

This is clearly defined, and implemented as such. cache_trim_mode is
predicated on the *absence* of paging and caching benefits: A linear,
use-once file access pattern that *does not* benefit from additional
cache space. Kicking out anon for that purpose would be wrong under
pretty much any circumstance. That's why it "overrides" swappiness:
swappiness cannot apply when swapping at all would be nonsense.
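
For illustration, the interaction described above can be modeled roughly
as follows. This is a simplified, hypothetical sketch and not the
kernel's get_scan_count(): the names pick_scan_weights() and
struct scan_weights are made up for this example, and the real code, as
far as I understand it, additionally scales the two weights by recently
observed reclaim cost and handles several more cases.

```c
#include <stdbool.h>

/*
 * Simplified, hypothetical model of the decision discussed above.
 * NOT the kernel's get_scan_count(); all names are assumptions.
 */
enum scan_balance { SCAN_FRACT, SCAN_FILE };

struct scan_weights {
	enum scan_balance balance;
	unsigned int anon;	/* relative weight for scanning anon */
	unsigned int file;	/* relative weight for scanning file */
};

static struct scan_weights pick_scan_weights(unsigned int swappiness,
					     bool cache_trim_mode)
{
	struct scan_weights w;

	/* swappiness is documented as a value between 0 and 200 */
	if (swappiness > 200)
		swappiness = 200;

	/*
	 * Use-once file cache is cheap to drop, so swapping would be
	 * pointless: scan file only, whatever swappiness says.
	 */
	if (swappiness == 0 || cache_trim_mode) {
		w.balance = SCAN_FILE;
		w.anon = 0;
		w.file = 200;
		return w;
	}

	/* Otherwise split scan pressure by relative IO cost. */
	w.balance = SCAN_FRACT;
	w.anon = swappiness;
	w.file = 200 - swappiness;
	return w;
}
```

In this model swappiness=60 gives a 60:140 anon:file split, unless
cache_trim_mode is set, in which case anon is left alone entirely.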

Proactive reclaimers like ours rely on this. We use swappiness to
express exactly what it says on the tin: the relative cost between
thrashing file vs anon. We use it quite effectively to manage anon
write rates, e.g. for flash wear management. Obviously that doesn't
mean we want to swap when somebody streams through a large file set.

Zhongkun's case is a significant exception. He just wants to get rid
of a known-cold anon set. This level of insight into userspace access
patterns is rare in practice. You could argue that MADV_PAGEOUT might
be more suitable for that. But I also don't necessarily see a problem
with making swappiness=200 do it; although we might have to teach our
proactive reclaimer to auto-tune between 1 and 199 then.
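
As a rough sketch of what such a request looks like from userspace,
assuming a cgroup v2 kernel whose memory.reclaim file accepts the
"swappiness=" argument: the helper names format_reclaim_request() and
proactive_reclaim() below are hypothetical. Whether swappiness=200 then
skips file folios entirely is what the patch under discussion proposes;
the sketch only shows how a request is issued.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build a memory.reclaim request such as
 * "1073741824 swappiness=200". Returns the string length, or -1 on
 * an out-of-range swappiness or a too-small buffer.
 */
static int format_reclaim_request(char *buf, size_t len,
				  unsigned long long bytes,
				  unsigned int swappiness)
{
	int n;

	if (swappiness > 200)
		return -1;

	n = snprintf(buf, len, "%llu swappiness=%u", bytes, swappiness);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}

/*
 * Hypothetical helper: ask the kernel to reclaim `bytes` from the
 * cgroup at `cgroup_dir` with the given swappiness.
 * Returns 0 on success, -1 on any error.
 */
static int proactive_reclaim(const char *cgroup_dir,
			     unsigned long long bytes,
			     unsigned int swappiness)
{
	char path[4096], req[64];
	FILE *f;
	int n, ret = 0;

	n = snprintf(path, sizeof(path), "%s/memory.reclaim", cgroup_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	if (format_reclaim_request(req, sizeof(req), bytes, swappiness) < 0)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fputs(req, f) == EOF)
		ret = -1;
	/* stdio buffers the write, so a kernel error may only show here */
	if (fclose(f) == EOF)
		ret = -1;
	return ret;
}
```

A caller might use proactive_reclaim("/sys/fs/cgroup/workload", 1ULL << 30, 200),
where the cgroup path is made up for this example.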

> Pro-active reclaim is slightly different in the sense that it gives much
> better control over how much to reclaim and, since we have added the
> swappiness extension, even over the balancing. So why not make that
> balancing work for real and always follow the given proportion? To
> prevent any unintended regressions this would be the case only when
> swappiness was explicitly given to the reclaim request. Does that make
> any sense?

That would require the proactive reclaimer always knowing enough about
the access patterns to implement cache_trim_mode manually. This isn't
practical. And removing the heuristics would be a massive regression.