linux-kernel - Re: [patch] mm, vmscan: avoid thrashing anon lru when free + file is low


Message-ID: <alpine.DEB.2.10.1704181402510.112481@chino.kir.corp.google.com>
Date:   Tue, 18 Apr 2017 14:32:56 -0700 (PDT)
From:   David Rientjes <rientjes@...gle.com>
To:     Minchan Kim <minchan@...nel.org>
cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [patch] mm, vmscan: avoid thrashing anon lru when free + file
 is low

On Tue, 18 Apr 2017, Minchan Kim wrote:

> > The purpose of the code that commit 623762517e23 ("revert 'mm: vmscan: do
> > not swap anon pages just because free+file is low'") reintroduces is to
> > prefer swapping anonymous memory rather than trashing the file lru.
> > 
> > If all anonymous memory is unevictable, however, this insistence on
> 
> "unevictable" means hot workingset, not (mlocked and increased refcount
> by some driver)?
> I got confused.
> 

For my purposes, it's mlocked, but I think this thrashing is possible 
anytime we fail the file lru heuristic and the evictable anon lrus are 
very small themselves.  I'll update the changelog to make this explicit.

> > Check that enough evictable anon memory is actually on this lruvec before
> > insisting on SCAN_ANON.  SWAP_CLUSTER_MAX is used as the threshold to
> > determine if only scanning anon is beneficial.
> 
> Why do you use SWAP_CLUSTER_MAX instead of (high wmark + free) like
> file-backed pages?
> As considering anonymous pages have more probability to become workingset
> because they are mapped, IMO, more {strong or equal} condition than
> file-LRU would be better to prevent anon LRU thrashing.
> 

If the suggestion is checking
NR_ACTIVE_ANON + NR_INACTIVE_ANON > total_high_wmark pages, it would be a 
separate heuristic to address a problem that I'm not having :)  My issue is 
specifically when NR_ACTIVE_FILE + NR_INACTIVE_FILE < total_high_wmark and 
NR_ACTIVE_ANON + NR_INACTIVE_ANON is very large, but none of it is on this 
lruvec's evictable lrus.

This is the reason why I chose lruvec_lru_size() rather than per-node 
statistics.  The argument could also be made for the file lrus in the 
get_scan_count() heuristic that forces SCAN_ANON, but I have not met such 
an issue (yet).  I could follow up with that change or incorporate it into 
a v2 of this patch if you'd prefer.
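
To illustrate why a per-lruvec size can tell a different story than the 
per-node statistics, here is a small user-space sketch.  It is not kernel 
code: struct lruvec_anon and both helper functions are hypothetical names 
made up for this example, and the only idea taken from the discussion is 
that the node-wide anon counters aggregate over every lruvec on the node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one lruvec's evictable anon LRU sizes, in pages. */
struct lruvec_anon {
	unsigned long active_anon;
	unsigned long inactive_anon;
};

/* Evictable anon pages on a single lruvec. */
static unsigned long lruvec_evictable_anon(const struct lruvec_anon *lv)
{
	assert(lv != NULL);
	return lv->active_anon + lv->inactive_anon;
}

/* Node-wide view: the sum over every lruvec on the node. */
static unsigned long node_evictable_anon(const struct lruvec_anon *lvs, size_t n)
{
	unsigned long total = 0;

	assert(lvs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		total += lruvec_evictable_anon(&lvs[i]);
	return total;
}
```

With one lruvec holding a handful of pages and another holding a million, 
the node-wide total looks large even though reclaim on the first lruvec has 
almost nothing to scan.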

In other words, I want get_scan_count() to not force SCAN_ANON and 
fall back to SCAN_FRACT, absent other heuristics, if the amount of 
evictable anon is below a certain threshold for this lruvec.  I 
arbitrarily chose SWAP_CLUSTER_MAX to be conservative, but I could easily 
compare to total_high_wmark as well, although I would consider that more 
aggressive.
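
The decision described above could be sketched in user-space C roughly as 
follows.  This is an illustration under stated assumptions, not the patch 
itself: struct node_counts and pick_scan_balance() are hypothetical, the 
other get_scan_count() heuristics are left out, and the exact comparison 
against the threshold (>= here) is an assumption.  SWAP_CLUSTER_MAX is 
redefined locally with the kernel's value of 32 so the sketch stands alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Redefined locally for the sketch; matches the kernel's value. */
#define SWAP_CLUSTER_MAX 32UL

enum scan_balance { SCAN_EQUAL, SCAN_FRACT, SCAN_ANON, SCAN_FILE };

/* Hypothetical container for the node-wide counters, in pages. */
struct node_counts {
	unsigned long free;             /* free pages on the node */
	unsigned long file;             /* active + inactive file pages */
	unsigned long total_high_wmark; /* sum of the zones' high watermarks */
};

/*
 * lruvec_anon is the number of evictable anon pages (active + inactive)
 * on the lruvec being scanned, not the node-wide total.
 */
static enum scan_balance pick_scan_balance(const struct node_counts *node,
					   unsigned long lruvec_anon)
{
	bool file_is_low;

	assert(node != NULL);
	file_is_low = node->free + node->file <= node->total_high_wmark;

	/* Only insist on anon if this lruvec has enough anon to reclaim. */
	if (file_is_low && lruvec_anon >= SWAP_CLUSTER_MAX)
		return SCAN_ANON;

	/* Otherwise fall back to proportional scanning of both lists. */
	return SCAN_FRACT;
}
```

Swapping SWAP_CLUSTER_MAX for total_high_wmark in the comparison would give 
the more aggressive variant mentioned above.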

So we're in global reclaim, our file lrus are below thresholds, but we 
don't want to force SCAN_ANON for all lruvecs if there's not enough to 
reclaim from evictable anon.  Do you have a suggestion for how to 
implement this logic other than this patch?

> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -2186,26 +2186,31 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
> >  	 * anon pages.  Try to detect this based on file LRU size.
> 
> Please update this comment, too.
> 

Ok, I've added: "Try to detect this based on file LRU size, but do not 
limit scanning to anon if it is too small itself."