Message-ID: <20090503012806.GB5702@localhost>
Date:	Sun, 3 May 2009 09:28:06 +0800
From:	Wu Fengguang <fengguang.wu@...el.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Rik van Riel <riel@...hat.com>, kosaki.motohiro@...fujitsu.com,
	peterz@...radead.org, elladan@...imo.com,
	linux-kernel@...r.kernel.org, tytso@....edu, linux-mm@...ck.org
Subject: Re: [PATCH] vmscan: evict use-once pages first (v3)

On Fri, May 01, 2009 at 04:25:06PM -0700, Andrew Morton wrote:
> On Fri, 01 May 2009 19:05:21 -0400
> Rik van Riel <riel@...hat.com> wrote:
> 
> > Andrew Morton wrote:
> > > On Wed, 29 Apr 2009 13:14:36 -0400
> > > Rik van Riel <riel@...hat.com> wrote:
> > > 
> > >> When the file LRU lists are dominated by streaming IO pages,
> > >> evict those pages first, before considering evicting other
> > >> pages.
> > >>
> > >> This should be safe from deadlocks or performance problems
> > >> because only three things can happen to an inactive file page:
> > >> 1) referenced twice and promoted to the active list
> > >> 2) evicted by the pageout code
> > >> 3) under IO, after which it will get evicted or promoted
> > >>
> > >> The pages freed in this way can either be reused for streaming
> > >> IO, or allocated for something else. If the pages are used for
> > >> streaming IO, this pageout pattern continues. Otherwise, we will
> > >> fall back to the normal pageout pattern.
> > >>
> > >> ..
> > >>
> > >> +int mem_cgroup_inactive_file_is_low(struct mem_cgroup *memcg)
> > >> +{
> > >> +	unsigned long active;
> > >> +	unsigned long inactive;
> > >> +
> > >> +	inactive = mem_cgroup_get_local_zonestat(memcg, LRU_INACTIVE_FILE);
> > >> +	active = mem_cgroup_get_local_zonestat(memcg, LRU_ACTIVE_FILE);
> > >> +
> > >> +	return (active > inactive);
> > >> +}
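
(For reference, the corresponding global, non-memcg check does the same comparison on per-zone vmstat counters. A minimal sketch, assuming the NR_ACTIVE_FILE/NR_INACTIVE_FILE counters and the zone_page_state() accessor from the split-LRU series; the function name is only illustrative:)

static int inactive_file_is_low_global(struct zone *zone)
{
	unsigned long active, inactive;

	active = zone_page_state(zone, NR_ACTIVE_FILE);
	inactive = zone_page_state(zone, NR_INACTIVE_FILE);

	/* inactive file list is smaller than the active file list */
	return (active > inactive);
}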
> > > 
> > > This function could trivially be made significantly more efficient by
> > > changing it to do a single pass over all the zones of all the nodes,
> > > rather than two passes.
> > 
> > How would I do that in a clean way?
> 
> copy-n-paste :(
> 
> static unsigned long foo(struct mem_cgroup *mem,
> 			enum lru_list idx1, enum lru_list idx2)
> {
> 	int nid, zid;
> 	struct mem_cgroup_per_zone *mz;
> 	u64 total = 0;
> 
> 	for_each_online_node(nid)
> 		for (zid = 0; zid < MAX_NR_ZONES; zid++) {
> 			mz = mem_cgroup_zoneinfo(mem, nid, zid);
> 			total += MEM_CGROUP_ZSTAT(mz, idx1);
> 			total += MEM_CGROUP_ZSTAT(mz, idx2);
> 		}
> 	return total;
> }
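
As an illustration of the single-pass idea, with the two counters kept in separate accumulators so the active/inactive comparison is still possible (mem_cgroup_zoneinfo() and MEM_CGROUP_ZSTAT() are the same memcg-internal helpers used in the hunk above; this is only a sketch, not tested code):

int mem_cgroup_inactive_file_is_low(struct mem_cgroup *memcg)
{
	struct mem_cgroup_per_zone *mz;
	u64 active = 0, inactive = 0;
	int nid, zid;

	/* one pass over all zones of all nodes, accumulating both counters */
	for_each_online_node(nid)
		for (zid = 0; zid < MAX_NR_ZONES; zid++) {
			mz = mem_cgroup_zoneinfo(memcg, nid, zid);
			active   += MEM_CGROUP_ZSTAT(mz, LRU_ACTIVE_FILE);
			inactive += MEM_CGROUP_ZSTAT(mz, LRU_INACTIVE_FILE);
		}

	return active > inactive;
}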
> 
> dunno if that's justifiable.
> 
> > The function mem_cgroup_inactive_anon_is_low and
> > the global versions all do the same.  It would be
> > nice to make all four of them go fast :)
> > 
> > If there is no standardized infrastructure for
> > getting multiple statistics yet, I can probably
> > whip something up.
> 
> It depends how often it would be called for, I guess.
> 
> One approach would be to pass in a variable-length array of `enum
> lru_list's and get back a same-length array of totals.
> 
> Or perhaps all we need to return is the sum of those totals.
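
A rough sketch of that interface (the helper name and signature below are made up for illustration, not an existing memcg API):

static void mem_cgroup_get_local_zonestats(struct mem_cgroup *memcg,
					   const enum lru_list *idx,
					   unsigned long *totals, int nr)
{
	struct mem_cgroup_per_zone *mz;
	int nid, zid, i;

	for (i = 0; i < nr; i++)
		totals[i] = 0;

	/* gather all requested LRU counters in a single pass */
	for_each_online_node(nid)
		for (zid = 0; zid < MAX_NR_ZONES; zid++) {
			mz = mem_cgroup_zoneinfo(memcg, nid, zid);
			for (i = 0; i < nr; i++)
				totals[i] += MEM_CGROUP_ZSTAT(mz, idx[i]);
		}
}

Callers that only need the sum could then simply add up the returned totals.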
> 
> I'd let the memcg guys worry about this if I were you ;)
> 
> > Optimizing them might make sense if it turns out to
> > use a significant amount of CPU.
> 
> Yeah.  By then it's often too late though.  The sort of people for whom
> (num_online_nodes*MAX_NR_ZONES) is nuttily large tend not to run
> kernel.org kernels.

Good point. We could add a flag that is tested frequently in shrink_list()
and updated less frequently in shrink_zone() (or whatever).
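
A rough sketch of that idea (all names below are hypothetical, not existing vmscan code): recompute the active-vs-inactive comparison once per shrink_zone() pass, and have the shrink_list() fast path only test the cached result.

struct reclaim_hint {
	int inactive_file_low;		/* cached result of the is_low test */
};

/* updated less frequently, e.g. once per shrink_zone() pass */
static void update_reclaim_hint(struct reclaim_hint *hint,
				unsigned long active_file,
				unsigned long inactive_file)
{
	hint->inactive_file_low = active_file > inactive_file;
}

/* tested frequently, e.g. from shrink_list() */
static int inactive_file_is_low_cached(const struct reclaim_hint *hint)
{
	return hint->inactive_file_low;
}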
