Message-ID: <1327875921.21193.11.camel@dabdike.int.hansenpartnership.com>
Date: Sun, 29 Jan 2012 16:25:21 -0600
From: James Bottomley <James.Bottomley@...senPartnership.com>
To: Rik van Riel <riel@...hat.com>
Cc: Dan Magenheimer <dan.magenheimer@...cle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Dave Hansen <dave@...ux.vnet.ibm.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Konrad Wilk <konrad.wilk@...cle.com>,
Seth Jennings <sjenning@...ux.vnet.ibm.com>,
Nitin Gupta <ngupta@...are.org>,
Nebojsa Trpkovic <trx.lists@...il.com>, minchan@...nel.org,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Chris Mason <chris.mason@...cle.com>,
lsf-pc@...ts.linux-foundation.org
Subject: Re: [PATCH] mm: implement WasActive page flag (for improving
cleancache)
On Sat, 2012-01-28 at 19:50 -0500, Rik van Riel wrote:
> On 01/27/2012 04:49 PM, James Bottomley wrote:
>
> > So here, I was just saying your desire to store more data in the page
> > table and expand the page flags looks complex.
> >
> > Perhaps we do have a fundamental misunderstanding: For readahead, I
> > don't really care about the referenced part. referenced just means
> > pointed to by one or more vmas and active means pointed to by two or
> > more vmas (unless executable in which case it's one).
>
> That is not at all what "referenced" means everywhere
> else in the VM.
I'm aware there's more subtlety, but I think it's a reasonable
generalisation. Your one-sentence summary of page_referenced() seems
conspicuously absent; care to provide it ... or would you prefer the VM
internals remain inaccessible to mere mortals?
> If you write theories on what Dan should use, it would
> help if you limited yourself to stuff the VM provides
> and/or could provide :)
I didn't give any theories at all about what he should or shouldn't do.
I'm trying to think out loud about whether what he wants and what I
think would help readahead are the same thing (I started off thinking
they were and I talked myself out of it by the end of the previous
email).
> > What I think we care about for readahead is accessed. This means a page
> > that got touched regardless of how many references it has. An
> > unaccessed unaged RA page is a less good candidate for reclaim because
> > it should soon be accessed (under the RA heuristics) than an accessed RA
> > page. Obviously if the heuristics misfire, we end up with futile RA
> > pages, which we read in expecting to be accessed, but which in fact
> > never were (so an unaccessed aged RA page) and need to be evicted.
> >
> > But for me, perhaps it's enough to put unaccessed RA pages into the
> > active list on instantiation and then actually put them in the inactive
> > list when they're accessed
>
> That is an absolutely terrible idea for many obvious reasons.
>
> Having readahead pages displace the working set wholesale
> is the absolute last thing we want.
Um, only if you assume you place them at the most recently used head of
the active list ... for obvious reasons, that's not what I was thinking.
I'm still not sure it's more feasible than having separate lists, though,
since inserting at the most recently used tail is nasty (it reverse
orders them and probably doesn't provide sufficient boost) and middle
insertion looks just plain wrong.
> > I'm less clear on why you think a WasActive() flag is needed. I think
> > you mean a member of the inactive list that was at some point previously
> > active.
>
> > Um, that's complex. Doesn't your inactive-C list really just identify
> > pages that were shared but have sunk in the LRU lists due to lack of
> > use?
>
> Nope. Pages that are not mapped can still end up on the active
> list, by virtue of getting accessed multiple times in a "short"
> period of time (the residence on the inactive list).
>
> We want to cache frequently accessed pages with preference over
> streaming IO data that gets accessed infrequently.
Well, no, that's what I'm trying to argue against. The chances are that
streaming RA I/O gets accessed once (the classic movie scenario). So
the idea is that if you can identify RA as streaming, it should be kept
while unaccessed but discarded after it's been accessed. To get the LRU
lists to identify this, we want to give a boost to unaccessed unaged RA,
a suppression to accessed-once RA, and standard heuristics if RA gets
accessed more than once.
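
To make the three cases above concrete, here is a rough userspace sketch
of the classification. It is not kernel code and not a proposed patch:
struct ra_page, enum ra_hint and ra_reclaim_hint() are hypothetical names
invented for illustration, and the "aged" bit stands in for whatever
mechanism would decide that an unaccessed RA page has waited long enough
to count as a misfire.

```c
#include <stdbool.h>

/* Hypothetical per-page state; these are NOT real kernel fields or flags. */
struct ra_page {
    bool readahead;         /* page was brought in by readahead */
    bool aged;              /* sat unaccessed long enough that RA probably misfired */
    unsigned int accesses;  /* times the page was touched since it was read in */
};

enum ra_hint {
    RA_HINT_BOOST,     /* keep: unaccessed, unaged RA should be accessed soon */
    RA_HINT_SUPPRESS,  /* reclaim early: streaming RA already consumed, or futile RA */
    RA_HINT_STANDARD   /* no special treatment: use the normal LRU heuristics */
};

/* Map a page's readahead state to a reclaim preference. */
static enum ra_hint ra_reclaim_hint(const struct ra_page *p)
{
    if (!p->readahead)
        return RA_HINT_STANDARD;

    if (p->accesses == 0)
        /* Futile RA (unaccessed and aged) needs to go; fresh RA is kept. */
        return p->aged ? RA_HINT_SUPPRESS : RA_HINT_BOOST;

    if (p->accesses == 1)
        /* The movie scenario: read once, unlikely to be read again. */
        return RA_HINT_SUPPRESS;

    /* Accessed more than once: looks like ordinary cached data. */
    return RA_HINT_STANDARD;
}
```

The point of the sketch is only that the decision depends on access
count and age, not on how many vmas point at the page.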
James