linux-kernel - Re: [PATCH -mm] mm: more likely reclaim MADV

Date:	Mon, 21 Jul 2008 15:49:00 +1000
From:	Nick Piggin <nickpiggin@...oo.com.au>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	"KOSAKI Motohiro" <kosaki.motohiro@...fujitsu.com>,
	"Johannes Weiner" <hannes@...urebad.de>,
	"Rik van Riel" <riel@...hat.com>,
	"Peter Zijlstra" <peterz@...radead.org>,
	Nossum <vegard.nossum@...il.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH -mm] mm: more likely reclaim MADV_SEQUENTIAL mappings

On Monday 21 July 2008 11:48, Andrew Morton wrote:
> On Mon, 21 Jul 2008 09:09:26 +0900 "KOSAKI Motohiro" <kosaki.motohiro@...fujitsu.com> wrote:
> > Hi Johannes,
> >
> > > File pages accessed only once through sequential-read mappings between
> > > fault and scan time are perfect candidates for reclaim.
> > >
> > > This patch makes page_referenced() ignore these singular references and
> > > the pages stay on the inactive list where they likely fall victim to
> > > the next reclaim phase.
> > >
> > > Already activated pages are still treated normally.  If they were
> > > accessed multiple times and therefore promoted to the active list, we
> > > probably want to keep them.
> > >
> > > Benchmarks show that big (relative to the system's memory)
> > > MADV_SEQUENTIAL mappings read sequentially cause much less kernel
> > > activity.  Especially less LRU moving-around because we never activate
> > > read-once pages in the first place just to demote them again.
> > >
> > > And leaving these perfect reclaim candidates on the inactive list makes
> > > it more likely for the real working set to survive the next reclaim
> > > scan.
> >
> > Looks good to me.
> > Actually, I made a similar patch half a year ago.
> >
> > In my experience:
> >   - page_referenced_one() is a performance-critical point,
> >     so you should run some benchmarks.
> >   - That patch improved mmapped-copy performance by about 5%.
> >     (Of course, you should test on current -mm; the MM code has
> >     changed widely since then.)
> >
> > So I'm looking forward to your test results :)
>
> The change seems logical and I queued it for 2.6.28.
>
> But yes, testing for what-does-this-improve is good and useful, but so
> is testing for what-does-this-worsen.  How do we do that in this case?

It's OK, but as always I worry about adding "cool new bells and
whistles" to make already-bad code work a bit faster. The extra
code and branches slow things down for everyone else. A mispredicted
branch, by the way, is about as costly as an atomic operation.

It is already bad because: if you are doing a big streaming copy
which you know is going to blow the cache and not be used again,
then you should be unmapping behind you as you go. If you do not
do this, then page reclaim has to do the rmap walk, page table
walk, and then the (unbatched, likely IPI delivered) TLB shootdown
for every page. Not to mention churning through the LRU and
chucking other things out just to find these pages.

So what you actually should do is use direct IO, or unmap the pages
behind you and use fadvise to throw them out of the cache.
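
A minimal sketch of the "unmap behind you as you go" pattern, assuming
a Linux/POSIX system. The helper name stream_sum and the byte-sum
workload are hypothetical stand-ins for whatever the streaming reader
really does; only mmap, madvise, munmap and posix_fadvise are real
calls. Whether this beats the in-kernel heuristic on a given workload
would need measuring.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: sum every byte of `path`, mapping one window at a
 * time and discarding each window as soon as it has been consumed.
 * Returns 0 on success and -1 on error; the sum is stored in *sum_out.
 */
int stream_sum(const char *path, size_t window, uint64_t *sum_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        page = 4096;                /* assumed fallback page size */

    /* mmap offsets must be page aligned, so round the window up. */
    window = (window + (size_t)page - 1) / (size_t)page * (size_t)page;
    if (window == 0)
        window = (size_t)page;

    uint64_t sum = 0;
    for (off_t off = 0; off < st.st_size; off += (off_t)window) {
        size_t len = (size_t)(st.st_size - off);
        if (len > window)
            len = window;

        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, off);
        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }

        /* Advisory hint only; the result is deliberately ignored. */
        madvise(p, len, MADV_SEQUENTIAL);

        for (size_t i = 0; i < len; i++)
            sum += p[i];

        /* Unmap behind us so reclaim finds no mapping left to tear down. */
        munmap(p, len);

        /* Ask the kernel to drop the now-unneeded page cache range. */
        posix_fadvise(fd, off, (off_t)len, POSIX_FADV_DONTNEED);
    }

    close(fd);
    *sum_out = sum;
    return 0;
}
```

The window size is the tuning knob here: a larger window means fewer
mmap/munmap calls but more cache held at any one time.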

Adding code and branches to get a 5% speedup on an already terribly
suboptimal microbenchmark is not a very good trade.
