Message-ID: <20100302100402.GH3852@csn.ul.ie>
Date:	Tue, 2 Mar 2010 10:04:02 +0000
From:	Mel Gorman <mel@....ul.ie>
To:	Nick Piggin <npiggin@...e.de>
Cc:	Christian Ehrhardt <ehrhardt@...ux.vnet.ibm.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	epasch@...ibm.com, SCHILLIG@...ibm.com,
	Martin Schwidefsky <schwidefsky@...ibm.com>,
	Heiko Carstens <heiko.carstens@...ibm.com>,
	christof.schmitt@...ibm.com, thoss@...ibm.com, hare@...e.de,
	gregkh@...ell.com
Subject: Re: Performance regression in scsi sequential throughput (iozone)
	due to "e084b - page-allocator: preserve PFN ordering when
	__GFP_COLD is set"

On Tue, Mar 02, 2010 at 05:52:25PM +1100, Nick Piggin wrote:
> On Fri, Feb 19, 2010 at 03:19:34PM +0000, Mel Gorman wrote:
> > On Fri, Feb 19, 2010 at 12:19:27PM +0100, Christian Ehrhardt wrote:
> > > Eventually it might come down to a discussion of allocation priorities and
> > > we might even keep them as is and accept this issue - I still would prefer
> > > a good second chance implementation, other page cache allocation flags or
> > > something else that explicitly solves this issue.
> > >
> > 
> > In that line, the patch that replaced congestion_wait() with a waitqueue
> > makes some sense.
> > 
> > > Mel's patch that replaces congestion_wait with a wait for the zone watermarks
> > > becoming available again is definitely a step in the right direction and
> > > should go into upstream and the long term support branches.
> > 
> > I'll need to do a number of tests before I can move that upstream but I
> > don't think it's a merge candidate. Unfortunately, I'll be offline for a
> > week starting tomorrow so I won't be able to do the testing.
> > 
> > When I get back, I'll revisit those patches with the view to pushing
> > them upstream. I hate to treat symptoms here without knowing the
> > underlying problem but this has been spinning in circles for ages with
> > little forward progress :(
> 
> The zone pressure waitqueue patch makes sense.

I've just started the rebase and am considering what sort of test is best
for it.

> We may even want to make
> it more strictly FIFO (eg. check upfront if there are waiters on the
> queue before allocating a page, and if yes then add ourself to the back
> of the waitqueue).

To be really strict about this, we'd have to check in the hot-path of the
per-cpu allocator, which would be undesirable. We could check further in the
slow-path, but I bet it'd be very rare that the logic would be triggered. For
a process to enter the FIFO due to waiters that were not yet woken up, a) the
system would have to be under heavy memory pressure, b) reclaim would have to
take so long that check_zone_pressure() is not called in time, and c) a
process would have to exit or otherwise free memory such that the watermarks
are cleared without reclaim being involved.

This seems overkill but maybe you have a simpler case in mind?
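
To make the slow-path variant concrete, here is a rough userspace model of
the decision being discussed. It is only a sketch and not code from the
patch: struct zone_model, slowpath_decide() and the strict_fifo flag are
all made-up names for illustration, and the watermark test is simplified
to a single comparison. The only point is that the FIFO branch is reached
solely when the watermark is already met while earlier waiters are still
queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct zone_model {
	unsigned long free_pages;  /* pages currently free in the zone */
	unsigned long watermark;   /* pages that must remain free after allocating */
	unsigned int nr_waiters;   /* tasks still asleep on the pressure waitqueue */
};

enum slowpath_action {
	SLOWPATH_ALLOCATE,    /* watermark met and nobody is queued ahead of us */
	SLOWPATH_QUEUE_FIFO,  /* watermark met, but earlier waiters go first */
	SLOWPATH_WAIT,        /* watermark not met: sleep until pressure eases */
};

/* Simplified watermark test: enough pages to allocate and stay above the mark. */
static bool watermark_ok(const struct zone_model *z, unsigned long nr_pages)
{
	return z->free_pages >= z->watermark + nr_pages;
}

/*
 * Decide what the slow path would do for a request of nr_pages.
 * With strict_fifo set, a task that could allocate right now still
 * joins the back of the queue if other waiters have not been woken yet.
 */
static enum slowpath_action slowpath_decide(const struct zone_model *z,
					    unsigned long nr_pages,
					    bool strict_fifo)
{
	assert(z != NULL);

	if (!watermark_ok(z, nr_pages))
		return SLOWPATH_WAIT;

	if (strict_fifo && z->nr_waiters > 0)
		return SLOWPATH_QUEUE_FIFO;

	return SLOWPATH_ALLOCATE;
}
```

In this model, SLOWPATH_QUEUE_FIFO needs free_pages to have risen above the
watermark while nr_waiters is still non-zero, which corresponds to case c)
above happening before the waiters are woken.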

> And also possibly even look at doing the wakeups in
> the page-freeing path. Although that might start adding too much
> overhead, so it's quite possible your sloppy-but-lighter timeout
> approach is preferable.
> 

That's how I felt about it. I was going to put another check_zone_pressure()
check after a pcp drain but thought it was too expensive.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab