Message-ID: <20090316122505.GD30802@wotan.suse.de>
Date:	Mon, 16 Mar 2009 13:25:05 +0100
From:	Nick Piggin <npiggin@...e.de>
To:	Mel Gorman <mel@....ul.ie>
Cc:	Linux Memory Management List <linux-mm@...ck.org>,
	Pekka Enberg <penberg@...helsinki.fi>,
	Rik van Riel <riel@...hat.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Christoph Lameter <cl@...ux-foundation.org>,
	Johannes Weiner <hannes@...xchg.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Lin Ming <ming.m.lin@...el.com>,
	Zhang Yanmin <yanmin_zhang@...ux.intel.com>,
	Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH 00/35] Cleanup and optimise the page allocator V3

On Mon, Mar 16, 2009 at 12:02:17PM +0000, Mel Gorman wrote:
> On Mon, Mar 16, 2009 at 12:33:58PM +0100, Nick Piggin wrote:
> > Whereas if you defer this until the point you need a higher order
> > page, the only thing you have to work with are the pages that are
> > free *right now*.
> > 
> 
> Well, buddy always uses the smallest available page first. Even with
> deferred coalescing, it will merge up to order-5 at least. Let's say they
> could have merged up to order-10 in ordinary circumstances, they are
> still avoided for as long as possible. Granted, it might mean that an
> order-5 is split that could have been merged but it's hard to tell how
> much of a difference that makes.

But the kinds of pages *you* are interested in are order-10, right?

 
> > Your anti-frag tests probably don't stress this long term fragmentation
> > problem.
> > 
> 
> Probably not, but we have little data on long-term fragmentation other than
> anecdotal evidence that it's ok these days.

Well, I think before anti-frag there was lots of anecdotal evidence
that it's "ok", except for loads heavily using large higher order
allocations. I don't know if we'd have many systems running with
hundreds of days of uptime on such workloads post-anti-frag? 

Google might? But I don't know how long their uptimes are. I expect
we'd have a better idea in a couple more years after the next
enterprise distro release cycles with anti-frag.

 
> > Still, it's significant enough that I think it should be made
> > optional (and arguably default to on) even if it does harm higher
> > order allocations a bit.
> > 
> 
> I could make PAGE_ORDER_MERGE_ORDER a proc tunable? If it's placed as a
> read-mostly variable beside the gfp_zone table, it might even fit in the
> same cache line.

Hmm, possibly. OTOH I don't like tunables. If you don't think it will
be a problem for hugepage allocations, then I would prefer just to
leave it on and 5 by default (or even less? COSTLY_ORDER?)
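
For readers following the merge-order argument, here is a minimal toy
sketch in C of a buddy free path that stops eager merging at a
configurable order. It is an illustration only, not code from the patch
series: the names toy_free, toy_count, toy_reset, merge_order and the
single 1024-page region are all assumptions made up for this example,
and the later on-demand coalescing that a deferred scheme would need is
left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_ORDER 11                        /* orders 0..10, matching the order-10 example */
#define NR_PAGES  (1u << (MAX_ORDER - 1))   /* toy memory: exactly one order-10 block */

/* free_block[order][pfn] is true when a free block of that order starts at pfn. */
static bool free_block[MAX_ORDER][NR_PAGES];

/* Hypothetical knob: eager merging stops once a block reaches this order. */
static unsigned int merge_order = 5;

/* Clear all free lists and set the merge cap. */
static void toy_reset(unsigned int cap)
{
    memset(free_block, 0, sizeof(free_block));
    merge_order = cap;
}

/*
 * Free the block starting at pfn. Merge with its buddy while the block is
 * still below merge_order and the buddy is free. Returns the order at which
 * the block was finally queued.
 */
static unsigned int toy_free(unsigned int pfn, unsigned int order)
{
    assert(order < MAX_ORDER && pfn < NR_PAGES);
    assert((pfn & ((1u << order) - 1)) == 0);   /* pfn must be order-aligned */

    while (order < merge_order && order < MAX_ORDER - 1) {
        unsigned int buddy = pfn ^ (1u << order);

        if (!free_block[order][buddy])
            break;                              /* buddy busy: cannot merge */
        free_block[order][buddy] = false;       /* take buddy off its list */
        pfn &= buddy;                           /* merged block starts at the lower pfn */
        order++;
    }
    free_block[order][pfn] = true;
    return order;
}

/* Count the free blocks currently queued at the given order. */
static unsigned int toy_count(unsigned int order)
{
    unsigned int n = 0;

    for (unsigned int pfn = 0; pfn < NR_PAGES; pfn++)
        n += free_block[order][pfn];
    return n;
}
```

With the cap at 5, freeing all 1024 pages one at a time leaves 32
order-5 blocks and no order-10 block, even though every page is free.
With the cap at 10 the same sequence ends with a single order-10 block.
That gap is the cost to large allocations being weighed above.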
