Date: Tue, 3 Mar 2009 13:51:05 +0000
From: Mel Gorman <mel@....ul.ie>
To: Nick Piggin <npiggin@...e.de>
Cc: Lin Ming <ming.m.lin@...el.com>,
Pekka Enberg <penberg@...helsinki.fi>,
Linux Memory Management List <linux-mm@...ck.org>,
Rik van Riel <riel@...hat.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Christoph Lameter <cl@...ux-foundation.org>,
Johannes Weiner <hannes@...xchg.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Zhang Yanmin <yanmin_zhang@...ux.intel.com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...e.hu>
Subject: Re: [RFC PATCH 00/19] Cleanup and optimise the page allocator V2
On Tue, Mar 03, 2009 at 10:04:42AM +0100, Nick Piggin wrote:
> On Tue, Mar 03, 2009 at 08:25:12AM +0000, Mel Gorman wrote:
> > On Tue, Mar 03, 2009 at 05:42:40AM +0100, Nick Piggin wrote:
> > > or if some change resulted in more cross-cpu operations then it
> > > could result in worse cache efficiency.
> > >
> >
> > It occurred to me before sleeping last night that there could be a lot
> > of cross-cpu operations taking place in the buddy allocator itself. When
> > bulk-freeing pages, we have to examine all the buddies and merge them. In
> > the case of a freshly booted system, many of the pages of interest will be
> > within the same MAX_ORDER blocks. If multiple CPUs bulk-free their pages,
> > they'll bounce the struct pages between each other a lot as we are writing
> > those cache lines. However, this would be incurred with or without my patches.
>
> Oh yes it would definitely be a factor I think.
>
It's on the list for a second or third pass to investigate.
>
> > > OK, but the dynamic behaviour too. Free page A, free page B, allocate page
> > > A allocate page B etc.
> > >
> > > The hot/cold removal would be an obvious example of what I mean, although
> > > that wasn't included in this recent patchset anyway.
> > >
> >
> > I get your point though; I'll keep it in mind. I've gone from plain
> > "reduce the clock cycles" to "reduce the cache misses"; if OLTP is
> > sensitive to this, it has to be addressed as well.
>
> OK cool. The patchset did look pretty good for reducing clock cycles
> though, so hopefully it turns out to be something simple.
>
I'm hoping it is. I noticed and cleaned up a few oddities where we use more
cache than we need to. However, the strongest candidate for being a
problem is actually the patch that removes the list-search for a page of a
given migratetype in the allocation path. The fix simplifies the allocation
path but increases the complexity of the bulk-free path by quite a bit and
increases the number of cache lines that are accessed. Worse, the fix grows
the per-cpu structure from one cache line to two on x86-64 NUMA machines,
which I think is significant. I'm testing that at the moment but I might
end up dropping the patch from the first pass as a result and confining
the set to "obvious" wins.
--
Mel Gorman
Part-time PhD Student, University of Limerick
Linux Technology Center, IBM Dublin Software Lab
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/