Date:	Tue, 22 Dec 2015 14:17:34 -0800 (PST)
From:	David Rientjes <rientjes@...gle.com>
To:	Joonsoo Kim <js1304@...il.com>
cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Vlastimil Babka <vbabka@...e.cz>,
	Aaron Lu <aaron.lu@...el.com>, Mel Gorman <mgorman@...e.de>,
	Rik van Riel <riel@...hat.com>, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, Joonsoo Kim <iamjoonsoo.kim@....com>
Subject: Re: [PATCH 2/2] mm/compaction: speed up pageblock_pfn_to_page() when
 zone is contiguous

On Mon, 21 Dec 2015, Joonsoo Kim wrote:

> A performance drop caused by hugepage allocation has been reported, and
> in that report half of the CPU time is spent in pageblock_pfn_to_page()
> during compaction [1]. In that workload, compaction is triggered to make
> a hugepage, but most pageblocks are unavailable for compaction because
> of their pageblock type and skip bit, so compaction usually fails. The
> most costly operation in this case is finding a valid pageblock while
> scanning the whole zone range. To check whether a pageblock is valid to
> compact, a valid pfn within the pageblock is required, and we can obtain
> it by calling pageblock_pfn_to_page(). This function checks whether the
> pageblock is in a single zone and returns a valid pfn if possible. The
> problem is that we need to do this check every time before scanning a
> pageblock, even if we re-visit it, and this turns out to be very
> expensive in this workload.
> 
> Although we have no way to skip this pageblock check on a system where
> holes exist at arbitrary positions, we can use a cached value for zone
> contiguity and just do pfn_to_page() on a system where no hole exists.
> This optimization considerably speeds up the above workload.
> 
> Before vs After
> Max: 1096 MB/s vs 1325 MB/s
> Min:  635 MB/s vs 1015 MB/s
> Avg:  899 MB/s vs 1194 MB/s
> 
> Avg is improved by roughly 30% [2].
> 

Wow, ok!

I'm wondering if it would be better to maintain this as a characteristic
of each pageblock rather than each zone.  Have you tried to introduce a
couple of new bits to pageblock_bits that would track (1) if a cached
value makes sense and (2) if the pageblock is contiguous?  On the first
call to pageblock_pfn_to_page(), set the first bit, PB_cached, and set
the second bit, PB_contiguous, iff it is contiguous.  On subsequent
calls, if PB_cached is set, then return PB_contiguous.  On memory hot-add
or hot-remove (or init), clear PB_cached.
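
To make the idea concrete, here is a rough userspace sketch of the
two-bit scheme, not kernel code.  Only the names PB_cached and
PB_contiguous come from the text above; the arrays, the toy pageblock
size, the helper names and the per-pfn scan are all made up for
illustration, and the scan merely stands in for whatever validity check
pageblock_pfn_to_page() really performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy geometry, chosen for illustration only. */
#define PAGEBLOCK_NR_PAGES 8UL
#define NR_PAGEBLOCKS      4UL
#define NR_PFNS            (PAGEBLOCK_NR_PAGES * NR_PAGEBLOCKS)

/* Flag names follow the proposal; the bit encoding is an assumption. */
#define PB_CACHED     0x1u
#define PB_CONTIGUOUS 0x2u

static bool pfn_present[NR_PFNS];              /* false marks a hole */
static unsigned char pb_flags[NR_PAGEBLOCKS];  /* hypothetical per-block bits */
static unsigned long slow_checks;              /* counts slow-path scans */

/* Slow path (stand-in): walk the block looking for a hole. */
static bool block_scan_contiguous(unsigned long block)
{
	unsigned long start = block * PAGEBLOCK_NR_PAGES;
	unsigned long i;

	slow_checks++;
	for (i = 0; i < PAGEBLOCK_NR_PAGES; i++)
		if (!pfn_present[start + i])
			return false;
	return true;
}

/* First call fills in the two bits; later calls only read them. */
static bool block_is_contiguous(unsigned long block)
{
	assert(block < NR_PAGEBLOCKS);

	if (!(pb_flags[block] & PB_CACHED)) {
		unsigned char flags = PB_CACHED;

		if (block_scan_contiguous(block))
			flags |= PB_CONTIGUOUS;
		pb_flags[block] = flags;
	}
	return (pb_flags[block] & PB_CONTIGUOUS) != 0;
}

/* Memory hot-add, hot-remove or init: forget every cached answer. */
static void invalidate_block_cache(void)
{
	memset(pb_flags, 0, sizeof(pb_flags));
}
```

With something like this, a re-visited pageblock costs one flag test
instead of a fresh check, and only a hotplug event forces the slow path
to run again.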

What are the cases where pageblock_pfn_to_page() is used for a subset of 
the pageblock and the result would be problematic for compaction?  I.e., 
do we actually care to use pageblocks that are not contiguous at all?