lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 23 Dec 2015 15:57:27 +0900
From:	Joonsoo Kim <iamjoonsoo.kim@....com>
To:	Vlastimil Babka <vbabka@...e.cz>
Cc:	David Rientjes <rientjes@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Aaron Lu <aaron.lu@...el.com>, Mel Gorman <mgorman@...e.de>,
	Rik van Riel <riel@...hat.com>, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org
Subject: Re: [PATCH 2/2] mm/compaction: speed up pageblock_pfn_to_page() when
 zone is contiguous

On Wed, Dec 23, 2015 at 07:14:21AM +0100, Vlastimil Babka wrote:
> On 22.12.2015 23:17, David Rientjes wrote:
> > On Mon, 21 Dec 2015, Joonsoo Kim wrote:
> > 
> >> Before vs After
> >> Max: 1096 MB/s vs 1325 MB/s
> >> Min: 635 MB/s 1015 MB/s
> >> Avg: 899 MB/s 1194 MB/s
> >>
> >> Avg is improved by roughly 30% [2].
> >>
> > 
> > Wow, ok!
> > 
> > I'm wondering if it would be better to maintain this as a characteristic 
> > of each pageblock rather than each zone.  Have you tried to introduce a 
> > couple new bits to pageblock_bits that would track (1) if a cached value 
> > makes sense and (2) if the pageblock is contiguous?  On the first call to 
> > pageblock_pfn_to_page(), set the first bit, PB_cached, and set the second 
> > bit, PB_contiguous, iff it is contiguous.  On subsequent calls, if 
> > PB_cached is true, then return PB_contiguous.  On memory hot-add or 
> > remove (or init), clear PB_cached.
> 
> I can imagine these bitmap operation to be as expensive as what
> __pageblock_pfn_to_page() does (or close)? But if not, we could also just be a
> bit smarter about PG_skip and check that before doing pfn_to_page.

Although I don't think carefully, to get PB_xxx, we need to check pfn's zone
and it requires pfn_valid() and pfn_to_page(). So, I guess that cost would be
same or half compared to cost of __pageblock_pfn_to_page().

> 
> > What are the cases where pageblock_pfn_to_page() is used for a subset of 
> > the pageblock and the result would be problematic for compaction?  I.e., 
> > do we actually care to use pageblocks that are not contiguous at all?
> 
> The problematic pageblocks are those that have pages from more than one zone in
> them, so we just skip them. Supposedly that can only happen by switching once
> between two zones somewhere in the middle of the pageblock, so it's sufficient
> to check first and last pfn and compare their zones. So using
> pageblock_pfn_to_page() on a subset from compaction would be wrong. Holes (==no
> pages) within pageblock is a different thing checked by pfn_valid_within()
> (#defined out on archs where such holes cannot happen) when scanning the block.
> 
> That's why I'm not entirely happy with how the patch conflates both the
> first/last pfn's zone checks and pfn_valid_within() checks. Yes, a fully
> contiguous zone does *imply* that pageblock_pfn_to_page() doesn't have to check
> first/last pfn for a matching zone. But it's not *equality*. And any (now just
> *potential*) user of pageblock_pfn_to_page() with pfn's different than
> first/last pfn of a pageblock is likely wrong.

Now, I understand your concern. What makes me mislead is that
3 of 4 callers to pageblock_pfn_to_page() in compaction.c could call it with
non-pageblock boundary pfn. Maybe, they should be fixed first. Then, yes. I can
separate first/last pfn's zone checks and pfn_valid_within() checks.
If then, would you be entirely happy? :)

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ