Message-ID: <5069CCF9.7040309@linux.intel.com>
Date:	Mon, 01 Oct 2012 10:03:53 -0700
From:	"H. Peter Anvin" <hpa@...ux.intel.com>
To:	Andrea Arcangeli <aarcange@...hat.com>
CC:	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
	Andi Kleen <ak@...ux.intel.com>, linux-kernel@...r.kernel.org,
	"Kirill A. Shutemov" <kirill@...temov.name>,
	Arnd Bergmann <arnd@...db.de>, Ingo Molnar <mingo@...nel.org>,
	linux-arch@...r.kernel.org
Subject: Re: [PATCH 0/3] Virtual huge zero page

On 10/01/2012 09:31 AM, Andrea Arcangeli wrote:
> On Mon, Oct 01, 2012 at 08:34:28AM -0700, H. Peter Anvin wrote:
>> On 09/29/2012 06:48 AM, Andrea Arcangeli wrote:
>>>
>>> There would be a small cache benefit here... but even then some
>>> first-level caches are virtually indexed IIRC (always physically
>>> tagged so that software doesn't notice), and virtually indexed ones
>>> won't get any benefit.
>>>
>>
>> Not quite.  The virtual indexing is limited to a few bits (e.g. three
>> bits on K8); the right way to deal with that is to color the zero page,
>> both the regular one and the virtual one (the virtual one would cycle
>> through all the colors repeatedly).
>>
>> The cache difference, therefore, is *huge*.
> 
> Kirill measured the cache benefit and it provided a 6% gain, not very
> huge but certainly significant.
> 
>> It's a performance tradeoff, and it can, and should, be measured.
> 
> I now measured the other side of the trade, by touching only one
> character in every 4k page of the range to simulate a heavily seeking
> load; measured that way, the physical huge zero page wins by a 600%
> margin, so if the cache benefit is huge for the virtual zero page, the
> TLB benefit is massive for the physical zero page.
> 
> Overall I think picking the solution that risks regressing the least
> (also compared to the current status of no zero page) is the safest.
> 

Something isn't quite right about that.  If you look at your numbers:

1,049,134,961 LLC-loads
        6,222 LLC-load-misses

This is another way of saying that in your benchmark the huge zero page
is parked in your LLC, using up 2 MB of it, typically a significant
portion of said cache.  In a real-life application that would squeeze
out real data, but in your benchmark the system is artificially
quiescent.

It is well known that microbenchmarks can be horribly misleading.  What
led to Kirill investigating huge zero page in the first place was the
fact that some applications/macrobenchmarks benefit, and I think those
are the right thing to look at.

	-hpa



