Message-ID: <c96373d0-c16a-4463-147c-8624ad90af61@amd.com>
Date: Mon, 9 Apr 2018 13:04:19 -0500
From: Tom Lendacky <thomas.lendacky@....com>
To: Dave Hansen <dave.hansen@...ux.intel.com>,
linux-kernel@...r.kernel.org
Cc: linux-mm@...ck.org, aarcange@...hat.com, luto@...nel.org,
torvalds@...ux-foundation.org, keescook@...gle.com,
hughd@...gle.com, jgross@...e.com, x86@...nel.org, namit@...are.com
Subject: Re: [PATCH 00/11] [v5] Use global pages with PTI
On 4/6/2018 3:55 PM, Dave Hansen wrote:
> Changes from v4
> * Fix compile error reported by Tom Lendacky
This built with CONFIG_RANDOMIZE_BASE=y, but failed to boot successfully.
I think you're missing the initialization of __default_kernel_pte_mask in
kaslr.c.
Thanks,
Tom
> * Avoid setting _PAGE_GLOBAL on non-present entries
>
> Changes from v3:
> * Fix whitespace issue noticed by willy
> * Clarify comments about X86_FEATURE_PGE checks
> * Clarify commit message around the necessity of _PAGE_GLOBAL
> filtering when CR4.PGE=0 or PGE is unsupported.
>
> Changes from v2:
>
> * Add performance numbers to changelogs
> * Fix compile error resulting from use of x86-specific
> __default_kernel_pte_mask in arch-generic mm/early_ioremap.c
> * Delay kernel text cloning until after we are done messing
> with it (patch 11).
> * Blacklist K8 explicitly from mapping all kernel text as
> global (this should never happen because K8 does not use
> pti when pti=auto, but we are on the safe side). (patch 11)
>
> --
>
> The later versions of the KAISER patches (pre-PTI) allowed the
> user/kernel shared areas to be GLOBAL. The thought was that this would
> reduce the TLB overhead of keeping two copies of these mappings.
>
> During the switch over to PTI, we seem to have lost our ability to have
> GLOBAL mappings. This adds them back.
>
> To measure the benefits of this, I took a modern Atom system without
> PCIDs and ran a microbenchmark[1] (higher is better):
>
> No Global Lines (baseline ): 6077741 lseeks/sec
> 88 Global Lines (kern entry): 7528609 lseeks/sec (+23.9%)
> 94 Global Lines (all ktext ): 8433111 lseeks/sec (+38.8%)
>
> On a modern Skylake desktop with PCIDs, the benefits are tangible, but not
> huge:
>
> No Global pages (baseline): 15783951 lseeks/sec
> 28 Global pages (this set): 16054688 lseeks/sec
>                              +270737 lseeks/sec (+1.71%)
>
> I also double-checked with a kernel compile on the Skylake system (lower
> is better):
>
> No Global pages (baseline): 186.951 seconds time elapsed ( +- 0.35% )
> 28 Global pages (this set): 185.756 seconds time elapsed ( +- 0.09% )
>                              -1.195 seconds (-0.64%)
>
> 1. https://github.com/antonblanchard/will-it-scale/blob/master/tests/lseek1.c
>
> Cc: Andrea Arcangeli <aarcange@...hat.com>
> Cc: Andy Lutomirski <luto@...nel.org>
> Cc: Linus Torvalds <torvalds@...ux-foundation.org>
> Cc: Kees Cook <keescook@...gle.com>
> Cc: Hugh Dickins <hughd@...gle.com>
> Cc: Juergen Gross <jgross@...e.com>
> Cc: x86@...nel.org
> Cc: Nadav Amit <namit@...are.com>
>