Date:   Fri, 23 Feb 2018 17:49:37 -0800
From:   Dave Hansen <dave.hansen@...ux.intel.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Andrew Lutomirski <luto@...nel.org>,
        Kees Cook <keescook@...gle.com>,
        Hugh Dickins <hughd@...gle.com>,
        Jürgen Groß <jgross@...e.com>,
        the arch/x86 maintainers <x86@...nel.org>, namit@...are.com
Subject: Re: [RFC][PATCH 00/10] Use global pages with PTI

On 02/22/2018 01:52 PM, Linus Torvalds wrote:
> Side note - and this may be crazy talk - I wonder if it might make
> sense to have a mode where we allow executable read-only kernel pages
> to be marked global too (but only in the kernel mapping).

We did that accidentally somewhere.  It causes machine checks on K8s,
IIRC, which is fun (52994c256df fixed it).  So we'd need to make sure
we avoid it there, or just make it global in the user mapping too.
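
To spell out the constraint (my sketch, not code from the series): the
kernel and user aliases of a page have to agree on the Global bit, so we
either strip _PAGE_GLOBAL from the kernel-text PTE or set it on the user
side as well.  A standalone illustration of the masking, with _PAGE_GLOBAL
defined locally and a hypothetical helper name:

/*
 * Standalone sketch, mine, not the code from the series: the rule that
 * avoids the K8 machine checks is "kernel and user aliases of a page must
 * agree on the Global bit".  _PAGE_GLOBAL here is just the x86 PTE Global
 * bit (bit 8) defined locally, and the helper is hypothetical.
 */
#include <stdint.h>
#include <stdio.h>

#define _PAGE_GLOBAL (1ULL << 8)	/* x86 PTE bit 8: Global */

/* Strip Global from the kernel PTE unless the user alias also has it. */
static uint64_t keep_aliases_consistent(uint64_t kern_pte, uint64_t user_pte)
{
	if (!(user_pte & _PAGE_GLOBAL))
		kern_pte &= ~_PAGE_GLOBAL;
	return kern_pte;
}

int main(void)
{
	uint64_t kern = 0x1234063ULL | _PAGE_GLOBAL;	/* made-up PTE value */
	uint64_t user = 0x1234063ULL;			/* user alias, no Global */

	printf("kernel PTE %#llx -> %#llx\n",
	       (unsigned long long)kern,
	       (unsigned long long)keep_aliases_consistent(kern, user));
	return 0;
}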

> Of course, maybe the performance advantage from keeping the ITLB
> entries around isn't huge, but this *may* be worth at least asking
> some Intel architects about?

I kinda doubt it's worth the trouble.  Like you said, this probably
doesn't even matter when we have PCID support.  Also, we'll usually map
all of this text with 2M pages, minus whatever hangs over into the last
2M page of text.  My laptop looks like this:

> 0xffffffff81000000-0xffffffff81c00000          12M     ro         PSE         x  pmd
> 0xffffffff81c00000-0xffffffff81c0b000          44K     ro                     x  pte

So, even if we've flushed these entries, we can get all of them back
with a single cacheline's worth of PMD entries.
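
Quick arithmetic behind that claim (mine): 12M of text at 2M per PMD
mapping is 6 entries, 8 bytes each on x86-64, so 48 bytes, which fits in
one 64-byte cacheline.  A throwaway check:

/* Throwaway arithmetic check, mine: how many PMD entries cover 12M of text? */
#include <stdio.h>

int main(void)
{
	const unsigned long text_bytes = 12UL << 20;	/* 12M of kernel text */
	const unsigned long pmd_bytes  = 2UL  << 20;	/* one 2M PSE/PMD mapping */
	const unsigned long entry_size = 8;		/* bytes per PMD entry, x86-64 */
	const unsigned long cacheline  = 64;

	unsigned long entries = text_bytes / pmd_bytes;	/* 6 */
	unsigned long bytes   = entries * entry_size;	/* 48 */

	printf("%lu PMD entries = %lu bytes, %s one %lu-byte cacheline\n",
	       entries, bytes, bytes <= cacheline ? "within" : "more than",
	       cacheline);
	return 0;
}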

Just for fun, I tried a 4-core Skylake system with KPTI and nopcid and
compiled a random kernel 10 times.  I did three configs: no global, all
kernel text global + cpu_entry_area, and only cpu_entry_area + entry
text.  The delta percentages are relative to the Baseline.  The deltas
are measurable, but the largest bang for our buck is obviously the entry
text.

                        User Time       Kernel Time     Clock Elapsed
Baseline  (33 GLB PTEs) 907.6           81.6            264.7
Entry     (28 GLB PTEs) 910.9 (+0.4%)   84.0 (+2.9%)    265.2 (+0.2%)
No global ( 0 GLB PTEs) 914.2 (+0.7%)   89.2 (+9.3%)    267.8 (+1.2%)
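
If you want to recompute the delta percentages from the raw numbers, this
throwaway program reproduces the +x.x% figures; the times are the ones in
the table above, the program is just mine:

/* Recompute the table's delta percentages from the raw times above (mine). */
#include <stdio.h>

int main(void)
{
	const char *name[]    = { "Entry    (28 GLB PTEs)", "No global( 0 GLB PTEs)" };
	const double base[]   = { 907.6, 81.6, 264.7 };	/* Baseline row */
	const double row[][3] = {
		{ 910.9, 84.0, 265.2 },			/* Entry */
		{ 914.2, 89.2, 267.8 },			/* No global */
	};

	for (int i = 0; i < 2; i++) {
		printf("%s", name[i]);
		for (int j = 0; j < 3; j++)
			printf("\t%.1f (+%.1f%%)", row[i][j],
			       100.0 * (row[i][j] - base[j]) / base[j]);
		printf("\n");
	}
	return 0;
}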

It's a single line of code to go from the "33" to the "28" configuration,
so it's totally doable.  But it means having and parsing another boot
option that confuses people, and then I have to go write actual
documentation, which I detest. :)
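
Just to make that cost concrete, the extra boot option would be roughly
along these lines (kernel-style sketch; the option name and variable are
invented here, not taken from the series, and it obviously doesn't build
on its own):

/*
 * Hypothetical kernel-style sketch, not from the series and not standalone:
 * the "another boot option" would look roughly like this, plus an entry in
 * Documentation/admin-guide/kernel-parameters.txt.  Option name and
 * variable are invented.
 */
static bool pti_text_global __ro_after_init = true;

static int __init nokerneltextglobal_setup(char *arg)
{
	/* Fall back to only cpu_entry_area + entry text staying Global. */
	pti_text_global = false;
	return 0;
}
early_param("nokerneltextglobal", nokerneltextglobal_setup);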

My inclination would be to just mark the "entry" stuff global, as this
set left things, and leave it at that.

I also measured frontend stalls with the toplev.py tool[1].  They show
roughly the same thing, but a bit magnified, since I was only monitoring
the kernel and because, in some of these cases, even if we stop being
iTLB-bound we just bottleneck on something else.

I ran:

	python ~/pmu-tools/toplev.py --kernel --level 3 make -j8

And looked for the relevant ITLB misses in the output:

Baseline:
> FE             Frontend_Bound:        24.33 % Slots  [  7.68%]
> FE                ITLB_Misses:         5.16 % Clocks [  7.73%]
Entry:
> FE             Frontend_Bound:        26.62 % Slots  [  7.75%]
> FE                ITLB_Misses:        12.50 % Clocks [  7.74%]
No global:
> FE             Frontend_Bound:        27.58 % Slots  [  7.65%]
> FE                ITLB_Misses:        14.74 % Clocks [  7.71%]

1. https://github.com/andikleen/pmu-tools
