Date:   Tue, 27 Aug 2019 19:46:52 +0000
From:   Nadav Amit <namit@...are.com>
To:     Dave Hansen <dave.hansen@...el.com>
CC:     Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        the arch/x86 maintainers <x86@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>
Subject: Re: [RFC PATCH v2 2/3] x86/mm/tlb: Defer PTI flushes

> On Aug 27, 2019, at 11:28 AM, Dave Hansen <dave.hansen@...el.com> wrote:
> 
> On 8/23/19 3:52 PM, Nadav Amit wrote:
>> INVPCID is considerably slower than INVLPG of a single PTE. Using it to
>> flush the user page-tables when PTI is enabled therefore introduces
>> significant overhead.
> 
> I'm not sure this is worth all the churn, especially in the entry code.
> For large flushes (> tlb_single_page_flush_ceiling), we don't do
> INVPCIDs in the first place.

It is possible to jump from flush_tlb_func() into the trampoline page
instead of flushing the TLB in the entry code. However, that induces higher
overhead (switching CR3s), so it would only be useful if multiple TLB entries
are flushed at once. It also prevents exploiting opportunities to promote
individual entry flushes into a full-TLB flush when multiple flushes are
issued or when a context switch takes place before returning to user space.
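
As a rough illustration of the promotion idea only (a user-space model, not
code from the patch; struct deferred_flush, MAX_DEFERRED and all function
names are hypothetical), deferred single-entry flushes could be tracked like
this and collapsed into one full flush when too many pile up or a context
switch happens first:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limit on queued single-entry flushes; not taken from the patch. */
#define MAX_DEFERRED 4

struct deferred_flush {
	unsigned long addr[MAX_DEFERRED];	/* pending user addresses */
	size_t nr;				/* number of pending entries */
	bool full;				/* promoted to a full flush */
};

/* Queue one address; promote to a full flush once the queue overflows. */
static void defer_flush(struct deferred_flush *d, unsigned long addr)
{
	assert(d != NULL);
	if (d->full)
		return;
	if (d->nr == MAX_DEFERRED) {
		d->full = true;
		d->nr = 0;
		return;
	}
	d->addr[d->nr++] = addr;
}

/* Model assumption: a context switch turns any pending work into one full flush. */
static void defer_context_switch(struct deferred_flush *d)
{
	assert(d != NULL);
	if (d->nr || d->full) {
		d->full = true;
		d->nr = 0;
	}
}

/*
 * Called on the (modelled) return to user space. Returns the number of
 * single-entry flushes that would be issued, or -1 for one full flush,
 * and resets the pending state.
 */
static int flush_on_return_to_user(struct deferred_flush *d)
{
	int ret;

	assert(d != NULL);
	ret = d->full ? -1 : (int)d->nr;
	d->nr = 0;
	d->full = false;
	return ret;
}
```

In this model, two deferred entries cost two single-entry flushes on return,
while five entries, or any pending entry followed by a context switch, cost a
single full flush instead.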

There are cases/workloads that flush multiple (but not too many) TLB entries
on every syscall, for instance when issuing msync() or running the Apache web
server. So I am not sure that tlb_single_page_flush_ceiling saves the day.
Besides, you may want to recalibrate (lower) tlb_single_page_flush_ceiling
when PTI is used.
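
For reference, the ceiling check under discussion can be sketched roughly as
follows (a simplified user-space model, not the kernel code itself;
needs_full_flush() is a hypothetical name, 4 KiB pages are assumed, and the
ceiling is passed in as a parameter rather than read from the real tunable):

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Returns true when the range [start, end) spans more pages than the
 * ceiling, in which case a full flush is used and no per-page flushes
 * (and hence no per-page INVPCIDs) are issued.
 */
static bool needs_full_flush(unsigned long start, unsigned long end,
			     unsigned long ceiling)
{
	unsigned long nr_pages;

	assert(end >= start);
	nr_pages = (end - start) >> PAGE_SHIFT;
	return nr_pages > ceiling;
}
```

A flush of a few pages stays below the ceiling and is done entry by entry,
which is the case where the per-entry cost matters; lowering the ceiling
moves more of those flushes over to the full-flush path.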

> I'd really want to understand what the heck is going on that makes
> INVPCID so slow, first.

INVPCID-single is slow (slower than INVLPG by even more than the 133 cycles
that you mentioned; I don’t have the numbers in front of me). I thought that
this was a known fact, although, obviously, it does not make much sense.
