Date:   Mon, 4 Dec 2017 14:22:54 -0800
From:   Andy Lutomirski <luto@...nel.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Andy Lutomirsky <luto@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Borislav Petkov <bpetkov@...e.de>,
        Greg KH <gregkh@...uxfoundation.org>,
        Kees Cook <keescook@...gle.com>,
        Hugh Dickins <hughd@...gle.com>,
        Brian Gerst <brgerst@...il.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Denys Vlasenko <dvlasenk@...hat.com>,
        Rik van Riel <riel@...hat.com>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Juergen Gross <jgross@...e.com>,
        David Laight <David.Laight@...lab.com>,
        Eduardo Valentin <eduval@...zon.com>, aliguori@...zon.com,
        Will Deacon <will.deacon@....com>,
        Daniel Gruss <daniel.gruss@...k.tugraz.at>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Ingo Molnar <mingo@...nel.org>, michael.schwarz@...k.tugraz.at,
        Borislav Petkov <bp@...en8.de>, moritz.lipp@...k.tugraz.at,
        richard.fellner@...dent.tugraz.at
Subject: Re: [patch 51/60] x86/mm: Allow flushing for future ASID switches

On Mon, Dec 4, 2017 at 6:07 AM, Thomas Gleixner <tglx@...utronix.de> wrote:
> From: Dave Hansen <dave.hansen@...ux.intel.com>
>
> If changing the page tables in such a way that an invalidation of all
> contexts (aka. PCIDs / ASIDs) is required, they can be actively invalidated
> by:
>
>  1. INVPCID for each PCID (works for single pages too).
>
>  2. Load CR3 with each PCID without the NOFLUSH bit set
>
>  3. Load CR3 with the NOFLUSH bit set for each and do INVLPG for each address.
>
> But, none of these are really feasible since there are ~6 ASIDs (12 with
> KERNEL_PAGE_TABLE_ISOLATION) at the time that invalidation is required.
> Instead of actively invalidating them, invalidate the *current* context and
> also _quickly_ mark the cpu_tlbstate to indicate that a future invalidation
> is required.
>
> At the next context-switch, look for this indicator
> ('invalidate_other' being set) and invalidate all of the
> cpu_tlbstate.ctxs[] entries.
>
> This ensures that any future context switches will do a full flush
> of the TLB, picking up the previous changes.
>
> Signed-off-by: Dave Hansen <dave.hansen@...ux.intel.com>
> Signed-off-by: Ingo Molnar <mingo@...nel.org>
> Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> Cc: Rik van Riel <riel@...hat.com>
> Cc: Denys Vlasenko <dvlasenk@...hat.com>
> Cc: Andy Lutomirski <luto@...nel.org>
> Cc: michael.schwarz@...k.tugraz.at
> Cc: daniel.gruss@...k.tugraz.at
> Cc: Brian Gerst <brgerst@...il.com>
> Cc: Josh Poimboeuf <jpoimboe@...hat.com>
> Cc: hughd@...gle.com
> Cc: Borislav Petkov <bp@...en8.de>
> Cc: moritz.lipp@...k.tugraz.at
> Cc: keescook@...gle.com
> Cc: Linus Torvalds <torvalds@...ux-foundation.org>
> Cc: richard.fellner@...dent.tugraz.at
> Link: https://lkml.kernel.org/r/20171123003507.E8C327F5@viggo.jf.intel.com
>
> ---
>  arch/x86/include/asm/tlbflush.h |   42 ++++++++++++++++++++++++++++++----------
>  arch/x86/mm/tlb.c               |   37 +++++++++++++++++++++++++++++++++++
>  2 files changed, 69 insertions(+), 10 deletions(-)
>
> --- a/arch/x86/include/asm/tlbflush.h
> +++ b/arch/x86/include/asm/tlbflush.h
> @@ -188,6 +188,17 @@ struct tlb_state {
>         bool is_lazy;
>
>         /*
> +        * If set we changed the page tables in such a way that we
> +        * needed an invalidation of all contexts (aka. PCIDs / ASIDs).
> +        * This tells us to go invalidate all the non-loaded ctxs[]
> +        * on the next context switch.
> +        *
> +        * The current ctx was kept up-to-date as it ran and does not
> +        * need to be invalidated.
> +        */
> +       bool invalidate_other;
> +
> +       /*
>          * Access to this CR4 shadow and to H/W CR4 is protected by
>          * disabling interrupts when modifying either one.
>          */
> @@ -267,6 +278,19 @@ static inline unsigned long cr4_read_sha
>         return this_cpu_read(cpu_tlbstate.cr4);
>  }
>
> +static inline void invalidate_pcid_other(void)
> +{
> +       /*
>         * With global pages, all of the shared kernel page tables
> +        * are set as _PAGE_GLOBAL.  We have no shared nonglobals
> +        * and nothing to do here.
> +        */
> +       if (!static_cpu_has_bug(X86_BUG_CPU_SECURE_MODE_KPTI))
> +               return;

I think I'd be more comfortable if this check were in the caller, not
here.  Shouldn't a function called invalidate_pcid_other() do what the
name says?
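
As a rough illustration of that split (a userspace model, not the patch itself): the helper sets the flag unconditionally, and the caller decides whether other PCIDs can hold stale entries at all. The names struct cpu_model, kpti_enabled and flush_one_kernel_addr() are hypothetical stand-ins; kpti_enabled plays the role of the static_cpu_has_bug() check in the quoted hunk.

```c
#include <stdbool.h>

/* Hypothetical model of the per-CPU state touched by the patch. */
struct cpu_model {
	bool kpti_enabled;	/* stand-in for the static_cpu_has_bug() check */
	bool invalidate_other;	/* models cpu_tlbstate.invalidate_other */
};

/* Does exactly what the name says: no feature check inside. */
static void invalidate_pcid_other(struct cpu_model *cpu)
{
	cpu->invalidate_other = true;
}

/* Hypothetical caller: it owns the decision to skip the invalidation. */
static void flush_one_kernel_addr(struct cpu_model *cpu)
{
	/* A single-page flush of the current context would go here. */

	/*
	 * Without page table isolation the shared kernel mappings are
	 * global, so there are no stale entries in other PCIDs.
	 */
	if (!cpu->kpti_enabled)
		return;

	invalidate_pcid_other(cpu);
}
```

With this arrangement, any other caller that reaches invalidate_pcid_other() always gets the invalidation it asked for.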

> +
> +       this_cpu_write(cpu_tlbstate.invalidate_other, true);

Why do we need this extra variable instead of just looping over all
other ASIDs and invalidating them?  It would be something like:

        for (i = 1; i < TLB_NR_DYN_ASIDS; i++) {
                if (i != this_cpu_read(cpu_tlbstate.loaded_mm_asid))
                        this_cpu_write(cpu_tlbstate.ctxs[i].ctx_id, 0);
        }

modulo epic whitespace damage and possible typos.
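
A self-contained userspace model of that eager loop, for anyone who wants to play with it outside the kernel. Everything here is an assumption made for illustration: struct tlb_state_model and asid_needs_flush() are hypothetical, TLB_NR_DYN_ASIDS is set to 6 to match the "~6 ASIDs" in the changelog, and a ctx_id of 0 is assumed never to match a live mm. Unlike the loop above, the sketch starts at slot 0, on the assumption that slot 0 is an ordinary dynamic ASID.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value; the changelog mentions roughly 6 dynamic ASIDs. */
#define TLB_NR_DYN_ASIDS 6

/* Hypothetical stand-in for one cpu_tlbstate.ctxs[] entry. */
struct tlb_context {
	uint64_t ctx_id;	/* assumed: 0 never matches a live mm */
};

/* Hypothetical stand-in for the per-CPU TLB state. */
struct tlb_state_model {
	unsigned int loaded_mm_asid;
	struct tlb_context ctxs[TLB_NR_DYN_ASIDS];
};

/* Eagerly forget every cached ASID except the one currently loaded. */
static void invalidate_other_asids(struct tlb_state_model *st)
{
	unsigned int i;

	for (i = 0; i < TLB_NR_DYN_ASIDS; i++) {
		if (i != st->loaded_mm_asid)
			st->ctxs[i].ctx_id = 0;
	}
}

/* A slot whose ctx_id no longer matches would force a flush on reuse. */
static bool asid_needs_flush(const struct tlb_state_model *st,
			     unsigned int asid, uint64_t ctx_id)
{
	return st->ctxs[asid].ctx_id != ctx_id;
}
```

In this model the cost is one pass over a handful of slots at invalidation time, and no extra flag has to be checked on the context switch path.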

> +}
> +
>  /*
>   * Save some of cr4 feature set we're using (e.g.  Pentium 4MB
>   * enable and PPro Global page enable), so that any CPU's that boot
> @@ -341,24 +365,22 @@ static inline void __native_flush_tlb_si
>
>  static inline void __flush_tlb_all(void)
>  {
> -       if (boot_cpu_has(X86_FEATURE_PGE))
> +       if (boot_cpu_has(X86_FEATURE_PGE)) {
>                 __flush_tlb_global();
> -       else
> +       } else {
>                 __flush_tlb();
> -
> -       /*
> -        * Note: if we somehow had PCID but not PGE, then this wouldn't work --
> -        * we'd end up flushing kernel translations for the current ASID but
> -        * we might fail to flush kernel translations for other cached ASIDs.
> -        *
> -        * To avoid this issue, we force PCID off if PGE is off.
> -        */
> +       }
>  }
>
>  static inline void __flush_tlb_one(unsigned long addr)
>  {
>         count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ONE);
>         __flush_tlb_single(addr);
> +       /*
> +        * Invalidate other address spaces inaccessible to single-page
> +        * invalidation:
> +        */

Ugh.  If I'm reading this right, __flush_tlb_single() means "flush one
user address" and __flush_tlb_one() means "flush one kernel address".
That's, um, not exactly obvious.  Could this be at least commented
better?

--Andy
