linux-kernel - Re: [RFC PATCH v2 11/12] x86/mm/tlb: Use async and inline messages for flushing

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20190531105758.GO2623@hirez.programming.kicks-ass.net>
Date:   Fri, 31 May 2019 12:57:58 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Nadav Amit <namit@...are.com>
Cc:     Andy Lutomirski <luto@...nel.org>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...el.com>,
        Ingo Molnar <mingo@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>, x86@...nel.org,
        linux-kernel@...r.kernel.org,
        Dave Hansen <dave.hansen@...ux.intel.com>
Subject: Re: [RFC PATCH v2 11/12] x86/mm/tlb: Use async and inline messages
 for flushing

On Thu, May 30, 2019 at 11:36:44PM -0700, Nadav Amit wrote:
> When we flush userspace mappings, we can defer the TLB flushes, as long
> the following conditions are met:
> 
> 1. No tables are freed, since otherwise speculative page walks might
>    cause machine-checks.
> 
> 2. No one would access userspace before flush takes place. Specifically,
>    NMI handlers and kprobes would avoid accessing userspace.
> 
> Use the new SMP support to execute remote function calls with inlined
> data for the matter. The function remote TLB flushing function would be
> executed asynchronously and the local CPU would continue execution as
> soon as the IPI was delivered, before the function was actually
> executed. Since tlb_flush_info is copied, there is no risk it would
> change before the TLB flush is actually executed.
> 
> Change nmi_uaccess_okay() to check whether a remote TLB flush is
> currently in progress on this CPU by checking whether the asynchronously
> called function is the remote TLB flushing function. The current
> implementation disallows access in such cases, but it is also possible
> to flush the entire TLB in such case and allow access.

ARGGH, brain hurt. I'm not sure I fully understand this one. How is it
different from today, where the NMI can hit in the middle of the TLB
invalidation?

Also; since we're not waiting on the IPI, what prevents us from freeing
the user pages before the remote CPU is 'done' with them? Currently the
synchronous IPI is like a sync point where we *know* the remote CPU is
completely done accessing the page.

Where getting an IPI stops speculation, speculation again restarts
inside the interrupt handler, and until we've passed the INVLPG/MOV CR3,
speculation can happen on that TLB entry, even though we've already
freed and re-used the user-page.

Also, what happens if the TLB invalidation IPI is stuck behind another
smp_function_call IPI that is doing user-access?

As said,.. brain hurts.