Message-ID: <82DB7035-D7BE-4D79-BBC0-B271FB4BF740@vmware.com>
Date:   Fri, 31 May 2019 19:31:10 +0000
From:   Nadav Amit <namit@...are.com>
To:     Andy Lutomirski <luto@...capital.net>
CC:     Peter Zijlstra <peterz@...radead.org>,
        Andy Lutomirski <luto@...nel.org>,
        Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...el.com>,
        Ingo Molnar <mingo@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        "x86@...nel.org" <x86@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>
Subject: Re: [RFC PATCH v2 11/12] x86/mm/tlb: Use async and inline messages
 for flushing

> On May 31, 2019, at 11:44 AM, Andy Lutomirski <luto@...capital.net> wrote:
> 
> 
> 
>> On May 31, 2019, at 3:57 AM, Peter Zijlstra <peterz@...radead.org> wrote:
>> 
>>> On Thu, May 30, 2019 at 11:36:44PM -0700, Nadav Amit wrote:
>>> When we flush userspace mappings, we can defer the TLB flushes, as long
>>> as the following conditions are met:
>>> 
>>> 1. No tables are freed, since otherwise speculative page walks might
>>>  cause machine-checks.
>>> 
>>> 2. No one would access userspace before flush takes place. Specifically,
>>>  NMI handlers and kprobes would avoid accessing userspace.
>>> 
>>> Use the new SMP support to execute remote function calls with inlined
>>> data for this purpose. The remote TLB flushing function would be
>>> executed asynchronously, and the local CPU would continue execution as
>>> soon as the IPI was delivered, before the function was actually
>>> executed. Since tlb_flush_info is copied, there is no risk that it
>>> changes before the TLB flush is actually executed.
>>> 
>>> Change nmi_uaccess_okay() to check whether a remote TLB flush is
>>> currently in progress on this CPU by checking whether the asynchronously
>>> called function is the remote TLB flushing function. The current
>>> implementation disallows access in such cases, but it is also possible
>>> to flush the entire TLB in such a case and allow access.
>> 
>> ARGGH, brain hurt. I'm not sure I fully understand this one. How is it
>> different from today, where the NMI can hit in the middle of the TLB
>> invalidation?
>> 
>> Also; since we're not waiting on the IPI, what prevents us from freeing
>> the user pages before the remote CPU is 'done' with them? Currently the
>> synchronous IPI is like a sync point where we *know* the remote CPU is
>> completely done accessing the page.
>> 
>> While getting an IPI stops speculation, speculation restarts inside
>> the interrupt handler, and until we've passed the INVLPG/MOV CR3,
>> speculation can still happen on that TLB entry, even though we've
>> already freed and re-used the user page.
>> 
>> Also, what happens if the TLB invalidation IPI is stuck behind another
>> smp_function_call IPI that is doing user-access?
>> 
>> As said,.. brain hurts.
> 
> Speculation aside, any code doing dirty tracking needs the flush to happen
> for real before it reads the dirty bit.
> 
> How does this patch guarantee that the flush is really done before someone
> depends on it?

I was always under the impression that the dirty bit is pass-through: the
A/D assist walks the tables and sets the dirty bit upon access. Otherwise,
what happens when you invalidate a PTE that has already been marked
non-present? Would the CPU set the dirty bit at that point?

In this regard, I remember this thread of Dave Hansen's [1], which also
seems to me to support the notion that the dirty bit is set on write and
not on INVLPG.

The SDM section on the "Virtual TLB Scheme" also seems to support this
claim: the guest dirty bit is set in section 32.3.5.2, "Response to Page
Faults", and not in section 32.3.5, "Response to Uses of INVLPG".

Am I wrong?


[1] https://groups.google.com/forum/#!topic/linux.kernel/HBgh0uT24K8
