linux-kernel - Re: [RFC PATCH 0/2] x86: Fix missing core serialization on migration

Message-ID: <CALCETrXpR7ai047pHtdQe5J+FpuFO5ekeeEqLUt1wVLopyNt_Q@mail.gmail.com>
Date:   Tue, 14 Nov 2017 08:16:09 -0800
From:   Andy Lutomirski <luto@...nel.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Andy Lutomirski <luto@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Avi Kivity <avi@...lladb.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        linux-api <linux-api@...r.kernel.org>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Andrew Hunter <ahh@...gle.com>,
        maged michael <maged.michael@...il.com>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Paul Mackerras <paulus@...ba.org>,
        Michael Ellerman <mpe@...erman.id.au>,
        Dave Watson <davejwatson@...com>,
        Ingo Molnar <mingo@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        Andrea Parri <parri.andrea@...il.com>,
        "Russell King, ARM Linux" <linux@...linux.org.uk>,
        Greg Hackmann <ghackmann@...gle.com>,
        Will Deacon <will.deacon@....com>,
        David Sehr <sehr@...gle.com>, x86 <x86@...nel.org>
Subject: Re: [RFC PATCH 0/2] x86: Fix missing core serialization on migration

On Tue, Nov 14, 2017 at 8:13 AM, Thomas Gleixner <tglx@...utronix.de> wrote:
> On Tue, 14 Nov 2017, Andy Lutomirski wrote:
>> On Tue, Nov 14, 2017 at 8:05 AM, Peter Zijlstra <peterz@...radead.org> wrote:
>> > On Tue, Nov 14, 2017 at 03:17:12PM +0000, Mathieu Desnoyers wrote:
>> >> I've tried to create a small single-threaded self-modifying loop in
>> >> user-space to trigger a trace cache or speculative execution quirk,
>> >> but I have not succeeded yet. I suspect that I would need to know
>> >> more about the internals of the processor architecture to create the
>> >> right stalls that would allow speculative execution to move further
>> >> ahead, and trigger an incoherent execution flow. Ideas on how to
>> >> trigger this would be welcome.
>> >
>> > I thought the whole problem was per definition multi-threaded.
>> >
>> > Single-threaded stuff can't get out of sync with itself; you'll always
>> > observe your own stores.
>> >
>> > And ISTR the JIT scenario being something like the JIT overwriting
>> > previously executed but supposedly no longer used code. And in this
>> > scenario you'd want to guarantee all CPUs observe the new code before
>> > jumping into it.
>> >
>> > The current approach is using mprotect(), except that on a number of
>> > platforms the TLB invalidate from that is not guaranteed to be strong
>> > enough to sync for code changes.
>> >
>> > On x86 the mprotect() should work just fine, since we broadcast IPIs for
>> > the TLB invalidate and the IRET from those will get the things synced up
>> > again (if nothing else; very likely we'll have done a MOV-CR3 which will
>> > of course also have sufficient syncness on it).
>> >
>> > But PowerPC, s390, ARM et al that do TLB invalidates without interrupts
>> > and don't guarantee their TLB invalidate sync against execution units
>> > are left broken by this scheme.
>> >
>>
>> On x86 single-thread, you can still get in trouble, I think.  Do a
>> store, get migrated, execute the stored code.  There's no actual
>> guarantee that the new CPU does a CR3 load due to laziness.
>
> The migration IPI will probably prevent that.

What guarantees that there's an IPI?  Do we never do a syscall, get
migrated during syscall processing (due to cond_resched(), for
example), and land on another CPU that just happened to already be
scheduling?

--Andy