linux-kernel - Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync

Message-ID: <CA+55aFxMenPUpkFCKKmYpa-4RQnr6BvwD685W_bVT9da3ibFLg@mail.gmail.com>
Date:   Fri, 2 Dec 2016 15:09:57 -0800
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Andy Lutomirski <luto@...capital.net>
Cc:     Borislav Petkov <bp@...en8.de>, Borislav Petkov <bp@...nel.org>,
        Andy Lutomirski <luto@...nel.org>, Peter Anvin <hpa@...or.com>,
        "the arch/x86 maintainers" <x86@...nel.org>,
        One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Brian Gerst <brgerst@...il.com>,
        Matthew Whitehead <tedheadster@...il.com>,
        Henrique de Moraes Holschuh <hmh@....eng.br>,
        Peter Zijlstra <peterz@...radead.org>,
        Andrew Cooper <andrew.cooper3@...rix.com>
Subject: Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation

On Fri, Dec 2, 2016 at 2:55 PM, Andy Lutomirski <luto@...capital.net> wrote:
>>
>> Honestly, I think Intel should clean up their documentation.
>
> I'm not sure I follow.  If a user program gets migrated, it might end
> up doing cross-modification when it expects self-modification.  If
> that trips the program up, is that a user bug or a kernel bug?

Well, the user may not see it as a cross-modification.

Example: user compiles a program, and writes out the new binary. That
write goes to the page cache.

The user then immediately executes that program.

It's technically a "cross modification", because the build that wrote
the page cache ran on one CPU, and then it gets loaded on another.

Now, page faulting the binary does bring in a known serializing
instruction: iret.

But let's theorize that we have your "optimistic sysret" return path
because sometimes it can happen. So the "iret" isn't exactly
fundamental.

But we know we will write %cr2, which is a serializing instruction.

But that's not fundamental either, because we _could_ have a
program just load the object file into its own address space using the
dynamic linker. And if you don't unmap anything, there won't be any
TLB flushes.
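
To make the scenario concrete, here is a minimal user-space sketch, not
anything from the kernel tree. It assumes x86-64 Linux and a temporary
directory that allows executable mappings; the function name, the file
path template, and the six-byte payload (mov $42,%eax; ret) are made up
for illustration. The bytes are written as plain data, then mapped
executable and called, with no serializing instruction issued by the
program itself. Whatever the kernel does on the mmap and page fault
paths is exactly the part under discussion, and nothing here forces the
write and the call onto different CPUs.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical demo: returns 42 if the freshly written code ran,
 * or -1 if the experiment could not be set up on this system.
 */
int run_freshly_written_code(void)
{
#if defined(__x86_64__) && defined(__linux__)
    /* x86-64 machine code for: mov $42, %eax ; ret */
    static const unsigned char demo_code[] = {
        0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3
    };
    char path[] = "/tmp/icache-demo-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);   /* the file lives on only through fd */

    /* "Build" step: the bytes reach the page cache as ordinary data. */
    if (write(fd, demo_code, sizeof(demo_code)) != (ssize_t)sizeof(demo_code)) {
        close(fd);
        return -1;
    }

    /* "Loader" step: map the same page executable; nothing is unmapped first. */
    void *p = mmap(NULL, sizeof(demo_code), PROT_READ | PROT_EXEC,
                   MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;  /* e.g. /tmp mounted noexec */

    /* Jump into bytes that were plain data a moment ago. */
    int (*fn)(void) = (int (*)(void))p;
    int result = fn();

    munmap(p, sizeof(demo_code));
    return result;
#else
    return -1;      /* the payload above is x86-64 only */
#endif
}
```

Calling run_freshly_written_code() from main() and printing the result
is enough to try it. A return of -1 only means the setup failed (other
architecture, noexec mount, and so on) and says nothing about coherency.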

Now, that is safe _too_, but by then we're relying simply on the
fact that the code couldn't even have been in any virtual
caches in the running environment, so it _must_ have come from the
physically indexed data cache. So no actual serializing instruction
is even _needed_.

So there is no room for any I$ coherency issue at any point, but
note how we got to the point where we're now basically depending on
some fairly fundamental logic that is not in the Intel documentation?

THAT is what I don't like. I don't doubt for a moment that what we're
doing is entirely coherent, and we're fine. But the Intel memory
ordering documentation simply doesn't cover this situation at all. The
"real" memory ordering documentation only covers the normal data
cache. And then they handwave the "self-modifying code" situation with
incomplete examples and just bullshit "you need a serializing
instruction", which clearly isn't actually the case, and is also
something that we very much don't even do.

It would be better if there was actual documentation, and we had some
nice clear "yeah, we don't need any stinking serializing instructions,
because we're already doing X".

              Linus