lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+55aFw3P++jGbRhkrM3MyqwZZ3mxmxQTC1BfYS2rhw6P3LKNA@mail.gmail.com>
Date:   Fri, 2 Dec 2016 09:32:45 -0800
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Andy Lutomirski <luto@...nel.org>, Peter Anvin <hpa@...or.com>
Cc:     "the arch/x86 maintainers" <x86@...nel.org>,
        One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>,
        Borislav Petkov <bp@...en8.de>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Brian Gerst <brgerst@...il.com>,
        Matthew Whitehead <tedheadster@...il.com>,
        Henrique de Moraes Holschuh <hmh@....eng.br>,
        Peter Zijlstra <peterz@...radead.org>,
        Andrew Cooper <andrew.cooper3@...rix.com>
Subject: Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation

On Thu, Dec 1, 2016 at 4:35 PM, Andy Lutomirski <luto@...nel.org> wrote:
>
> On my laptop, CPUID(eax=1, ecx=0) is ~83ns and IRET-to-self is
> ~110ns.  But Xen PV will trap CPUID if possible, so IRET-to-self
> should end up being a nice speedup.

So if we care deeply about the performance of this, we should really
ask ourselves how much we need this...

There are *very* few places where we really need to do a full
serializing instruction, and I'd worry that we really don't need it in
many of the places we do this.

The only real case I'm aware of is modifying code that is modified
through a different linear address than it's executed.

Is there anything else where we _really_ need this sync-core thing?
Sure, the microcode loader looks fine, but that doesn't look
particularly performance-critical either.

So I'd like to know which sync_core is actually so
performance-critical that w e care about it, and then I'd like to
understand why it's needed at all, because I suspect a number of them
has been added with the model of "sprinkle random things around and
hope".

Looking at ftrace, for example, which is one of the users, does it
actually _really_ need sync_core() at all? It seems to use the kerrnel
virtual address, and then the instruction stream will be coherent.

Adding Peter Anvin to the participants list, because iirc he was the
one who really talked to hardwre engineers about the synchronization
issues with serializing kernel code.

                Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ