Message-ID: <CA+55aFzb3r8HjjnwKTbya-c7TnS3rK9MU9_zaYdA5Zs9MP7jzw@mail.gmail.com>
Date: Fri, 2 Dec 2016 09:53:52 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Andy Lutomirski <luto@...nel.org>
Cc: Peter Anvin <hpa@...or.com>,
"the arch/x86 maintainers" <x86@...nel.org>,
One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>,
Borislav Petkov <bp@...en8.de>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Brian Gerst <brgerst@...il.com>,
Matthew Whitehead <tedheadster@...il.com>,
Henrique de Moraes Holschuh <hmh@....eng.br>,
Peter Zijlstra <peterz@...radead.org>,
Andrew Cooper <andrew.cooper3@...rix.com>
Subject: Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation
On Fri, Dec 2, 2016 at 9:38 AM, Andy Lutomirski <luto@...nel.org> wrote:
>
> apply_alternatives, unfortunately. It's performance-critical because
> it's intensely stupid and does sync_core() for every single patch.
> Fixing that would be nice, too.
So looking at text_poke_early(), that's very much a case that really
shouldn't need any "sync_core()" at all as far as I can tell.
Only the current CPU is running, and for local CPU I$ coherence all
you need is a jump instruction, and even that is only on really old
CPUs. From the PPro onwards (maybe even Pentium?) the I$ is entirely
serialized as long as you change the data using the same linear
address.
So at most, that function could mark itself "noinline" just to
guarantee that it will cause a control flow change before returning.
The sync_core() seems entirely bogus.
Same goes for optimize_nops() too.
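To make the suggestion concrete, here is a minimal userspace sketch (not the kernel's actual code) of what a text_poke_early()-style helper might look like with the sync_core() dropped. The name text_poke_early_sketch() is hypothetical, the target is an ordinary data buffer rather than live kernel text, and the interrupt disabling the real function does is omitted. It assumes GCC or Clang for the noinline attribute.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for text_poke_early(), for illustration only.
 *
 * noinline guarantees a real call/return pair, so control flow changes
 * before the caller can reach the bytes that were just written. Per the
 * reasoning above, that is the most a single running CPU should need,
 * and only on really old CPUs.
 */
__attribute__((noinline))
void *text_poke_early_sketch(void *addr, const void *opcode, size_t len)
{
	/* Write the new bytes through the same linear address. */
	memcpy(addr, opcode, len);

	/* Deliberately no sync_core() here. */
	return addr;
}
```

This only shows the shape of the change. Whether dropping the serializing instruction is actually safe on every supported CPU is exactly the question under discussion.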
Linus