Message-ID: <CALCETrVtEg256EPMbp1j8RkbaMJNtNge6-h0EoZ3HmRo6DZCLQ@mail.gmail.com>
Date: Fri, 2 Dec 2016 11:30:23 -0800
From: Andy Lutomirski <luto@...capital.net>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Borislav Petkov <bp@...en8.de>, Borislav Petkov <bp@...nel.org>,
Andy Lutomirski <luto@...nel.org>, Peter Anvin <hpa@...or.com>,
"the arch/x86 maintainers" <x86@...nel.org>,
One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Brian Gerst <brgerst@...il.com>,
Matthew Whitehead <tedheadster@...il.com>,
Henrique de Moraes Holschuh <hmh@....eng.br>,
Peter Zijlstra <peterz@...radead.org>,
Andrew Cooper <andrew.cooper3@...rix.com>
Subject: Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation
On Fri, Dec 2, 2016 at 11:24 AM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
> On Fri, Dec 2, 2016 at 11:20 AM, Borislav Petkov <bp@...en8.de> wrote:
>>
>> Something like below?
>
> The optimize-nops thing needs it too, I think.
>
> Again, this will never matter in practice (even if somebody has an i486
> still, the prefetch window size is like 16 bytes or something), but
> from a documentation standpoint it's good.
How's this?
/*
 * This function forces the icache and prefetched instruction stream to
 * catch up with reality in two very specific cases:
 *
 *  a) Text was modified using one virtual address and is about to be executed
 *     from the same physical page at a different virtual address.
 *
 *  b) Text was modified on a different CPU, may subsequently be
 *     executed on this CPU, and you want to make sure the new version
 *     gets executed. This generally means you're calling this in an IPI.
 *
 * If you're calling this for a different reason, you're probably doing
 * it wrong.
 */
static inline void native_sync_core(void) { ... }
The body will do a MOV-to-CR2 followed by jmp 1f; 1:. This sequence
should be guaranteed to flush the pipeline on any real CPU. On Xen it
will do IRET-to-self.
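Roughly, the native path could look like the untested sketch below. The
name sketch_native_sync_core, the SKETCH_KERNEL_CONTEXT macro, and the
value written to CR2 are placeholders for illustration, not taken from
the patch. MOV-to-CR2 is privileged, so it is compiled out unless that
macro is defined, which leaves only the jump when the sketch is built as
an ordinary user program. The Xen IRET-to-self variant is not shown.

```c
/*
 * Illustrative sketch of the native path described above, not the
 * actual kernel implementation.
 */
static inline void sketch_native_sync_core(void)
{
#if defined(__i386__) || defined(__x86_64__)
#ifdef SKETCH_KERNEL_CONTEXT
	/* Assumption: the value written is a placeholder (0). */
	unsigned long val = 0;

	/* MOV to a control register; only legal at CPL 0. */
	__asm__ __volatile__("mov %0, %%cr2" : : "r" (val) : "memory");
#endif
	/* Near jump to the immediately following instruction. */
	__asm__ __volatile__("jmp 1f\n1:" : : : "memory");
#else
	/* Not x86: nothing to do in this sketch beyond a compiler barrier. */
	__asm__ __volatile__("" : : : "memory");
#endif
}
```

A caller would simply invoke sketch_native_sync_core() on the CPU that is
about to execute the modified text, for example from the IPI handler
mentioned in case (b) of the comment.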
I suppose it could be an unconditional IRET-to-self, but that's a good
deal slower and not a whole lot simpler. Although if we start doing
it right, performance won't really matter here.
--Andy