Message-ID: <402ae08c-22d3-bec7-6649-26632c941a29@oracle.com>
Date: Fri, 2 Dec 2016 13:50:18 -0500
From: Boris Ostrovsky <boris.ostrovsky@...cle.com>
To: Andrew Cooper <andrew.cooper3@...rix.com>,
Andy Lutomirski <luto@...nel.org>, x86@...nel.org
Cc: One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>,
Borislav Petkov <bp@...en8.de>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Brian Gerst <brgerst@...il.com>,
Matthew Whitehead <tedheadster@...il.com>,
Henrique de Moraes Holschuh <hmh@....eng.br>,
Peter Zijlstra <peterz@...radead.org>,
Xen-devel List <xen-devel@...ts.xen.org>,
Juergen Gross <JGross@...e.com>
Subject: Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core()
implementation
On 12/02/2016 06:44 AM, Andrew Cooper wrote:
> On 02/12/16 00:35, Andy Lutomirski wrote:
>> On Xen PV, CPUID is likely to trap, and Xen hypercalls aren't
>> guaranteed to serialize. (Even CPUID isn't *really* guaranteed to
>> serialize on Xen PV, but, in practice, any trap it generates will
>> serialize.)
> Well, Xen will enable CPUID Faulting wherever it can, which is
> realistically all IvyBridge hardware and newer.
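
For reference, the sync_core() this series replaces is essentially a
serialising CPUID, i.e. something along these lines (a sketch from
memory, not the exact kernel code):

static inline void cpuid_sync_core(void)
{
        unsigned int eax = 1, ebx, ecx = 0, edx;

        /*
         * CPUID is architecturally serialising; leaf 1 is used because
         * it is available everywhere.  On Xen PV each execution traps
         * (via CPUID faulting or the pvops hook), hence the cost.
         */
        asm volatile ("cpuid"
                      : "+a" (eax), "=b" (ebx), "+c" (ecx), "=d" (edx)
                      : : "memory");
}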
>
> All hypercalls are a privilege change to cpl0. I'd hope that
> transition is serialising, but I can't actually find any documentation
> proving or disproving it.
>
>> On my laptop, CPUID(eax=1, ecx=0) is ~83ns and IRET-to-self is
>> ~110ns. But Xen PV will trap CPUID if possible, so IRET-to-self
>> should end up being a nice speedup.
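
FWIW, the comparison is easy to reproduce from userspace, since iretq
always pops ss:rsp in long mode and so works without a privilege change
too. A rough, unscientific sketch (rdtsc is unfenced, the numbers are
ballpark only; build with gcc -O2 -mno-red-zone, because the iret
sequence pushes below %rsp):

#include <stdio.h>
#include <stdint.h>

static inline uint64_t rdtsc(void)
{
        unsigned int lo, hi;

        asm volatile ("rdtsc" : "=a" (lo), "=d" (hi));
        return ((uint64_t)hi << 32) | lo;
}

static inline void cpuid_1_0(void)
{
        unsigned int eax = 1, ebx, ecx = 0, edx;

        asm volatile ("cpuid"
                      : "+a" (eax), "=b" (ebx), "+c" (ecx), "=d" (edx)
                      : : "memory");
}

static inline void iret_to_self(void)
{
        unsigned long tmp;

        /*
         * Same five-word frame as the 64-bit hunk below, except the
         * return address is pushed RIP-relatively so this also links
         * as PIE.
         */
        asm volatile (
                "movq %%ss, %0\n\t"
                "pushq %0\n\t"
                "pushq %%rsp\n\t"
                "addq $8, (%%rsp)\n\t"
                "pushfq\n\t"
                "movq %%cs, %0\n\t"
                "pushq %0\n\t"
                "leaq 1f(%%rip), %0\n\t"
                "pushq %0\n\t"
                "iretq\n\t"
                "1:"
                : "=r" (tmp) : : "cc", "memory");
}

int main(void)
{
        enum { N = 1000000 };
        uint64_t t0, t1;
        int i;

        t0 = rdtsc();
        for (i = 0; i < N; i++)
                cpuid_1_0();
        t1 = rdtsc();
        printf("cpuid(1,0):   %.1f cycles\n", (double)(t1 - t0) / N);

        t0 = rdtsc();
        for (i = 0; i < N; i++)
                iret_to_self();
        t1 = rdtsc();
        printf("iret-to-self: %.1f cycles\n", (double)(t1 - t0) / N);
        return 0;
}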
>>
>> Cc: Andrew Cooper <andrew.cooper3@...rix.com>
>> Signed-off-by: Andy Lutomirski <luto@...nel.org>
> CC'ing xen-devel and the Xen maintainers in Linux.
>
> As this is the only email from this series in my inbox, I will say this
> here, but it should really be against patch 6.
>
> A write to %cr2 is apparently (http://sandpile.org/x86/coherent.htm) not
> serialising on the 486, but I don't have a manual to hand to check.
>
> ~Andrew
>
>> ---
>>  arch/x86/xen/enlighten.c | 35 +++++++++++++++++++++++++++++++++++
>>  1 file changed, 35 insertions(+)
>>
>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>> index bdd855685403..1f765b41eee7 100644
>> --- a/arch/x86/xen/enlighten.c
>> +++ b/arch/x86/xen/enlighten.c
>> @@ -311,6 +311,39 @@ static __read_mostly unsigned int cpuid_leaf1_ecx_set_mask;
>>  static __read_mostly unsigned int cpuid_leaf5_ecx_val;
>>  static __read_mostly unsigned int cpuid_leaf5_edx_val;
>>  
>> +static void xen_sync_core(void)
>> +{
>> +        register void *__sp asm(_ASM_SP);
>> +
>> +#ifdef CONFIG_X86_32
>> +        asm volatile (
>> +                "pushl %%ss\n\t"
>> +                "pushl %%esp\n\t"
>> +                "addl $4, (%%esp)\n\t"
>> +                "pushfl\n\t"
>> +                "pushl %%cs\n\t"
>> +                "pushl $1f\n\t"
>> +                "iret\n\t"
>> +                "1:"
>> +                : "+r" (__sp) : : "cc");
This breaks 32-bit PV guests.
Why are we pushing %ss? We are not changing privilege levels, so why not
push just flags, cs and eip (which, incidentally, does work)?
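
Something along these lines (an untested sketch, reusing the __sp
operand already declared in xen_sync_core()) ought to be sufficient:

        asm volatile (
                "pushfl\n\t"
                "pushl %%cs\n\t"
                "pushl $1f\n\t"
                "iret\n\t"
                "1:"
                : "+r" (__sp) : : "cc");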
-boris
>> +#else
>> +        unsigned long tmp;
>> +
>> +        asm volatile (
>> +                "movq %%ss, %0\n\t"
>> +                "pushq %0\n\t"
>> +                "pushq %%rsp\n\t"
>> +                "addq $8, (%%rsp)\n\t"
>> +                "pushfq\n\t"
>> +                "movq %%cs, %0\n\t"
>> +                "pushq %0\n\t"
>> +                "pushq $1f\n\t"
>> +                "iretq\n\t"
>> +                "1:"
>> +                : "=r" (tmp), "+r" (__sp) : : "cc");
>> +#endif
>> +}
>> +
>>  static void xen_cpuid(unsigned int *ax, unsigned int *bx,
>>                        unsigned int *cx, unsigned int *dx)
>>  {
>> @@ -1289,6 +1322,8 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
>>  
>>          .start_context_switch = paravirt_start_context_switch,
>>          .end_context_switch = xen_end_context_switch,
>> +
>> +        .sync_core = xen_sync_core,
>>  };
>>  
>>  static void xen_reboot(int reason)