Date:	Mon, 20 Apr 2009 16:45:03 +0300
From:	Avi Kivity <avi@...hat.com>
To:	Gerd Hoffmann <kraxel@...hat.com>
CC:	Anthony Liguori <anthony@...emonkey.ws>,
	Huang Ying <ying.huang@...el.com>,
	"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Andi Kleen <andi@...stfloor.org>
Subject: Re: [PATCH] Add MCE support to KVM

Gerd Hoffmann wrote:
> On 04/20/09 14:43, Avi Kivity wrote:
>> Gerd Hoffmann wrote:
>>>> That said, I'd like to be able to emulate the Xen HVM hypercalls.
>>>> But in any case, the hypercall implementation has to be in the
>>>> kernel,
>>>
>>> No. With Xenner the xen hypercall emulation code lives in guest
>>> address space.
>>
>> In this case the guest ring-0 code should trap the #GP, and install the
>> hypercall page (which uses sysenter/syscall?). No kvm or qemu changes
>> needed.
>
> Doesn't fly.
>
> Reason #1: In the pv-on-hvm case the guest runs in ring 0.

Sure, in this case you need to trap the MSR in the kernel (or qemu).  
But the handler is no longer in the guest address space, and you do need 
to update the opcode.

Let's not confuse the two cases.
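
For the pv-on-hvm case, that MSR trap could look roughly like the
sketch below.  Illustrative only, not actual kvm or qemu code:
wrmsr_intercept() and map_gpa() are invented names.  The MSR number and
the 32-byte stub layout follow the Xen HVM ABI, and picking vmcall
vs. vmmcall per vendor is the "opcode update" mentioned above:

#include <stdint.h>
#include <string.h>

#define XEN_HYPERCALL_MSR	0x40000000u	/* Xen HVM default */
#define NR_HYPERCALLS		64

/* Fill a guest page with 32-byte hypercall stubs:
 * mov $nr, %eax; vmcall (Intel) or vmmcall (AMD); ret. */
static void fill_hypercall_page(uint8_t *page, int is_amd)
{
	for (uint32_t nr = 0; nr < NR_HYPERCALLS; nr++) {
		uint8_t *stub = page + nr * 32;

		stub[0] = 0xb8;			/* mov $nr, %eax */
		memcpy(stub + 1, &nr, 4);	/* imm32, little endian */
		stub[5] = 0x0f;			/* 0f 01 c1 = vmcall  */
		stub[6] = 0x01;			/* 0f 01 d9 = vmmcall */
		stub[7] = is_amd ? 0xd9 : 0xc1;
		stub[8] = 0xc3;			/* ret */
	}
}

/* Hypothetical WRMSR intercept; map_gpa() stands in for whatever maps
 * a guest physical page into our address space. */
static void wrmsr_intercept(uint32_t msr, uint64_t val,
			    uint8_t *(*map_gpa)(uint64_t), int is_amd)
{
	if (msr == XEN_HYPERCALL_MSR)
		fill_hypercall_page(map_gpa(val & ~(uint64_t)0xfff),
				    is_amd);
}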

> Reason #2: Chicken-egg issue:  For the pv-on-hvm case only few,
>            simple hypercalls are needed.  The code to handle them
>            is small enough that it can be loaded directly into the
>            hypercall page(s).

Please elaborate.  What hypercalls are so simple that an exit into the 
hypervisor is not necessary?

>>> Is there any reason to? I *think* xen does it for better scheduling
>>> latency. But with xen emulation sitting in guest address space we can
>>> schedule the guest at will anyway.
>>
>> It also improves latency within the guest itself.  At least I think
>> that's what the Hyper-V spec is saying.  You can interrupt the
>> execution of a long hypercall, inject an interrupt, and resume.  Sort
>> of like a rep/movs instruction, which the cpu can and will interrupt.
>
> Hmm.  Needs investigation...  I'd expect the main source of latencies 
> is page table walking.  Xen works very differently from kvm+xenner here ...

kvm is mostly O(1).  We need to limit rmap chains, but we're fairly 
close.  The kvm paravirt mmu calls are not O(1), but we can easily use 
continuations there (and they're disabled on newer processors anyway).
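
By "continuations" I mean roughly the pattern below -- illustrative C,
not the real kvm mmu code; interrupt_pending() and apply_one_mmu_op()
are stand-ins for whatever kvm would actually use:

#include <stdbool.h>
#include <stddef.h>

#define CHUNK 16	/* ops per slice; bounds the interrupt latency */

struct mmu_batch {
	size_t nr_ops;	/* total operations in the batched call */
	size_t done;	/* continuation cursor, kept across re-entry */
};

/* Stand-ins for the real thing. */
static bool interrupt_pending(void) { return false; }
static void apply_one_mmu_op(struct mmu_batch *b, size_t i)
{
	(void)b; (void)i;
}

/* Returns true when the batch completed; false means "re-execute the
 * hypercall instruction", which gives kvm a window to inject the
 * pending interrupt before we resume from b->done. */
static bool mmu_batch_continue(struct mmu_batch *b)
{
	while (b->done < b->nr_ops) {
		size_t end = b->done + CHUNK;

		if (end > b->nr_ops)
			end = b->nr_ops;
		while (b->done < end)
			apply_one_mmu_op(b, b->done++);
		if (b->done < b->nr_ops && interrupt_pending())
			return false;	/* resume later from b->done */
	}
	return true;
}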

Another area that worries me is virtio notification, which can take a 
long time.  It won't be trivial, but we can make it work:

- for the existing pio-to-userspace notification, add a bit that tells 
the kernel to repeat the instruction instead of continuing.  The 'outl' 
instruction is idempotent, so we can do partial work and return to the 
kernel.
- if using hypercallfd/piofd to a pipe, we're offloading everything to 
another thread anyway, so we can return immediately
- if using hypercallfd/piofd to a kernel virtio server, it can return 0 
bytes written, indicating it needs a retry; kvm can try to inject an 
interrupt if it sees this (a rough sketch follows).
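
That last convention could look something like this.  A hedged sketch
only: hypercallfd/piofd don't exist yet, so virtio_server_notify() and
handle_virtio_notify() are invented names standing in for whatever
interface we end up with:

#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Stand-in for the in-kernel virtio server's notify hook.  By the
 * convention above, a return of 0 means "busy, retry the instruction". */
static ssize_t virtio_server_notify(int piofd, const void *buf, size_t len)
{
	(void)piofd; (void)buf; (void)len;
	return 0;			/* pretend we're busy */
}

/* Called on a guest 'outl' to the virtio notify port.  Since outl is
 * idempotent, returning false and leaving rip on the instruction is
 * safe; kvm can inject a pending interrupt before re-executing it. */
static bool handle_virtio_notify(int piofd, unsigned long queue)
{
	if (virtio_server_notify(piofd, &queue, sizeof(queue)) == 0)
		return false;		/* retry later */
	return true;			/* done, advance past the outl */
}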


-- 
error compiling committee.c: too many arguments to function
