Message-ID: <4A046519.30604@redhat.com>
Date: Fri, 08 May 2009 20:00:09 +0300
From: Avi Kivity <avi@...hat.com>
To: Gregory Haskins <ghaskins@...ell.com>
CC: Anthony Liguori <anthony@...emonkey.ws>,
Chris Wright <chrisw@...s-sol.org>,
Gregory Haskins <gregory.haskins@...il.com>,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [RFC PATCH 0/3] generic hypercall support

Gregory Haskins wrote:
>> Consider nested virtualization where the host (H) runs a guest (G1)
>> which is itself a hypervisor, running a guest (G2). The host exposes
>> a set of virtio (V1..Vn) devices for guest G1. Guest G1, rather than
>> creating a new virtio device and bridging it to one of V1..Vn,
>> assigns virtio device V1 to guest G2, and prays.
>>
>> Now guest G2 issues a hypercall. Host H traps the hypercall, sees it
>> originated in G1 while in guest mode, so it injects it into G1. G1
>> examines the parameters but can't make any sense of them, so it
>> returns an error to G2.
>>
>> If this were done using mmio or pio, it would have just worked. With
>> pio, H would have reflected the pio into G1, G1 would have done the
>> conversion from G2's port number into G1's port number and reissued
>> the pio, finally trapped by H and used to issue the I/O.
>>
>
> I might be missing something, but I am not seeing the difference here.
> We have an "address" (in this case the HC-id) and a context (in this
> case G1 running in non-root mode). Whether the trap to H is a HC or a
> PIO, the context tells us that it needs to re-inject the same trap to G1
> for proper handling. So the "address" is re-injected from H to G1 as an
> emulated trap to G1's root mode, and we continue (just like the PIO).
>
So far, so good (though in fact mmio can short-circuit G2->H directly).

> And likewise, in both cases, G1 would (should?) know what to do with
> that "address" as it relates to G2, just as it would need to know what
> the PIO address is for. Typically this would result in some kind of
> translation of that "address", but I suppose even this is completely
> arbitrary and only G1 knows for sure. E.g. it might translate from
> hypercall vector X to Y similar to your PIO example, it might completely
> change transports, or it might terminate locally (e.g. emulated device
> in G1). IOW: G2 might be using hypercalls to talk to G1, and G1 might
> be using MMIO to talk to H. I don't think it matters from a topology
> perspective (though it might from a performance perspective).
>
How can you translate a hypercall? G1's and H's hypercall mechanisms
can be completely different.

>> So the upshot is that hypercalls for devices must not be the primary
>> method of communications; they're fine as an optimization, but we
>> should always be able to fall back on something else. We also need to
>> figure out how G1 can stop V1 from advertising hypercall support.
>>
> I agree it would be desirable to be able to control this exposure.
> However, I am not currently convinced it's strictly necessary because of
> the reason you mentioned above. And also note that I am not currently
> convinced it's even possible to control it.
>
> For instance, what if G1 is an old KVM, or (dare I say) a completely
> different hypervisor? You could control things like whether G1 can see
> the VMX/SVM option at a coarse level, but once you expose VMX/SVM, who
> is to say what G1 will expose to G2? G1 may very well advertise a HC
> feature bit to G2 which may allow G2 to try to make a VMCALL. How do
> you stop that?
>
I don't see any way.

If, instead of a hypercall, we go through the pio hypercall route, then
it all resolves itself. G2 issues a pio hypercall, H bounces it to G1,
G1 either issues a pio or a pio hypercall depending on what the H and G1
negotiated. Of course mmio is faster in this case since it traps directly.
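The routing described above can be modelled as a toy sketch. This is not KVM code: the class names, method names, port numbers, and the "ok"/"error" return strings are all hypothetical, chosen only to show who translates what. The point is that G1 maps G2's port to its own port and then picks whichever transport it negotiated with H.

```python
from dataclasses import dataclass, field


@dataclass
class G1:
    """Nested hypervisor (hypothetical model, not a real KVM structure)."""
    port_map: dict           # G2 port number -> G1 port number
    use_pio_hypercall: bool  # what G1 and H negotiated

    def handle_reflected(self, host, g2_port):
        # G1 is the only party that knows how G2's port maps to its own.
        g1_port = self.port_map.get(g2_port)
        if g1_port is None:
            return "error"   # G1 does not recognize this port
        transport = "pio-hypercall" if self.use_pio_hypercall else "pio"
        return host.trap_from_g1(transport, g1_port)


@dataclass
class Host:
    """Host H; records the I/O it finally issues."""
    issued: list = field(default_factory=list)

    def trap_from_g2(self, g1, g2_port):
        # The exit came from the nested guest, so H bounces it to G1.
        return g1.handle_reflected(self, g2_port)

    def trap_from_g1(self, transport, g1_port):
        # The reissued access from G1 traps to H, which performs the I/O.
        self.issued.append((transport, g1_port))
        return "ok"


if __name__ == "__main__":
    host = Host()
    g1 = G1(port_map={0x10: 0xC010}, use_pio_hypercall=False)
    print(host.trap_from_g2(g1, 0x10), host.issued)
```

Nothing in this model requires H and G1 to share a hypercall ABI, since only port numbers cross the G2/G1 boundary.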

btw, what's the hypercall rate you're seeing? At 10K hypercalls/sec, a
0.4us difference will buy us a 0.4% reduction in cpu load, so let's see
what the potential gain is here.
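The estimate above is just rate times per-call saving. Here is a minimal sketch of that arithmetic, assuming the saving is spent entirely on one CPU; the function name and the extra rates in the loop are illustrative and not measurements from this thread.

```python
def cpu_load_saved(calls_per_sec: float, saved_us_per_call: float) -> float:
    """Fraction of one CPU saved per second of wall-clock time."""
    # calls/sec * microseconds/call * 1e-6 = seconds saved per second
    return calls_per_sec * saved_us_per_call * 1e-6


if __name__ == "__main__":
    # 10K/sec is the rate from the message; the higher rates are hypothetical.
    for rate in (10_000, 100_000, 1_000_000):
        print(f"{rate:>9} calls/s: {cpu_load_saved(rate, 0.4):.1%} of one CPU")
```

The gain scales linearly with the call rate, which is why the measured rate matters before optimizing the exit path.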
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.