linux-kernel - Re: [RFC PATCH 0/3] generic hypercall support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4A046519.30604@redhat.com>
Date:	Fri, 08 May 2009 20:00:09 +0300
From:	Avi Kivity <avi@...hat.com>
To:	Gregory Haskins <ghaskins@...ell.com>
CC:	Anthony Liguori <anthony@...emonkey.ws>,
	Chris Wright <chrisw@...s-sol.org>,
	Gregory Haskins <gregory.haskins@...il.com>,
	linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [RFC PATCH 0/3] generic hypercall support

Gregory Haskins wrote:
>> Consider nested virtualization where the host (H) runs a guest (G1)
>> which is itself a hypervisor, running a guest (G2).  The host exposes
>> a set of virtio (V1..Vn) devices for guest G1.  Guest G1, rather than
>> creating a new virtio devices and bridging it to one of V1..Vn,
>> assigns virtio device V1 to guest G2, and prays.
>>
>> Now guest G2 issues a hypercall.  Host H traps the hypercall, sees it
>> originated in G1 while in guest mode, so it injects it into G1.  G1
>> examines the parameters but can't make any sense of them, so it
>> returns an error to G2.
>>
>> If this were done using mmio or pio, it would have just worked.  With
>> pio, H would have reflected the pio into G1, G1 would have done the
>> conversion from G2's port number into G1's port number and reissued
>> the pio, finally trapped by H and used to issue the I/O. 
>>     
>
> I might be missing something, but I am not seeing the difference here. 
> We have an "address" (in this case the HC-id) and a context (in this
> case G1 running in non-root mode).   Whether the  trap to H is a HC or a
> PIO, the context tells us that it needs to re-inject the same trap to G1
> for proper handling.  So the "address" is re-injected from H to G1 as an
> emulated trap to G1s root-mode, and we continue (just like the PIO).
>   

So far, so good (though in fact mmio can short-circuit G2->H directly).

> And likewise, in both cases, G1 would (should?) know what to do with
> that "address" as it relates to G2, just as it would need to know what
> the PIO address is for.  Typically this would result in some kind of
> translation of that "address", but I suppose even this is completely
> arbitrary and only G1 knows for sure.  E.g. it might translate from
> hypercall vector X to Y similar to your PIO example, it might completely
> change transports, or it might terminate locally (e.g. emulated device
> in G1).   IOW: G2 might be using hypercalls to talk to G1, and G1 might
> be using MMIO to talk to H.  I don't think it matters from a topology
> perspective (though it might from a performance perspective).
>   

How can you translate a hypercall?  G1's and H's hypercall mechanisms 
can be completely different.


>> So the upshoot is that hypercalls for devices must not be the primary
>> method of communications; they're fine as an optimization, but we
>> should always be able to fall back on something else.  We also need to
>> figure out how G1 can stop V1 from advertising hypercall support.
>>     
> I agree it would be desirable to be able to control this exposure. 
> However, I am not currently convinced its strictly necessary because of
> the reason you mentioned above.  And also note that I am not currently
> convinced its even possible to control it.
>
> For instance, what if G1 is an old KVM, or (dare I say) a completely
> different hypervisor?  You could control things like whether G1 can see
> the VMX/SVM option at a coarse level, but once you expose VMX/SVM, who
> is to say what G1 will expose to G2?  G1 may very well advertise a HC
> feature bit to G2 which may allow G2 to try to make a VMCALL.  How do
> you stop that?
>   

I don't see any way.

If, instead of a hypercall we go through the pio hypercall route, then 
it all resolves itself.  G2 issues a pio hypercall, H bounces it to G1, 
G1 either issues a pio or a pio hypercall depending on what the H and G1 
negotiated.  Of course mmio is faster in this case since it traps directly.

btw, what's the hypercall rate you're seeing? at 10K hypercalls/sec, a 
0.4us difference will buy us 0.4% reduction in cpu load, so let's see 
what's the potential gain here.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/