Message-ID: <4AB151D7.10402@redhat.com>
Date: Thu, 17 Sep 2009 00:00:07 +0300
From: Avi Kivity <avi@...hat.com>
To: Gregory Haskins <gregory.haskins@...il.com>
CC: "Michael S. Tsirkin" <mst@...hat.com>,
"Ira W. Snyder" <iws@...o.caltech.edu>, netdev@...r.kernel.org,
virtualization@...ts.linux-foundation.org, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, mingo@...e.hu, linux-mm@...ck.org,
akpm@...ux-foundation.org, hpa@...or.com,
Rusty Russell <rusty@...tcorp.com.au>, s.hetze@...ux-ag.com,
alacrityvm-devel@...ts.sourceforge.net
Subject: Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
On 09/16/2009 10:22 PM, Gregory Haskins wrote:
> Avi Kivity wrote:
>
>> On 09/16/2009 05:10 PM, Gregory Haskins wrote:
>>
>>>> If kvm can do it, others can.
>>>>
>>>>
>>> The problem is that you seem to either hand-wave over details like this,
>>> or you give details that are pretty much exactly what vbus does already.
>>> My point is that I've already sat down and thought about these issues
>>> and solved them in a freely available GPL'ed software package.
>>>
>>>
>> In the kernel. IMO that's the wrong place for it.
>>
> 3) "in-kernel": You can do something like virtio-net to vhost to
> potentially meet some of the requirements, but not all.
>
> In order to fully meet (3), you would need to do some of that stuff you
> mentioned in the last reply with muxing device-nr/reg-nr. In addition,
> we need to have a facility for mapping eventfds and establishing a
> signaling mechanism (like PIO+qid), etc. KVM does this with
> IRQFD/IOEVENTFD, but we don't have KVM in this case so it needs to be
> invented.
>
irqfd/eventfd is the abstraction layer; it doesn't need to be re-abstracted.
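From userspace it's just two ioctls, each taking an eventfd; a rough
sketch of what qemu does (values illustrative, error handling omitted):

#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* wire one queue: a PIO doorbell into the kernel and a GSI back out */
static void wire_queue(int vm_fd, unsigned short pio_addr, int gsi)
{
        int kick = eventfd(0, 0);       /* guest -> host doorbell */
        int call = eventfd(0, 0);       /* host -> guest interrupt */

        struct kvm_ioeventfd io = {
                .addr  = pio_addr,
                .len   = 2,
                .fd    = kick,
                .flags = KVM_IOEVENTFD_FLAG_PIO,
        };
        ioctl(vm_fd, KVM_IOEVENTFD, &io);

        struct kvm_irqfd irq = {
                .fd  = call,
                .gsi = gsi,
        };
        ioctl(vm_fd, KVM_IRQFD, &irq);

        /* the in-kernel backend only ever sees the two eventfds; it
         * knows nothing about PIO addresses or GSIs */
}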
> To meet performance, this stuff has to be in kernel and there has to be
> a way to manage it.
and management belongs in userspace.
> Since vbus was designed to do exactly that, this is
> what I would advocate. You could also reinvent these concepts and put
> your own mux and mapping code in place, in addition to all the other
> stuff that vbus does. But I am not clear why anyone would want to.
>
Maybe they like their backward compatibility and Windows support.
> So no, the kernel is not the wrong place for it. It's the _only_ place
> for it. Otherwise, just use (1) and be done with it.
>
>
I'm talking about the config stuff, not the data path.
>> Further, if we adopt
>> vbus, we drop compatibility with existing guests or have to support both
>> vbus and virtio-pci.
>>
> We already need to support both (at least to support Ira). virtio-pci
> doesn't work here. Something else (vbus, or vbus-like) is needed.
>
virtio-ira.
>>> So the question is: is your position that vbus is all wrong and you wish
>>> to create a new bus-like thing to solve the problem?
>>>
>> I don't intend to create anything new, I am satisfied with virtio. If
>> it works for Ira, excellent. If not, too bad.
>>
> I think that about sums it up, then.
>
Yes. I'm all for reusing virtio, but I'm not going to switch to vbus or
support both for this esoteric use case.
>>> If so, how is it
>>> different from what I've already done? More importantly, what specific
>>> objections do you have to what I've done, as perhaps they can be fixed
>>> instead of starting over?
>>>
>>>
>> The two biggest objections are:
>> - the host side is in the kernel
>>
> As it needs to be.
>
vhost-net somehow manages to work without the config stuff in the kernel.
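The control path is a handful of ioctls issued by qemu against
/dev/vhost-net; roughly, against the interface in this patchset
(VHOST_SET_MEM_TABLE and VHOST_SET_VRING_ADDR setup elided, so take it
as a sketch rather than the final interface):

#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

/* userspace does all the configuration; the kernel only processes the
 * rings. kick_fd/call_fd are the same eventfds handed to kvm via
 * KVM_IOEVENTFD/KVM_IRQFD above. */
static int setup_vhost_net(int tap_fd, int kick_fd, int call_fd)
{
        int vhost_fd = open("/dev/vhost-net", O_RDWR);

        ioctl(vhost_fd, VHOST_SET_OWNER, NULL);
        /* VHOST_SET_MEM_TABLE / VHOST_SET_VRING_ADDR would go here */

        struct vhost_vring_state num = { .index = 0, .num = 256 };
        ioctl(vhost_fd, VHOST_SET_VRING_NUM, &num);

        struct vhost_vring_file kick = { .index = 0, .fd = kick_fd };
        ioctl(vhost_fd, VHOST_SET_VRING_KICK, &kick);

        struct vhost_vring_file call = { .index = 0, .fd = call_fd };
        ioctl(vhost_fd, VHOST_SET_VRING_CALL, &call);

        struct vhost_vring_file backend = { .index = 0, .fd = tap_fd };
        ioctl(vhost_fd, VHOST_NET_SET_BACKEND, &backend);

        return vhost_fd;
}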
> With all due respect, based on all of your comments in aggregate I
> really do not think you are truly grasping what I am actually building here.
>
Thanks.
>>> Bingo. So now its a question of do you want to write this layer from
>>> scratch, or re-use my framework.
>>>
>>>
>> You will have to implement a connector or whatever for vbus as well.
>> vbus has more layers, so the connector is probably smaller for vbus.
>>
> Bingo!
(addictive, isn't it)
> That is precisely the point.
>
> All the stuff for how to map eventfds, handle signal mitigation, demux
> device/function pointers, isolation, etc, are built in. All the
> connector has to do is transport the 4-6 verbs and provide a memory
> mapping/copy function, and the rest is reusable. The device models
> would then work in all environments unmodified, and likewise the
> connectors could use all device-models unmodified.
>
Well, virtio has a similar abstraction on the guest side. The host side
abstraction is limited to signalling since all configuration is in
userspace. vhost-net ought to work for lguest and s390 without change.
>> It was already implemented three times for virtio, so apparently that's
>> extensible too.
>>
> And to my point, I'm trying to commoditize as much of that process as
> possible on both the front and backends (at least for cases where
> performance matters) so that you don't need to reinvent the wheel for
> each one.
>
Since you're interested in any-to-any connectors it makes sense to you.
I'm only interested in kvm-host-to-kvm-guest, so reducing the already
minor effort to implement a new virtio binding has little appeal to me.
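For reference, a binding is essentially one ops structure plus a way to
kick the host; a trimmed paraphrase of struct virtio_config_ops (from
memory, so the exact fields may differ):

/* what virtio_pci, lguest and the s390 transport each implement */
struct my_binding_ops {
        /* config space access */
        void (*get)(struct virtio_device *vdev, unsigned offset,
                    void *buf, unsigned len);
        void (*set)(struct virtio_device *vdev, unsigned offset,
                    const void *buf, unsigned len);
        /* status byte and device reset */
        u8   (*get_status)(struct virtio_device *vdev);
        void (*set_status)(struct virtio_device *vdev, u8 status);
        void (*reset)(struct virtio_device *vdev);
        /* discover the virtqueues and hook up their callbacks */
        int  (*find_vqs)(struct virtio_device *vdev, unsigned nvqs,
                         struct virtqueue *vqs[],
                         vq_callback_t *callbacks[],
                         const char *names[]);
        void (*del_vqs)(struct virtio_device *vdev);
        /* feature negotiation */
        u32  (*get_features)(struct virtio_device *vdev);
        void (*finalize_features)(struct virtio_device *vdev);
};

For Ira's boards the notify hook would be a write to a doorbell register
on the PCI bridge instead of a PIO exit.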
>> You mean, if the x86 board was able to access the disks and dma into the
>> ppc boards' memory? You'd run vhost-blk on x86 and virtio-net on ppc.
>>
> But as we discussed, vhost doesn't work well if you try to run it on the
> x86 side due to its assumptions about pageable "guest" memory, right? So
> is that even an option? And even still, you would still need to solve
> the aggregation problem so that multiple devices can coexist.
>
I don't know. Maybe it can be made to work and maybe it cannot. It
probably can with some determined hacking.
--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
--