netdev - Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 15 Sep 2009 16:08:23 -0400
From:	Gregory Haskins <gregory.haskins@...il.com>
To:	Avi Kivity <avi@...hat.com>
CC:	"Michael S. Tsirkin" <mst@...hat.com>,
	"Ira W. Snyder" <iws@...o.caltech.edu>, netdev@...r.kernel.org,
	virtualization@...ts.linux-foundation.org, kvm@...r.kernel.org,
	linux-kernel@...r.kernel.org, mingo@...e.hu, linux-mm@...ck.org,
	akpm@...ux-foundation.org, hpa@...or.com,
	Rusty Russell <rusty@...tcorp.com.au>, s.hetze@...ux-ag.com,
	alacrityvm-devel@...ts.sourceforge.net
Subject: Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server

Avi Kivity wrote:
> On 09/15/2009 04:50 PM, Gregory Haskins wrote:
>>> Why?  vhost will call get_user_pages() or copy_*_user() which ought to
>>> do the right thing.
>>>      
>> I was speaking generally, not specifically to Ira's architecture.  What
>> I mean is that vbus was designed to work without assuming that the
>> memory is pageable.  There are environments in which the host is not
>> capable of mapping hvas/*page, but the memctx->copy_to/copy_from
>> paradigm could still work (think rdma, for instance).
>>    
> 
> Sure, vbus is more flexible here.
> 
>>>> As an aside: a bigger issue is that, iiuc, Ira wants more than a single
>>>> ethernet channel in his design (multiple ethernets, consoles, etc).  A
>>>> vhost solution in this environment is incomplete.
>>>>
>>>>        
>>> Why?  Instantiate as many vhost-nets as needed.
>>>      
>> a) what about non-ethernets?
>>    
> 
> There's virtio-console, virtio-blk etc.  None of these have kernel-mode
> servers, but these could be implemented if/when needed.

IIUC, Ira already needs at least ethernet and console capability.

> 
>> b) what do you suppose this protocol to aggregate the connections would
>> look like? (hint: this is what a vbus-connector does).
>>    
> 
> You mean multilink?  You expose the device as a multiqueue.

No, what I mean is how do you surface multiple ethernet and consoles to
the guests?  For Ira's case, I think he needs at minimum at least one of
each, and he mentioned possibly having two unique ethernets at one point.

His slave boards surface themselves as PCI devices to the x86
host.  So how do you use that to make multiple vhost-based devices (say
two virtio-nets, and a virtio-console) communicate across the transport?

There are multiple ways to do this, but what I am saying is that
whatever is conceived will start to look eerily like a vbus-connector,
since this is one of its primary purposes ;)


> 
>> c) how do you manage the configuration, especially on a per-board basis?
>>    
> 
> pci (for kvm/x86).

Ok, for kvm understood (and I would also add "qemu" to that mix).  But
we are talking about vhost's application in a non-kvm environment here,
right?.

So if the vhost-X devices are in the "guest", and the x86 board is just
a slave...How do you tell each ppc board how many devices and what
config (e.g. MACs, etc) to instantiate?  Do you assume that they should
all be symmetric and based on positional (e.g. slot) data?  What if you
want asymmetric configurations (if not here, perhaps in a different
environment)?

> 
>> Actually I have patches queued to allow vbus to be managed via ioctls as
>> well, per your feedback (and it solves the permissions/lifetime
>> critisims in alacrityvm-v0.1).
>>    
> 
> That will make qemu integration easier.
> 
>>>   The only difference is the implementation.  vhost-net
>>> leaves much more to userspace, that's the main difference.
>>>      
>> Also,
>>
>> *) vhost is virtio-net specific, whereas vbus is a more generic device
>> model where thing like virtio-net or venet ride on top.
>>    
> 
> I think vhost-net is separated into vhost and vhost-net.

Thats good.

> 
>> *) vhost is only designed to work with environments that look very
>> similar to a KVM guest (slot/hva translatable).  vbus can bridge various
>> environments by abstracting the key components (such as memory access).
>>    
> 
> Yes.  virtio is really virtualization oriented.

I would say that its vhost in particular that is virtualization
oriented.  virtio, as a concept, generally should work in physical
systems, if perhaps with some minor modifications.  The biggest "limit"
is having "virt" in its name ;)

> 
>> *) vhost requires an active userspace management daemon, whereas vbus
>> can be driven by transient components, like scripts (ala udev)
>>    
> 
> vhost by design leaves configuration and handshaking to userspace.  I
> see it as an advantage.

The misconception here is that vbus by design _doesn't define_ where
configuration/handshaking happens.  It is primarily implemented by a
modular component called a "vbus-connector", and _I_ see this
flexibility as an advantage.  vhost on the other hand depends on a
active userspace component and a slots/hva memory design, which is more
limiting in where it can be used and forces you to split the logic.
However, I think we both more or less agree on this point already.

For the record, vbus itself is simply a resource container for
virtual-devices, which provides abstractions for the various points of
interest to generalizing PV (memory, signals, etc) and the proper
isolation and protection guarantees.  What you do with it is defined by
the modular virtual-devices (e.g. virtion-net, venet, sched, hrt, scsi,
rdma, etc) and vbus-connectors (vbus-kvm, etc) you plug into it.

As an example, you could emulate the vhost design in vbus by writing a
"vbus-vhost" connector.  This connector would be very thin and terminate
locally in QEMU.  It would provide a ioctl-based verb namespace similar
to the existing vhost verbs we have today.  QEMU would then similarly
reflect the vbus-based virtio device as a PCI device to the guest, so
that virtio-pci works unmodified.

You would then have most of the advantages of the work I have done for
commoditizing/abstracting the key points for in-kernel PV, like the
memctx.  In addition, much of the work could be reused in multiple
environments since any vbus-compliant device model that is plugged into
the framework would work with any connector that is plugged in (e.g.
vbus-kvm (alacrityvm), vbus-vhost (KVM), and "vbus-ira").

The only tradeoff is in features offered by the connector (e.g.
vbus-vhost has the advantage that existing PV guests can continue to
work unmodified, vbus-kvm has the advantage that it supports new
features like generic shared memory, non-virtio based devices,
priortizable interrupts, no dependencies on PCI for non PCI guests, etc).

Kind Regards,
-Greg


Download attachment "signature.asc" of type "application/pgp-signature" (268 bytes)