linux-kernel - Re: [RFC PATCH 00/17] virtual-bus

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <49D37805.1060301@novell.com>
Date:	Wed, 01 Apr 2009 10:19:49 -0400
From:	Gregory Haskins <ghaskins@...ell.com>
To:	Andi Kleen <andi@...stfloor.org>
CC:	linux-kernel@...r.kernel.org, agraf@...e.de, pmullaney@...ell.com,
	pmorreale@...ell.com, anthony@...emonkey.ws, rusty@...tcorp.com.au,
	netdev@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [RFC PATCH 00/17] virtual-bus

Andi Kleen wrote:
> On Wed, Apr 01, 2009 at 08:03:49AM -0400, Gregory Haskins wrote:
>   
>> Andi Kleen wrote:
>>     
>>> Gregory Haskins <ghaskins@...ell.com> writes:
>>>
>>> What might be useful is if you could expand a bit more on what the high level
>>> use cases for this. 
>>>
>>> Questions that come to mind and that would be good to answer:
>>>
>>> This seems to be aimed at having multiple VMs talk
>>> to each other, but not talk to the rest of the world, correct? 
>>> Is that a common use case? 
>>>   
>>>       
>> Actually we didn't design specifically for either type of environment. 
>>     
>
> But surely you must have some specific use case in mind? Something
> that it does better than the various methods that are available
> today. Or rather there must be some problem you're trying
> to solve. I'm just not sure what that problem exactly is.
>   
Performance.  We are trying to create a high performance IO infrastructure.

Ideally we would like to see things like virtual-machines have
bare-metal performance (or as close as possible) using just pure
software on commodity hardware.   The data I provided shows that
something like KVM with virtio-net does a good job on throughput even on
10GE, but the latency is several orders of magnitude slower than
bare-metal.   We are addressing this issue and others like it that are a
result of the current design of out-of-kernel emulation.
>   
>> What we *are* trying to address is making an easy way to declare virtual
>> resources directly in the kernel so that they can be accessed more
>> efficiently.  Contrast that to the way its done today, where the models
>> live in, say, qemu userspace.
>>
>> So instead of having
>> guest->host->qemu::virtio-net->tap->[iptables|bridge], you simply have
>> guest->host->[iptables|bridge].  How you make your private network (if
>>     
>
> So is the goal more performance or simplicity or what?
>   

(Answered above)

>   
>>> What would be the use cases for non networking devices?
>>>
>>> How would the interfaces to the user look like?
>>>   
>>>       
>> I am not sure if you are asking about the guests perspective or the
>> host-administators perspective.
>>     
>
> I was wondering about the host-administrators perspective.
>   
Ah, ok.  Sorry about that.  It was probably good to document that other
thing anyway, so no harm.

So about the host-administrator interface.  The whole thing is driven by
configfs, and the basics are already covered in the documentation in
patch 2, so I wont repeat it here.  Here is a reference to the file for
everyone's convenience:

http://git.kernel.org/?p=linux/kernel/git/ghaskins/vbus/linux-2.6.git;a=blob;f=Documentation/vbus.txt;h=e8a05dafaca2899d37bd4314fb0c7529c167ee0f;hb=f43949f7c340bf667e68af6e6a29552e62f59033

So a sufficiently privileged user can instantiate a new bus (e.g.
container) and devices on that bus via configfs operations.  The types
of devices available to instantiate are dictated by whatever vbus-device
modules you have loaded into your particular kernel.  The loaded modules
available are enumerated under /sys/vbus/deviceclass.

Now presumably the administrator knows what a particular module is and
how to configure it before instantiating it.  Once they instantiate it,
it will present an interface in sysfs with a set of attributes.  For
example, an instantiated venet-tap looks like this:

ghaskins@...t:~> tree /sys/vbus/devices
/sys/vbus/devices
`-- foo
    |-- class -> ../../deviceclass/venet-tap
    |-- client_mac
    |-- enabled
    |-- host_mac
    |-- ifname
    `-- interfaces
        `-- 0 -> ../../../instances/bar/devices/0


Some of these attributes, like "class" and "interfaces" are default
attributes that are filled in by the infrastructure.  Other attributes,
like "client_mac" and "enabled" are properties defined by the venet-tap
module itself.  So the administrator can then set these attributes as
desired to manipulate the configuration of the instance of the device,
on a per device basis.

So now imagine we have some kind of disk-io vbus device that is designed
to act kind of like a file-loopback device.  It might define an
attribute allowing you to specify the path to the file/block-dev that
you want it to export.

(Warning: completely fictitious "tree" output to follow ;)

ghaskins@...t:~> tree /sys/vbus/devices
/sys/vbus/devices
`-- foo
    |-- class -> ../../deviceclass/vdisk
    |-- src_path
    `-- interfaces
        `-- 0 -> ../../../instances/bar/devices/0

So the admin would instantiate this "vdisk" device and do:

'echo /path/to/my/exported/disk.dat > /sys/vbus/devices/foo/src_path'

To point the device to the file on the host that it wants to present as
a vdisk.  Any guest that has access to the particular bus that contains
this device would then see it as a standard "vdisk" ABI device (as if
there where such a thing, yet) and could talk to it using a vdisk
specific driver.

A property of a vbus is that it is inherited by children.  Today, I do
not have direct support in qemu for creating/configuring vbus devices. 
Instead what I do is I set up the vbus and devices from bash, and then
launch qemu-kvm so it inherits the bus.  Someday (soon, unless you guys
start telling me this whole idea is rubbish ;) I will add support so you
could do things like "-net nic,model=venet" and that would trigger qemu
to go out and create the container/device on its own.  TBD.

I hope this helps to clarify!
-Greg


Download attachment "signature.asc" of type "application/pgp-signature" (258 bytes)