Message-ID: <49D3F31F.8010705@novell.com>
Date: Wed, 01 Apr 2009 19:05:03 -0400
From: Gregory Haskins <ghaskins@...ell.com>
To: Andi Kleen <andi@...stfloor.org>
CC: linux-kernel@...r.kernel.org, agraf@...e.de, pmullaney@...ell.com,
pmorreale@...ell.com, anthony@...emonkey.ws, rusty@...tcorp.com.au,
netdev@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [RFC PATCH 00/17] virtual-bus
Andi Kleen wrote:
> On Wed, Apr 01, 2009 at 04:29:57PM -0400, Gregory Haskins wrote:
>
>>> description?
>>>
>>>
>> Yes, good point. I will be sure to be more explicit in the next rev.
>>
>>
>>>
>>>
>>>> So the administrator can then set these attributes as
>>>> desired to manipulate the configuration of the instance of the device,
>>>> on a per device basis.
>>>>
>>>>
>>> How would the guest learn of any changes in there?
>>>
>>>
>> The only events of this nature explicitly supported by the
>> infrastructure would be device-add and device-remove. So when an admin
>> adds a device to or removes one from a bus, the guest sees
>> driver::probe() and driver::remove() callbacks, respectively. All other
>> events are left (by design) to be handled by the device ABI itself,
>> presumably over the provided shm infrastructure.
>>
>
> Ok so you rely on a transaction model where everything is set up
> before it is somehow committed to the guest? I hope that is made
> explicit in the interface somehow.
>
Well, it's not an explicit transaction model, but I guess you could
think of it that way.
Generally you set the device up before you launch the guest. By the
time the guest loads and tries to scan the bus for the initial
discovery, all the devices would be ready to go.
This does bring up the question of hotswap. Today we fully support
hotswap in and out, but leaving this "enabled" transaction to the
individual device means that the device-id would be visible in the bus
namespace before the device may actually want to communicate. Hmm...
perhaps I need to build this in as a more explicit "enabled"
feature, and the guest would not see the driver::probe() until this happens.
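To make that concrete, a minimal sketch of what such an explicit
"enabled" commit point might look like from the admin side. The
configfs paths and attribute names below are purely hypothetical
illustrations, not existing vbus ABI:

```shell
# Hypothetical admin-side flow for an explicit "enabled" attribute.
# All paths and attribute names here are illustrative assumptions.
cd /config/vbus/buses/client-bus/devices

mkdir venet0                           # device-id appears in the bus namespace,
                                       # but the guest sees nothing yet
echo 00:16:3e:00:00:01 > venet0/mac    # configure while still disabled

echo 1 > venet0/enabled                # commit: only now does the guest
                                       # get its driver::probe() callback
```

The point of the sketch is the ordering: all configuration happens
before the `enabled` write, which acts as the single commit to the guest.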
>
>> This script creates two buses ("client-bus" and "server-bus"),
>> instantiates a single venet-tap on each of them, and then "wires" them
>> together with a private bridge instance called "vbus-br0". To complete
>> the picture, you would want to launch two kvms, one for each of the
>> client-bus/server-bus instances. You can do this via /proc/$pid/vbus. E.g.
>>
>> # (echo client-bus > /proc/self/vbus; qemu-kvm -hda client.img....)
>> # (echo server-bus > /proc/self/vbus; qemu-kvm -hda server.img....)
>>
>> (And as noted, someday qemu will be able to do all the setup that the
>> script did, natively. It would wire whatever tap it created to an
>> existing bridge with qemu-ifup, just like we do for tun-taps today)
>>
>
> The usual problem with that is permissions. Just making qemu-ifup suid
> is not very nice. It would be good if any new design addressed this.
>
Well, it's kind of out of my control. venet-tap ultimately creates a
simple netif interface which we must do something with. Once it's
created, "wiring" it up to something like a linux-bridge is no different
than for a tun-tap, so the qemu-ifup requirement doesn't change.
The one thing I can think of is that it would be possible to build a
"venet-switch" module, which could be done without using brctl or
qemu-ifup...but then I would lose all the benefits of re-using that
infrastructure. I do not recommend we actually do this, but it would
technically be a way to address your concern.
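For reference, the linux-bridge wiring described above (what a
qemu-ifup-style script would do for the venet-tap's netif) might look
like the following sketch. The interface name "venet0" is an assumption;
the brctl/ip invocations are standard bridge-utils/iproute2 usage and
require root:

```shell
# Sketch: enslave a venet-tap's netif into a private linux-bridge,
# exactly as we would for a tun-tap today.  "venet0" is a placeholder
# name for whatever netif the venet-tap module created.
brctl addbr vbus-br0            # create the private bridge
ip link set vbus-br0 up

brctl addif vbus-br0 venet0     # wire the venet-tap netif into it
ip link set venet0 up
```

Because this is the same plumbing as tun-tap, any permission story that
works for qemu-ifup today applies unchanged here.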
>
>> the current code doesn't support rw on the mac attributes yet (I need
>> a parser first).
>>
>
> parser in kernel space always sounds scary to me.
>
Heh... why do you think I keep procrastinating ;)
>
>
>> Yeah, ultimately I would love to be able to support a fairly wide range
>> of the normal userspace/kernel ABI through this mechanism. In fact, one
>> of my original design goals was to somehow expose the syscall ABI
>> directly via some kind of syscall proxy device on the bus. I have since
>>
>
> That sounds really scary for security.
>
>
>
>> backed away from that idea once I started thinking about things some
>> more and realized that a significant number of system calls are really
>> inappropriate for a guest-type environment due to their ability to
>> block. We really don't want a vcpu to block... however, the AIO type
>>
>
> Not only because of blocking, but also because of security issues.
> After all one of the usual reasons to run a guest is security isolation.
>
Oh yeah, totally agreed. Not that I am advocating this, because I have
abandoned the idea. But back when I was considering it, I would have
addressed the security with the vbus and syscall-proxy-device objects
themselves. E.g. if you don't instantiate a syscall-proxy-device on the
bus, the guest wouldn't have access to syscalls at all. And you could
put filters into the module to limit which syscalls were allowed, which
UID to make the guest appear as, etc.
> In general the more powerful the guest API the more risky it is, so some
> self moderation is probably a good thing.
>
:)
-Greg