netdev - Re: [RFC PATCH 00/17] virtual-bus

Message-ID: <49D5F669.1070502@redhat.com>
Date:	Fri, 03 Apr 2009 14:43:37 +0300
From:	Avi Kivity <avi@...hat.com>
To:	Gregory Haskins <ghaskins@...ell.com>
CC:	Patrick Mullaney <pmullaney@...ell.com>, anthony@...emonkey.ws,
	andi@...stfloor.org, herbert@...dor.apana.org.au,
	Peter Morreale <PMorreale@...ell.com>, rusty@...tcorp.com.au,
	agraf@...e.de, kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
	netdev@...r.kernel.org
Subject: Re: [RFC PATCH 00/17] virtual-bus

Gregory Haskins wrote:
>>> Yes, but the important thing to point out is it doesn't *replace*
>>> PCI. It's simply an alternative.
>>>   
>>>       
>> Does it offer substantial benefits over PCI?  If not, it's just extra
>> code.
>>     
>
> First of all, do you think I would spend time designing it if I didn't
> think so? :)
>   

I'll rephrase.  What are the substantial benefits that this offers over PCI?

> Second of all, I want to use vbus for other things that do not speak PCI
> natively (like userspace for instance...and if I am gleaning this
> correctly, lguest doesn't either).
>   

And virtio supports lguest and s390.  virtio is not PCI specific.

However, for the PC platform, PCI has distinct advantages.  What 
advantages does vbus have for the PC platform?

> PCI sounds good at first, but I believe it's a false economy.  It was
> designed, of course, to be a hardware solution, so it carries all this
> baggage derived from hardware constraints that simply do not exist in a
> pure software world and that have to be emulated.  Things like the fixed
> length and centrally managed PCI-IDs, 

Not a problem in practice.

> PIO config cycles, BARs,
> pci-irq-routing, etc.  

What are the problems with these?

> While emulation of PCI is invaluable for
> executing unmodified guests, it's not strictly necessary from a
> paravirtual software perspective...PV software is inherently already
> aware of its context and can therefore use the best mechanism
> appropriate from a broader selection of choices.
>   

It's also not necessary to invent a new bus.  We need a positive 
advantage; we don't do things just because we can (and then lose the 
real advantages PCI has).

> If we insist that PCI is the only interface we can support and we want
> to do something, say, in the kernel for instance, we have to have either
> something like the ICH model in the kernel (and really all of the pci
> chipset models that qemu supports), or a hacky hybrid userspace/kernel
> solution.  I think this is what you are advocating, but I'm sorry. IMO
> that's just gross and unnecessary gunk.  

If we go for a kernel solution, a hybrid solution is the best IMO.  I 
have no idea what's wrong with it.

The guest would discover and configure the device using normal PCI 
methods.  Qemu emulates the requests, and configures the kernel part 
using normal Linux syscalls.  The nice thing is, kvm and the kernel part 
don't even know about each other, except for a way for hypercalls to 
reach the device and a way for interrupts to reach kvm.
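
As a rough illustration of that split, the following C sketch models it entirely in userspace.  Everything in it is an assumption made for illustration: `struct kernel_backend`, `backend_ioctl()` (standing in for whatever syscall interface the kernel part would expose) and `struct pci_model` are hypothetical names, not qemu or kernel APIs.  Only the configuration-space offsets are the standard PCI ones.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Standard PCI configuration-space offsets and bits. */
#define PCI_COMMAND         0x04
#define PCI_COMMAND_MEMORY  0x2   /* memory space enable */
#define PCI_BASE_ADDRESS_0  0x10
#define PCI_INTERRUPT_LINE  0x3c

/* Hypothetical "kernel part": configured only through an ioctl-like call. */
enum backend_cmd { BACKEND_SET_BASE, BACKEND_SET_IRQ, BACKEND_ENABLE };

struct kernel_backend {
    uint64_t base;     /* guest-physical base of the device window */
    int      irq;      /* interrupt line to raise towards the guest */
    int      enabled;
};

/* Stand-in for a normal syscall (e.g. an ioctl on a device fd). */
int backend_ioctl(struct kernel_backend *kb, enum backend_cmd cmd, uint64_t arg)
{
    switch (cmd) {
    case BACKEND_SET_BASE: kb->base = arg;           return 0;
    case BACKEND_SET_IRQ:  kb->irq = (int)arg;       return 0;
    case BACKEND_ENABLE:   kb->enabled = (arg != 0); return 0;
    }
    return -1;
}

/* Hypothetical userspace (qemu-like) model of the PCI function. */
struct pci_model {
    uint8_t config[256];
    struct kernel_backend *kb;
};

/* Emulate a 32-bit guest write to configuration space and forward the
 * settings the kernel part cares about. */
void pci_config_write32(struct pci_model *dev, unsigned off, uint32_t val)
{
    assert(off + 4 <= sizeof(dev->config));
    memcpy(&dev->config[off], &val, 4);   /* host byte order; fine for a model */

    switch (off) {
    case PCI_BASE_ADDRESS_0:
        /* Low four bits of a memory BAR are type flags, not address. */
        backend_ioctl(dev->kb, BACKEND_SET_BASE, val & ~0xfu);
        break;
    case PCI_INTERRUPT_LINE:
        backend_ioctl(dev->kb, BACKEND_SET_IRQ, val & 0xff);
        break;
    case PCI_COMMAND:
        backend_ioctl(dev->kb, BACKEND_ENABLE, val & PCI_COMMAND_MEMORY);
        break;
    }
}
```

In this model a guest that programs BAR0 and the interrupt line and then sets the memory-enable bit leaves the backend configured and enabled, and the setup path never touches kvm itself.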

> Let's stop beating around the
> bush and just define the 4-5 hypercall verbs we need and be done with
> it.  :)
>
> FYI: The guest support for this is not really *that* much code IMO.
>  
>  drivers/vbus/proxy/Makefile      |    2
>  drivers/vbus/proxy/kvm.c         |  726 +++++++++++++++++
>   

Does it support device hotplug and hotunplug?  Can vbus interrupts be 
load balanced by irqbalance?  Can guest userspace enumerate devices?  
Module autoloading support?  PXE booting?

Plus a port to Windows, enterprise Linux distros based on 2.6.dead, and 
possibly less mainstream OSes.

> and plus, I'll gladly maintain it :)
>
> I mean, it's not like new buses do not get defined from time to time. 
> Should the computing industry stop coming up with new bus types because
> they are afraid that the windows ABI only speaks PCI?  No, they just
> develop a new driver for whatever the bus is and be done with it.  This
> is really no different.
>   

As a matter of fact, a new bus called PCI Express was developed 
recently.  It uses new slots and new electricals, and it's not even a bus 
(routers + point-to-point links): new everything, except that the 
software model was 1000000000000% compatible with traditional PCI.  
That's how much people are afraid of the Windows ABI.

>> Note that virtio is not tied to PCI, so "vbus is generic" doesn't count.
>>     
> Well, preserving the existing virtio-net on x86 ABI is tied to PCI,
> which is what I was referring to.  Sorry for the confusion.
>   

virtio-net knows nothing about PCI.  If you have a problem with PCI, 
write virtio-blah for a new bus.  Though I still don't understand why.

  

>> I meant, move the development effort, testing, installed base, Windows
>> drivers.
>>     
>
> Again, I will maintain this feature, and it's completely off to the
> side.  Turn it off in the config, or do not enable it in qemu and it's
> like it never existed.  Worst case is it gets reverted if you don't like
> it.  Aside from the last few kvm specific patches, the rest is no
> different than the greater linux environment.  E.g. if I update the
> venet driver upstream, it's conceptually no different than someone else
> updating e1000, right?
>   

I have no objections to you maintaining vbus, though I'd much prefer if 
we can pool our efforts and cooperate on having one good set of drivers.

I think you're integrating too tightly with kvm, which is likely to 
cause problems when kvm evolves.  The way I'd do it is:

- drop all mmu integration; instead, have your devices maintain their 
own slots layout and use copy_to_user()/copy_from_user() (or 
get_user_pages_fast()).
- never use vmap-like structures for more than the length of a request
- for hypercalls, add kvm_register_hypercall_handler()
- for interrupts, see the interrupt routing thingie and have an 
in-kernel version of the KVM_IRQ_LINE ioctl.

This way, the parts that go into kvm know nothing about vbus, you're not 
pinning any memory, and the integration bits can be used for other 
purposes.
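
The decoupling in the list above can be sketched as a userspace C model.  This is only an illustration under stated assumptions: `kvm_register_hypercall_handler()` is the name proposed above, not an existing kernel function, and it is modelled here as `register_hypercall_handler()`.  The `struct demo_device`, its slot table and the `raise_irq` callback (standing in for an in-kernel `KVM_IRQ_LINE`) are hypothetical, and `memcpy()` stands in for `copy_from_user()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ---- generic side: knows nothing about any particular device ---- */
#define MAX_HYPERCALLS 16

typedef long (*hypercall_fn)(void *opaque, uint64_t arg);

struct hypercall_slot { hypercall_fn fn; void *opaque; };
static struct hypercall_slot hypercall_table[MAX_HYPERCALLS];

/* Model of the proposed registration hook; fails if nr is taken. */
int register_hypercall_handler(unsigned nr, hypercall_fn fn, void *opaque)
{
    if (nr >= MAX_HYPERCALLS || hypercall_table[nr].fn)
        return -1;
    hypercall_table[nr].fn = fn;
    hypercall_table[nr].opaque = opaque;
    return 0;
}

/* Called when the guest issues hypercall number nr. */
long dispatch_hypercall(unsigned nr, uint64_t arg)
{
    if (nr >= MAX_HYPERCALLS || !hypercall_table[nr].fn)
        return -1;
    return hypercall_table[nr].fn(hypercall_table[nr].opaque, arg);
}

/* ---- device side: keeps its own view of guest memory ---- */
struct mem_slot { uint64_t gpa; uint64_t len; uint8_t *hva; };

struct demo_device {
    struct mem_slot slots[4];
    size_t nslots;
    void (*raise_irq)(int line);   /* stand-in for in-kernel KVM_IRQ_LINE */
    int irq_line;
    uint8_t last_request[16];
};

/* Translate a guest-physical range using the device's own slot table.
 * Returns NULL unless the whole range lies inside one slot. */
static uint8_t *gpa_to_hva(struct demo_device *d, uint64_t gpa, uint64_t len)
{
    for (size_t i = 0; i < d->nslots; i++) {
        struct mem_slot *s = &d->slots[i];
        if (gpa >= s->gpa && len <= s->len && gpa - s->gpa <= s->len - len)
            return s->hva + (gpa - s->gpa);
    }
    return NULL;
}

/* Hypercall handler: copy the request in, then signal completion.
 * Memory is touched only for the duration of the request; nothing is pinned. */
long demo_device_hypercall(void *opaque, uint64_t gpa)
{
    struct demo_device *d = opaque;
    assert(d && d->raise_irq);

    uint8_t *src = gpa_to_hva(d, gpa, sizeof(d->last_request));
    if (!src)
        return -1;
    memcpy(d->last_request, src, sizeof(d->last_request)); /* copy_from_user() stand-in */
    d->raise_irq(d->irq_line);
    return 0;
}

/* Trivial interrupt sink that records the last line raised. */
int demo_last_irq = -1;
void demo_raise_irq(int line) { demo_last_irq = line; }
```

The generic half only sees a function pointer and an opaque cookie, so the same registration and interrupt hooks could serve an in-kernel virtio backend just as well as a vbus device.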

  

>> So why add something new?
>>     
>
> I was hoping this was becoming clear by now, but apparently I am doing a
> poor job of articulating things. :(  I think we got bogged down in the
> 802.x performance discussion and lost sight of what we are trying to
> accomplish with the core infrastructure.
>
> So this core vbus infrastructure is for generic, in-kernel IO models. 
> As a first pass, we have implemented a kvm-connector, which lets kvm
> guest kernels have access to the bus.  We also have a userspace
> connector (which I haven't pushed yet due to remaining issues being
> ironed out) which allows userspace applications to interact with the
> devices as well.  As a prototype, we built "venet" to show how it all works.
>
> In the future, we want to use this infrastructure to build IO models for
> various things like high performance fabrics and guest bypass
> technologies, etc.  For instance, guest userspace connections to RDMA
> devices in the kernel, etc.
>   

I think virtio can be used for much of the same things.  There's nothing 
in virtio that implies guest/host, or pci, or anything else.  It's 
similar to your shm/signal and ring abstractions except virtio folds 
them together.  Is this folding the main problem?

As far as I can tell, everything around it just duplicates existing 
infrastructure (which may be old and crusty, but so what) without added 
value.

>>
>> I don't want to develop and support both virtio and vbus.  And I
>> certainly don't want to depend on your customers.
>>     
>
> So don't.  I'll maintain the drivers and the infrastructure.  All we are
> talking here is the possible acceptance of my kvm-connector patches
> *after* the broader LKML community accepts the core infrastructure,
> assuming that happens.
>   

As I mentioned above, I'd much rather we cooperate than fragment 
the development effort (and user base).

Regarding kvm-connector, see my more generic suggestion above.  That 
would work for virtio-in-kernel as well.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
