linux-kernel - Re: [PATCH] SCSI driver for VMware's virtual HBA.

Message-Id: <1251903967.3892.177.camel@mulgrave.site>
Date:	Wed, 02 Sep 2009 10:06:07 -0500
From:	James Bottomley <James.Bottomley@...e.de>
To:	akataria@...are.com
Cc:	Dmitry Torokhov <dtor@...are.com>, Matthew Wilcox <matthew@....cx>,
	Roland Dreier <rdreier@...co.com>,
	Bart Van Assche <bvanassche@....org>,
	Robert Love <robert.w.love@...el.com>,
	Randy Dunlap <randy.dunlap@...cle.com>,
	Mike Christie <michaelc@...wisc.edu>,
	"linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Rolf Eike Beer <eike-kernel@...tec.de>,
	Maxime Austruy <maustruy@...are.com>
Subject: Re: [PATCH] SCSI driver for VMware's virtual HBA.

On Tue, 2009-09-01 at 19:55 -0700, Alok Kataria wrote:
> On Tue, 2009-09-01 at 11:15 -0700, James Bottomley wrote:
> > On Tue, 2009-09-01 at 10:41 -0700, Alok Kataria wrote:
> > > > lguest uses the sg_ring abstraction.  Xen and KVM were certainly looking
> > > > at this too.
> > > 
> > > I don't see the sg_ring abstraction that you are talking about. Can you
> > > please give me some pointers. 
> > 
> > it's in drivers/lguest ... apparently it's vring now and the code is in
> > drivers/virtio
> > 
> > > Also regarding Xen and KVM I think they are using the xenbus/vbus
> > > interface, which is quite different than what we do here. 
> > 
> > Not sure about Xen ... KVM uses virtio above.
> > 
> > > > 
> > > > > And anyways how large is the DMA code that we are worrying about here ?
> > > > > Only about 300-400 LOC ? I don't think we might want to over-design for
> > > > > such small gains.
> > > > 
> > > > So even if you have different DMA code, the remaining thousand or so
> > > > lines would be in common.  That's a worthwhile improvement.
> 
> I don't see how. The rest of the code consists of IO/MMIO space & ring
> processing which is very different in each of the implementations. What
> is left is the setup and initialization code which obviously depends on
> the implementation of the driver data structures. 

Are there benchmarks comparing the two approaches?

> > > And not just that, different HV-vendors can have different features,
> > > like say XYZ can come up tomorrow and implement the multiple rings
> > > interface so the feature set doesn't remain common and we will have less
> > > code to share in the not so distant future.
> > 
> > Multiple rings is really just a multiqueue abstraction.  That's fine,
> > but it needs a standard multiqueue control plane.
> > 
> > The desire to one up the competition by adding a new whiz bang feature
> > to which you code a special interface is very common in the storage
> > industry.  The counter pressure is that consumers really like these
> > things standardised.  That's what the transport class abstraction is all
> > about.
> > 
> > We also seem to be off on a tangent about hypervisor interfaces.  I'm
> > actually more interested in the utility of an SRP abstraction or at
> > least something SAM based.  It seems that in your driver you don't quite
> > do the task management functions as SAM requests, but do them over your
> > own protocol abstractions.
> 
> Okay, I think I need to take a step back here and understand what
> you are actually asking for.
> 
> 1. What do you mean by the "transport class abstraction" ? 
> Do you mean that the way we communicate with the hypervisor needs to be
> standardized ?

Not really.  Transport classes are designed to share code and provide a
uniform control plane when the underlying implementation is different.

> 2. Are you saying that we should use the virtio ring mechanism to handle
> our request and completion rings ? 

That's an interesting question.  Virtio is currently the standard linux
guest<=>hypervisor communication mechanism, but if you have comparative
benchmarks showing that virtual hardware emulation is faster, it doesn't
need to remain so.

>   We can not do that. Our backend expects that each slot on the ring is
> in a particular format, whereas vring expects that each slot on the
> vring is in the vring_desc format.

Your backend is a software server, surely?
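
To make the slot-format mismatch concrete, here is a minimal userspace
sketch. The first struct mirrors the field layout of struct vring_desc
from the virtio ring header; the second is a purely HYPOTHETICAL
device-specific request slot invented for illustration (it is not the
layout used by the driver in this patch), and the helper name is an
assumption as well.

```c
#include <stddef.h>
#include <stdint.h>

/* Field layout of one virtio ring descriptor (mirrors struct vring_desc;
 * fixed-width userspace types are used so the sketch builds anywhere). */
struct vring_desc_sketch {
	uint64_t addr;   /* guest-physical address of the buffer */
	uint32_t len;    /* length of the buffer in bytes */
	uint16_t flags;  /* descriptor flags (chaining, write-only, ...) */
	uint16_t next;   /* index of the next descriptor when chained */
};

/* HYPOTHETICAL device-specific request slot: an HBA-style ring entry
 * that carries the SCSI command itself, not just a buffer pointer. */
struct hba_req_slot_sketch {
	uint64_t context;    /* opaque tag echoed back on completion */
	uint64_t data_addr;  /* address of the data buffer or SG list */
	uint32_t data_len;   /* transfer length in bytes */
	uint8_t  cdb[16];    /* SCSI command descriptor block */
	uint8_t  target;     /* target id */
	uint8_t  lun;        /* logical unit number */
};

/* Returns 1 only if one slot type could be overlaid on the other
 * byte for byte, 0 otherwise. */
static int slots_layout_compatible(void)
{
	return sizeof(struct vring_desc_sketch) ==
	       sizeof(struct hba_req_slot_sketch);
}
```

A backend that parses slots of the second kind cannot consume a vring
without changes on the backend side, which is the constraint being
described in the quoted text.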

> 3. Also, the way we communicate with the hypervisor backend is that the
> driver writes to our device IO registers in a particular format. The
> format that we follow is to first write the command on the
> COMMAND_REGISTER and then write a stream of data words in the
> DATA_REGISTER, which is a normal device interface.
> The reason I make this point is to highlight we are not making any
> hypercalls instead we communicate with the hypervisor by writing to
> IO/Memory mapped regions.  So from that perspective the driver has no
> knowledge that it is talking to a software backend (aka device
> emulation) instead it is very similar to how a driver talks to a silicon
> device.  The backend expects things in a certain way and we cannot
> really change that interface ( i.e. the ABI shared between Device driver
> and Device Emulation).
> 
> So sharing code with vring or virtio is not something that works well
> with our backend. The VMware PVSCSI driver is simply a virtual HBA and
> shouldn't be looked at any differently.
> 
> Is there anything else that you are asking us to standardize ?

I'm not really asking you to standardise anything (yet).  I was more
probing for why you hadn't included any of the SCSI control plane
interfaces and what led you to produce a different design from the
current patterns in virtual I/O.  I think what I'm hearing is "Because
we didn't look at how modern SCSI drivers are constructed" and "Because
we didn't look at how virtual I/O is currently done in Linux".  That's
OK (it's depressingly familiar in drivers), but now we get to figure out
what, if anything, makes sense from a SCSI control plane to a hypervisor
interface and whether this approach to hypervisor interfaces is better
or worse than virtio.
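
For reference, the register protocol described in point 3 of the quoted
text (write the command to COMMAND_REGISTER, then stream data words to
DATA_REGISTER) can be sketched roughly as follows. This is a userspace
simulation under stated assumptions: struct sim_hba, the sim_* helpers
and SIM_MAX_WORDS are hypothetical names, and the "registers" are plain
memory. In a real Linux driver these stores would typically be MMIO
writes such as writel() to an ioremapped BAR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_MAX_WORDS 8  /* hypothetical cap on data words per command */

/* Simulated device: records what the "hardware" saw on each register. */
struct sim_hba {
	uint32_t command;              /* last value written to COMMAND_REGISTER */
	uint32_t data[SIM_MAX_WORDS];  /* words written to DATA_REGISTER, in order */
	size_t   ndata;                /* number of data words received so far */
};

/* A write to the command register starts a new command and resets
 * the data stream. */
static void sim_write_command(struct sim_hba *dev, uint32_t cmd)
{
	dev->command = cmd;
	dev->ndata = 0;
}

/* A write to the data register appends one word to the current command. */
static void sim_write_data(struct sim_hba *dev, uint32_t word)
{
	assert(dev->ndata < SIM_MAX_WORDS);
	dev->data[dev->ndata++] = word;
}

/* Issue one command: the command word first, then its data words. */
static void sim_issue_command(struct sim_hba *dev, uint32_t cmd,
			      const uint32_t *words, size_t nwords)
{
	size_t i;

	sim_write_command(dev, cmd);
	for (i = 0; i < nwords; i++)
		sim_write_data(dev, words[i]);
}
```

Nothing in this sequence is a hypercall; from the driver's side it
looks the same as programming a physical adapter.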

James

