linux-kernel - Re: virtio scsi host draft specification, v3

Message-ID: <4E0D73C4.5090608@suse.de>
Date:	Fri, 01 Jul 2011 09:14:12 +0200
From:	Hannes Reinecke <hare@...e.de>
To:	Paolo Bonzini <pbonzini@...hat.com>
Cc:	Stefan Hajnoczi <stefanha@...il.com>,
	Christoph Hellwig <chellwig@...hat.com>,
	Stefan Hajnoczi <stefanha@...ux.vnet.ibm.com>,
	kvm@...r.kernel.org, "Michael S. Tsirkin" <mst@...hat.com>,
	qemu-devel <qemu-devel@...gnu.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Linux Virtualization <virtualization@...ts.linux-foundation.org>,
	"Nicholas A. Bellinger" <nab@...ux-iscsi.org>
Subject: Re: virtio scsi host draft specification, v3

On 07/01/2011 08:41 AM, Paolo Bonzini wrote:
> On 06/29/2011 11:39 AM, Stefan Hajnoczi wrote:
>> > > Of course, when doing so we would lose the ability to freely
>> > > remap LUNs. But then remapping LUNs doesn't gain you much imho.
>> > > Plus you could always use qemu block backend here if you want
>> > > to hide the details.
>> >
>> > And you could always use the QEMU block backend with
>> > scsi-generic if you want to remap LUNs, instead of true
>> > passthrough via the kernel target.
>>
>> IIUC the in-kernel target always does remapping. It passes through
>> individual LUNs rather than entire targets and you pick LU Numbers to
>> map to the backing storage (which may or may not be a SCSI
>> pass-through device). Nicholas Bellinger can confirm whether this is
>> correct.
>
> But then I don't understand. If you pick LU numbers both with the
> in-kernel target and with QEMU, you do not need to use e.g. WWPNs
> with fiber channel, because we are not passing through the details
> of the transport protocol (one day we might have virtio-fc, but more
> likely not). So the LUNs you use might as well be represented by
> hierarchical LUNs.
>

Actually, the kernel does _not_ do any LUN remapping. It just so
happens that most storage arrays present their LUNs starting at
0, so normally you wouldn't notice.

However, some arrays have an array-wide LUN range, so you start 
seeing LUNs at odd places:

[3:0:5:0]    disk    LSI      INF-01-00        0750  /dev/sdw
[3:0:5:7]    disk    LSI      Universal Xport  0750  /dev/sdx
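
As an illustration only (not part of the original mail): assuming the listing above is lsscsi-style output, the bracketed tuple reads host:channel:target:lun. A minimal Python sketch that groups the LUNs by target shows the gap between LUN 0 and LUN 7; the helper names `parse_hctl` and `luns_by_target` are made up for this example.

```python
import re

# Matches the leading [host:channel:target:lun] tuple of an lsscsi-style line.
HCTL_RE = re.compile(r"^\[(\d+):(\d+):(\d+):(\d+)\]")


def parse_hctl(line):
    """Return (host, channel, target, lun) as ints, or None if no tuple is found."""
    m = HCTL_RE.match(line.strip())
    return tuple(int(g) for g in m.groups()) if m else None


def luns_by_target(lines):
    """Group the LUNs seen in the listing by their (host, channel, target) triple."""
    result = {}
    for line in lines:
        hctl = parse_hctl(line)
        if hctl is None:
            continue  # skip lines that do not start with an H:C:T:L tuple
        host, channel, target, lun = hctl
        result.setdefault((host, channel, target), []).append(lun)
    return result


listing = [
    "[3:0:5:0]    disk    LSI      INF-01-00        0750  /dev/sdw",
    "[3:0:5:7]    disk    LSI      Universal Xport  0750  /dev/sdx",
]
print(luns_by_target(listing))  # {(3, 0, 5): [0, 7]}
```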

> Using NPIV with KVM would be done by mapping the same virtual N_Port
> ID in the host(s) to the same LU number in the guest. You might
> already do this now with virtio-blk, in fact.
>
The point here is not the mapping. The point is rescanning.

You can map existing NPIV devices already. But you _cannot_ rescan
the host/device whatever _from the guest_ to detect if new devices
are present.
That is the problem I'm trying to describe here.

To be more explicit:
Currently you have to map existing devices directly as individual
block or scsi devices to the guest.
A rescan within the guest can only be sent to that device, so the
only information you will be able to gather is whether the device
itself is still present.
You are unable to detect if there are other devices attached to your
guest which you should connect to.

So we have to have an enclosing instance (ie the equivalent of a 
SCSI target), which is capable of telling us exactly this.

> Put in another way: the virtio-scsi device is itself a SCSI target,
> so yes, there is a single target port identifier in virtio-scsi. But
> this SCSI target just passes requests down to multiple real targets,
> and so will let you do ALUA and all that.
>
Argl. No way. The virtio-scsi device has to map to a single LUN.

I thought I mentioned this already, but I'd better clarify this again:

The SCSI spec itself only deals with LUNs, so anything you'll read
in there obviously will only handle the interaction between the
initiator (read: host) and the LUN itself. However, the actual
command is sent via an intermediate target, hence you'll always see
the reference to the ITL (initiator-target-lun) nexus.
The SCSI spec details discovery of the individual LUNs presented by
a given target; it does _NOT_ detail the discovery of the targets
themselves.
That is being delegated to the underlying transport, in most cases
SAS or Fibre Channel.
For the same reason the SCSI spec can afford to omit any
reference to path failure, device hot-plugging etc; all of these
things are being delegated to the transport.

In our context the virtio-scsi device should map to the LUN, and the 
virtio-scsi _host_ backend should map to the target.
The virtio-scsi _guest_ driver will then map to the initiator.

So we should be able to attach more than one device to the backend,
which then will be presented to the initiator.
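
A toy model of that mapping, as a sketch only: the class and method names below are hypothetical and are not taken from the draft specification or from any driver. `Target` stands in for the host backend, `Initiator` for the guest driver, and `report_luns()` is a stand-in for what REPORT LUNS would return. The point it illustrates is that a rescan against the enclosing target can discover newly attached devices, which a rescan against a single device cannot.

```python
class Target:
    """Stands in for the virtio-scsi host backend (hypothetical model)."""

    def __init__(self):
        self._luns = {}  # LUN number -> name of the backing device

    def attach(self, lun, device):
        """Attach a backing device under the given LUN number."""
        self._luns[lun] = device

    def report_luns(self):
        """Toy equivalent of REPORT LUNS: the LUN numbers currently presented."""
        return sorted(self._luns)


class Initiator:
    """Stands in for the virtio-scsi guest driver (hypothetical model)."""

    def __init__(self, target):
        self.target = target
        self.known = set()

    def rescan(self):
        """Ask the target for its LUN list and return the LUNs that are new."""
        current = set(self.target.report_luns())
        new = sorted(current - self.known)
        self.known = current
        return new


backend = Target()
backend.attach(0, "INF-01-00")
guest = Initiator(backend)
print(guest.rescan())  # [0]

backend.attach(7, "Universal Xport")  # device added after the first scan
print(guest.rescan())  # [7]
```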

In the case of NPIV it would make sense to map the virtual SCSI host 
to the backend, so that all devices presented to the virtual SCSI 
host will be presented to the backend, too.
However, when doing so these devices will normally be referenced by 
their original LUN, as these will be presented to the guest via eg 
'REPORT LUNS'.

The above thread now tries to figure out if we should remap those
LUN numbers or just expose them as they are.
If we decide on remapping, we have to emulate _all_ commands
referring explicitly to those LUN numbers (persistent reservations,
anyone?). If we don't, we would expose some hardware detail to the
guest, but it would save us _a lot_ of processing.

I'm all for the latter.
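
To make the trade-off concrete, here is a small sketch under stated assumptions: both classes are invented for illustration and model only two operations (addressing a LUN and answering a REPORT LUNS-style query). With remapping, every LUN number crossing the backend has to be translated in one direction or the other; without it, the numbers pass through untouched.

```python
class RemappingBackend:
    """Hypothetical backend that presents guest LUN numbers of its own choosing."""

    def __init__(self, guest_to_host):
        self.guest_to_host = dict(guest_to_host)
        self.host_to_guest = {h: g for g, h in self.guest_to_host.items()}

    def to_host(self, guest_lun):
        """Translate the LUN a guest command addresses into the host's LUN."""
        return self.guest_to_host[guest_lun]

    def report_luns(self, host_luns):
        """Translate a host LUN list back into guest numbering."""
        return sorted(self.host_to_guest[h] for h in host_luns
                      if h in self.host_to_guest)


class PassthroughBackend:
    """Hypothetical backend that exposes the original LUN numbers as they are."""

    def to_host(self, guest_lun):
        return guest_lun  # nothing to translate

    def report_luns(self, host_luns):
        return sorted(host_luns)


remap = RemappingBackend({0: 0, 1: 7})
plain = PassthroughBackend()
print(remap.to_host(1), remap.report_luns([0, 7]))  # 7 [0, 1]
print(plain.to_host(7), plain.report_luns([0, 7]))  # 7 [0, 7]
```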

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@...e.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)