linux-kernel - Re: [PATCH 5/5] virtio-scsi: introduce multiqueue support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Tue, 4 Sep 2012 11:46:28 +0300
From:	"Michael S. Tsirkin" <mst@...hat.com>
To:	Paolo Bonzini <pbonzini@...hat.com>
Cc:	"Nicholas A. Bellinger" <nab@...ux-iscsi.org>,
	linux-kernel@...r.kernel.org, linux-scsi@...r.kernel.org,
	kvm@...r.kernel.org, rusty@...tcorp.com.au, jasowang@...hat.com,
	virtualization@...ts.linux-foundation.org,
	Christoph Hellwig <hch@....de>, Jens Axboe <axboe@...nel.dk>,
	target-devel <target-devel@...r.kernel.org>
Subject: Re: [PATCH 5/5] virtio-scsi: introduce multiqueue support

On Tue, Sep 04, 2012 at 08:46:12AM +0200, Paolo Bonzini wrote:
> Il 04/09/2012 04:21, Nicholas A. Bellinger ha scritto:
> >> @@ -112,6 +118,9 @@ static void virtscsi_complete_cmd(struct virtio_scsi *vscsi, void *buf)
> >>  	struct virtio_scsi_cmd *cmd = buf;
> >>  	struct scsi_cmnd *sc = cmd->sc;
> >>  	struct virtio_scsi_cmd_resp *resp = &cmd->resp.cmd;
> >> +	struct virtio_scsi_target_state *tgt = vscsi->tgt[sc->device->id];
> >> +
> >> +	atomic_dec(&tgt->reqs);
> >>  
> > 
> > As tgt->tgt_lock is taken in virtscsi_queuecommand_multi() before the
> > atomic_inc_return(tgt->reqs) check, it seems like using atomic_dec() w/o
> > smp_mb__after_atomic_dec or tgt_lock access here is not using atomic.h
> > accessors properly, no..?
> 
> No, only a single "thing" is being accessed, and there is no need to
> order the decrement with respect to preceding or subsequent accesses to
> other locations.
>
> In other words, tgt->reqs is already synchronized with itself, and that
> is enough.

I think your logic is correct and barrier is not needed,
but this needs better documentation.

> (Besides, on x86 smp_mb__after_atomic_dec is a nop).
> >> +static int virtscsi_queuecommand_multi(struct Scsi_Host *sh,
> >> +				       struct scsi_cmnd *sc)
> >> +{
> >> +	struct virtio_scsi *vscsi = shost_priv(sh);
> >> +	struct virtio_scsi_target_state *tgt = vscsi->tgt[sc->device->id];
> >> +	unsigned long flags;
> >> +	u32 queue_num;
> >> +
> >> +	/* Using an atomic_t for tgt->reqs lets the virtqueue handler
> >> +	 * decrement it without taking the spinlock.
> >> +	 */

Above comment is not really helpful - reader can be safely assumed to
know what atomic_t is.

Please delete, and replace with the text from commit log
that explains the heuristic used to select req_vq.

Also please add a comment near 'reqs' definition.
Something like "number of outstanding requests - used to detect idle
target".


> >> +	spin_lock_irqsave(&tgt->tgt_lock, flags);

Looks like this lock can be removed - req_vq is only
modified when target is idle and only used when it is
not idle.

> >> +	if (atomic_inc_return(&tgt->reqs) == 1) {
> >> +		queue_num = smp_processor_id();
> >> +		while (unlikely(queue_num >= vscsi->num_queues))
> >> +			queue_num -= vscsi->num_queues;
> >> +		tgt->req_vq = &vscsi->req_vqs[queue_num];
> >> +	}
> >> +	spin_unlock_irqrestore(&tgt->tgt_lock, flags);
> >> +	return virtscsi_queuecommand(vscsi, tgt, sc);
> >> +}
> >> +
> >> +

.....

> >> +static int virtscsi_queuecommand_single(struct Scsi_Host *sh,
> >> +                                       struct scsi_cmnd *sc)
> >> +{
> >> +       struct virtio_scsi *vscsi = shost_priv(sh);
> >> +       struct virtio_scsi_target_state *tgt = vscsi->tgt[sc->device->id];
> >> +
> >> +       atomic_inc(&tgt->reqs);
> >> +       return virtscsi_queuecommand(vscsi, tgt, sc);
> >> +}
> >> +

Here, reqs is unused - why bother incrementing it?
A branch on completion would be cheaper IMHO.


>virtio-scsi multiqueue has a performance benefit up to 20%

To be fair, you could be running in single queue mode.
In that case extra atomics and indirection that this code
brings will just add overhead without benefits.
I don't know how significant would that be.

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/