Message-ID: <53ABAA34.6070006@kernel.dk>
Date: Wed, 25 Jun 2014 23:05:56 -0600
From: Jens Axboe <axboe@...nel.dk>
To: Ming Lei <ming.lei@...onical.com>, linux-kernel@...r.kernel.org
CC: Rusty Russell <rusty@...tcorp.com.au>, linux-api@...r.kernel.org,
virtualization@...ts.linux-foundation.org,
"Michael S. Tsirkin" <mst@...hat.com>,
Stefan Hajnoczi <stefanha@...hat.com>,
Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [PATCH v2 0/2] block: virtio-blk: support multi vq per virtio-blk
On 2014-06-25 20:08, Ming Lei wrote:
> Hi,
>
> These patches try to support multiple virtual queues (multi-vq) in one
> virtio-blk device, and map each virtual queue (vq) to a blk-mq
> hardware queue.
>
> With this approach, both the scalability and the performance of a
> virtio-blk device can be improved.
>
> To verify the improvement, I implemented virtio-blk multi-vq on top of
> qemu's dataplane feature. Both handling host notifications
> from each vq and processing host I/O are still kept in the per-device
> iothread context. The change is based on the qemu v2.0.0 release and
> can be accessed from the tree below:
>
> git://kernel.ubuntu.com/ming/qemu.git #v2.0.0-virtblk-mq.1
>
> To enable the multi-vq feature, 'num_queues=N' needs to be added to
> '-device virtio-blk-pci ...' on the qemu command line. I suggest passing
> 'vectors=N+1' to keep one MSI irq vector per vq. The feature
> depends on x-data-plane.
>
> Fio (libaio, randread, iodepth=64, bs=4K, jobs=N) is run inside the VM to
> verify the improvement.
>
> I just created a small quad-core VM and ran fio inside it, with
> num_queues of the virtio-blk device set to 2, but the
> improvement is still obvious.
>
> 1), about scalability
> - without multi-vq feature
> -- jobs=2, throughput: 145K iops
> -- jobs=4, throughput: 100K iops
> - with multi-vq feature
> -- jobs=2, throughput: 193K iops
> -- jobs=4, throughput: 202K iops
>
> 2), about throughput
> - without multi-vq feature
> -- throughput: 145K iops
> - with multi-vq feature
> -- throughput: 202K iops
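The device options described in the quoted message can be assembled mechanically. What follows is only a minimal sketch in Python: the helper name and the drive id `drive0` are hypothetical, and the `x-data-plane=on` spelling is an assumption about how dataplane is switched on. Only `num_queues=N`, `vectors=N+1` and `virtio-blk-pci` come from the message itself, and `num_queues` is a property of the patched qemu tree referenced above.

```python
def virtio_blk_device_arg(num_queues, drive_id="drive0"):
    """Build the value passed to qemu's '-device' for a multi-vq disk.

    drive_id and 'x-data-plane=on' are illustrative assumptions;
    num_queues and vectors follow the quoted description.
    """
    if num_queues < 1:
        raise ValueError("num_queues must be at least 1")
    # vectors=N+1, as suggested in the quoted message
    vectors = num_queues + 1
    return (
        f"virtio-blk-pci,drive={drive_id},x-data-plane=on,"
        f"num_queues={num_queues},vectors={vectors}"
    )


if __name__ == "__main__":
    # The test setup in the quoted message used num_queues=2.
    print("-device", virtio_blk_device_arg(2))
```

The helper only formats a string; whether a given qemu build accepts these properties depends on the tree it was built from.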
Of these numbers, I think it's important to highlight that the 2 thread
case is 33% faster with multi-vq, and that at 4 threads multi-vq delivers
roughly twice the pre-patch throughput (202K vs 100K iops), while the
pre-patch case sees negative scaling going from 2 -> 4 threads (145K ->
100K iops, about -31%).
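The relative changes can be rechecked directly from the quoted fio figures. This is a small sketch in Python; the function name and the way the runs are paired for comparison are my own choices, while the iops values are the ones reported in the quoted message (in thousands).

```python
def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Throughput in K iops, keyed by (configuration, fio jobs),
# taken from the quoted results.
iops = {
    ("single-vq", 2): 145,
    ("single-vq", 4): 100,
    ("multi-vq", 2): 193,
    ("multi-vq", 4): 202,
}

comparisons = [
    ("multi-vq vs single-vq, jobs=2",
     pct_change(iops[("single-vq", 2)], iops[("multi-vq", 2)])),
    ("multi-vq vs single-vq, jobs=4",
     pct_change(iops[("single-vq", 4)], iops[("multi-vq", 4)])),
    ("single-vq, jobs 2 -> 4",
     pct_change(iops[("single-vq", 2)], iops[("single-vq", 4)])),
    ("multi-vq, jobs 2 -> 4",
     pct_change(iops[("multi-vq", 2)], iops[("multi-vq", 4)])),
]

if __name__ == "__main__":
    for label, value in comparisons:
        print(f"{label}: {value:+.1f}%")
```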
I haven't run your patches yet, but from looking at the code, it looks
good. It's pretty straightforward. So feel free to add my Reviewed-by.
Rusty, do you want to ack this (and I'll slurp it up for 3.17) or take
this yourself? Or something else?
--
Jens Axboe