Date:	Fri, 2 Oct 2015 09:57:26 +0000
From:	Rafal Mielniczuk <rafal.mielniczuk@...rix.com>
To:	Bob Liu <bob.liu@...cle.com>,
	"xen-devel@...ts.xen.org" <xen-devel@...ts.xen.org>
CC:	David Vrabel <david.vrabel@...rix.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Roger Pau Monne <roger.pau@...rix.com>,
	"konrad.wilk@...cle.com" <konrad.wilk@...cle.com>,
	Felipe Franciosi <felipe.franciosi@...rix.com>,
	"axboe@...com" <axboe@...com>,
	"hch@...radead.org" <hch@...radead.org>,
	"avanzini.arianna@...il.com" <avanzini.arianna@...il.com>,
	"boris.ostrovsky@...cle.com" <boris.ostrovsky@...cle.com>,
	Jonathan Davies <Jonathan.Davies@...rix.com>
Subject: Re: [PATCH v3 0/9] xen-block: support multi hardware-queues/rings

On 05/09/15 13:40, Bob Liu wrote:
> Note: These patches are based on the original work of Arianna's internship
> for GNOME's Outreach Program for Women.
>
> The first patch, which just converts the xen-blkfront driver to the blk-mq
> API, has already been applied by David.
>
> After switching to the blk-mq API, a guest has more than one (nr_vcpus)
> software request queue associated with each block front. These queues can be
> mapped over several rings (hardware queues) to the backend, making it easy
> for us to run multiple threads on the backend for a single virtual disk.
>
> By having different threads issue requests at the same time, the performance
> of the guest can be improved significantly.
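
(Side note for anyone reproducing this: the number of blk-mq hardware
contexts, i.e. rings, that the frontend ended up with can be checked in
sysfs. This is just a sketch; the device name xvdb is an example and the
paths assume a blk-mq capable kernel:

    ls /sys/block/xvdb/mq/             # one directory per hardware context (ring)
    cat /sys/block/xvdb/mq/*/cpu_list  # which vCPUs map to each ring
)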
>
> The test was done with the null_blk driver:
> dom0: v4.2-rc8 16vcpus 10GB "modprobe null_blk"
> domu: v4.2-rc8 16vcpus 10GB
>
> [test]
> rw=read or randread
> direct=1
> ioengine=libaio
> bs=4k
> time_based
> runtime=30
> filename=/dev/xvdb
> numjobs=16
> iodepth=64
> iodepth_batch=64
> iodepth_batch_complete=64
> group_reporting
>
> Seqread:
> 	dom0 	domU(no_mq) 	domU(4 queues) 	 8 queues 	16 queues
> iops:  1308k        690k        1380k(+200%)        1238k           1471k
>
> Randread:
> 	dom0 	domU(no_mq) 	domU(4 queues) 	 8 queues 	16 queues
> iops:  1310k        279k        810k(+200%)          871k           1000k
>
> With only 4 queues, IOPS for domU improve a lot and nearly catch up with
> dom0. There were similarly large improvements for writes and for real SSD
> storage.
>
> ---
> v3: Rebased to v4.2-rc8
>
> Bob Liu (9):
>   xen-blkfront: convert to blk-mq APIs
>   xen-block: add document for mutli hardware queues/rings
>   xen/blkfront: separate per ring information out of device info
>   xen/blkfront: pseudo support for multi hardware queues/rings
>   xen/blkfront: convert per device io_lock to per ring ring_lock
>   xen/blkfront: negotiate the number of hw queues/rings with backend
>   xen/blkback: separate ring information out of struct xen_blkif
>   xen/blkback: pseudo support for multi hardware queues/rings
>   xen/blkback: get number of hardware queues/rings from blkfront
>
>  drivers/block/xen-blkback/blkback.c |  373 +++++-----
>  drivers/block/xen-blkback/common.h  |   53 +-
>  drivers/block/xen-blkback/xenbus.c  |  376 ++++++----
>  drivers/block/xen-blkfront.c        | 1343 ++++++++++++++++++++---------------
>  include/xen/interface/io/blkif.h    |   32 +
>  5 files changed, 1278 insertions(+), 899 deletions(-)
>
Hello,

Below are the results for sequential reads executed in the guest against an
Intel P3700 SSD dom0 backend. The SSD is equipped with 16 hardware queues,
which makes it a good candidate for the multi-queue measurements.

dom0: v4.2 16vcpus 4GB
domU: v4.2 16vcpus 10GB

fio --name=test --ioengine=libaio \
    --time_based=1 --runtime=30 --ramp_time=15 \
    --filename=/dev/xvdc --direct=1 --group_reporting=1 \
    --iodepth=16 --iodepth_batch=16 --iodepth_batch_complete=16 \
    --numjobs=16 --rw=read --bs=$bs
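
The block size $bs was swept over the values listed in the table below. A
minimal sketch of the wrapper, assuming a plain bash loop (the actual harness
used is not shown here):

for bs in 512 1k 2k 4k 8k 16k 32k 64k 128k; do
    fio --name=test --ioengine=libaio \
        --time_based=1 --runtime=30 --ramp_time=15 \
        --filename=/dev/xvdc --direct=1 --group_reporting=1 \
        --iodepth=16 --iodepth_batch=16 --iodepth_batch_complete=16 \
        --numjobs=16 --rw=read --bs=$bs
done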

IOPS by block size (bs) and number of blkfront queues:

bs      1 queue   2 queues   4 queues   8 queues   16 queues
512     583K      757K       930K       995K       976K
1K      557K      832K       908K       931K       956K
2K      585K      794K       927K       975K       948K
4K      546K      709K       700K       754K       820K
8K      357K      414K       414K       414K       414K
16K     172K      194K       207K       207K       207K
32K     91K       99K        103K       103K       103K
64K     42K       51K        51K        51K        51K
128K    21K       25K        25K        25K        25K

With an increasing number of queues in blkfront we see a gradual improvement
in IOPS, especially for the small block sizes; with larger block sizes we hit
the limits of the disk sooner.

Rafal

