Message-ID: <1344641012.22564.294.camel@haakon2.linux-iscsi.org>
Date: Fri, 10 Aug 2012 16:23:32 -0700
From: "Nicholas A. Bellinger" <nab@...ux-iscsi.org>
To: target-devel <target-devel@...r.kernel.org>,
linux-scsi <linux-scsi@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Cc: qemu-devel <qemu-devel@...gnu.org>,
kvm-devel <kvm@...r.kernel.org>,
lf-virt <virtualization@...ts.linux-foundation.org>,
"Michael S. Tsirkin" <mst@...hat.com>,
Stefan Hajnoczi <stefanha@...il.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Anthony Liguori <anthony@...emonkey.ws>,
Christoph Hellwig <hch@....de>, hare@...e.de,
James Bottomley <James.Bottomley@...senPartnership.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Jens Axboe <jaxboe@...ionio.com>
Subject: virtio-scsi <-> vhost multi lun/adapter performance results with
3.6-rc0
Hi folks,
The following are initial virtio-scsi + target vhost benchmark results
using multiple target LUNs per vhost and multiple virtio PCI adapters to
scale the total number of virtio-scsi LUNs into a single KVM guest.
The test setup is currently using 4x SCSI LUNs per vhost WWPN, with 8x
virtio PCI adapters for a total of 32x 500MB ramdisk LUNs into a single
guest, along with each backend setting emulate_write_cache=1 to expose
WCE=1 via virtio-scsi to SCSI core.
Using a KVM guest with 32x vCPUs and 4G memory, the results for 4k
random I/O now look like:
workload         | jobs | 25% write / 75% read | 75% write / 25% read
-----------------|------|----------------------|---------------------
1x rd_mcp LUN    | 8    | ~155K IOPs           | ~145K IOPs
16x rd_mcp LUNs  | 16   | ~315K IOPs           | ~305K IOPs
32x rd_mcp LUNs  | 16   | ~425K IOPs           | ~410K IOPs
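To put the scaling in perspective, here is a small sketch that turns the table into a speedup relative to the 1x LUN case and an approximate per-LUN rate. The dictionary keys and the function name are made up for illustration, and the 1x row ran with 8 jobs rather than 16, so the speedup is only a rough comparison.

```python
# IOPS figures (in thousands) copied from the table above.
RESULTS = {
    1: {"25w_75r": 155, "75w_25r": 145},
    16: {"25w_75r": 315, "75w_25r": 305},
    32: {"25w_75r": 425, "75w_25r": 410},
}


def scaling(results, mix):
    """Return {luns: (speedup vs. 1 LUN, K IOPS per LUN)} for one r/w mix."""
    base = results[1][mix]
    return {
        luns: (round(row[mix] / base, 2), round(row[mix] / luns, 1))
        for luns, row in results.items()
    }


if __name__ == "__main__":
    for mix in ("25w_75r", "75w_25r"):
        for luns, (speedup, per_lun) in scaling(RESULTS, mix).items():
            print(f"{mix} {luns:>2} LUNs: {speedup}x vs 1 LUN, ~{per_lun}K IOPS/LUN")
```

Going from 1 to 32 LUNs gives well under a 3x gain in aggregate IOPs, so the per-LUN rate drops considerably as LUNs are added.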
The full fio randrw results for the six test cases are attached below.
Also, a workload with fio numjobs > 16 currently makes performance
fall off pretty sharply regardless of the number of vCPUs.
For comparison, running a similar workload with loopback SCSI ports on
bare-metal produces ~1M random IOPs with 12x LUNs + numjobs=32. At
numjobs=16 here with vhost, the 16x LUN configuration ends up in the
range of ~310K IOPs for the current sweet spot.
Here is a more detailed breakdown of the test setup:
- host hardware:
*) Dual Xeon-E5-2687W (Romley-EP) 3.10 GHz w/ 32x threads +
32 GB of DDR3 1600 MHz memory
- host kernel:
*) Using 3.6-rc0 from target-pending/for-linus
*) qemu vhost-scsi from nab's qemu-kvm.git/vhost-scsi on k.o
*) Set QEMU vCPU process affinity to dedicated cpus based on
'info cpus' (as recommended by Stefan)
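The vCPU pinning step could be scripted roughly as follows. This is only a sketch: it assumes the monitor's 'info cpus' output carries a "CPU #N: ... thread_id=T" field on each line (the exact format differs between QEMU versions), and the function names are hypothetical. os.sched_setaffinity() is the standard Python call on Linux and needs enough privileges to change the affinity of the QEMU threads.

```python
import os
import re

# Assumed shape of one line of 'info cpus' output; adjust for your QEMU.
THREAD_ID_RE = re.compile(r"CPU #(\d+):.*thread_id=(\d+)")


def parse_info_cpus(text):
    """Map vCPU index -> host thread id from 'info cpus' text."""
    return {int(m.group(1)): int(m.group(2)) for m in THREAD_ID_RE.finditer(text)}


def pin_plan(vcpu_tids, host_cpus):
    """Pair each vCPU thread with one dedicated host CPU, in vCPU order."""
    if len(host_cpus) < len(vcpu_tids):
        raise ValueError("not enough host CPUs for a 1:1 pinning")
    ordered = sorted(vcpu_tids.items())
    return {tid: host_cpus[i] for i, (_, tid) in enumerate(ordered)}


def apply_plan(plan):
    """Pin every thread to its CPU (Linux only, needs privileges)."""
    for tid, cpu in plan.items():
        os.sched_setaffinity(tid, {cpu})
```

Keeping the plan separate from apply_plan() makes it easy to print and check the mapping before touching a running guest.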
target backstores + vhost configuration from rtsadmin/targetcli shell:
/> ls backstores/rd_mcp/
o- rd_mcp ................................................. [32 Storage Objects]
o- ramdisk0 .............................................. [ramdisk activated]
o- ramdisk1 .............................................. [ramdisk activated]
o- ramdisk10 ............................................. [ramdisk activated]
o- ramdisk11 ............................................. [ramdisk activated]
o- ramdisk12 ............................................. [ramdisk activated]
o- ramdisk13 ............................................. [ramdisk activated]
o- ramdisk14 ............................................. [ramdisk activated]
o- ramdisk15 ............................................. [ramdisk activated]
o- ramdisk16 ............................................. [ramdisk activated]
o- ramdisk17 ............................................. [ramdisk activated]
o- ramdisk18 ............................................. [ramdisk activated]
o- ramdisk19 ............................................. [ramdisk activated]
o- ramdisk2 .............................................. [ramdisk activated]
o- ramdisk20 ............................................. [ramdisk activated]
o- ramdisk21 ............................................. [ramdisk activated]
o- ramdisk22 ............................................. [ramdisk activated]
o- ramdisk23 ............................................. [ramdisk activated]
o- ramdisk24 ............................................. [ramdisk activated]
o- ramdisk25 ............................................. [ramdisk activated]
o- ramdisk26 ............................................. [ramdisk activated]
o- ramdisk27 ............................................. [ramdisk activated]
o- ramdisk28 ............................................. [ramdisk activated]
o- ramdisk29 ............................................. [ramdisk activated]
o- ramdisk3 .............................................. [ramdisk activated]
o- ramdisk30 ............................................. [ramdisk activated]
o- ramdisk31 ............................................. [ramdisk activated]
o- ramdisk4 .............................................. [ramdisk activated]
o- ramdisk5 .............................................. [ramdisk activated]
o- ramdisk6 .............................................. [ramdisk activated]
o- ramdisk7 .............................................. [ramdisk activated]
o- ramdisk8 .............................................. [ramdisk activated]
o- ramdisk9 .............................................. [ramdisk activated]
/> ls vhost/
o- vhost ........................................................... [8 Targets]
o- naa.60014053fd613910 ............................... [naa.600140539e23ee71]
| o- luns ........................................................... [4 LUNs]
| o- lun0 ...................................... [rd_mcp/ramdisk5 (ramdisk)]
| o- lun1 ..................................... [rd_mcp/ramdisk23 (ramdisk)]
| o- lun2 ..................................... [rd_mcp/ramdisk24 (ramdisk)]
| o- lun3 ..................................... [rd_mcp/ramdisk25 (ramdisk)]
o- naa.60014058fd33725f ............................... [naa.6001405b0dcc8c8f]
| o- luns ........................................................... [4 LUNs]
| o- lun0 ...................................... [rd_mcp/ramdisk2 (ramdisk)]
| o- lun1 ..................................... [rd_mcp/ramdisk14 (ramdisk)]
| o- lun2 ..................................... [rd_mcp/ramdisk15 (ramdisk)]
| o- lun3 ..................................... [rd_mcp/ramdisk16 (ramdisk)]
o- naa.60014059af47b6a0 ............................... [naa.600140567c5ac7f1]
| o- luns ........................................................... [4 LUNs]
| o- lun0 ...................................... [rd_mcp/ramdisk0 (ramdisk)]
| o- lun1 ...................................... [rd_mcp/ramdisk8 (ramdisk)]
| o- lun2 ...................................... [rd_mcp/ramdisk9 (ramdisk)]
| o- lun3 ..................................... [rd_mcp/ramdisk10 (ramdisk)]
o- naa.6001405dfae0c05b ............................... [naa.6001405ce8ccfc96]
| o- luns ........................................................... [4 LUNs]
| o- lun0 ...................................... [rd_mcp/ramdisk4 (ramdisk)]
| o- lun1 ..................................... [rd_mcp/ramdisk20 (ramdisk)]
| o- lun2 ..................................... [rd_mcp/ramdisk21 (ramdisk)]
| o- lun3 ..................................... [rd_mcp/ramdisk22 (ramdisk)]
o- naa.6001405e0c55744e ............................... [naa.600140569bdc4c76]
| o- luns ........................................................... [4 LUNs]
| o- lun0 ...................................... [rd_mcp/ramdisk3 (ramdisk)]
| o- lun1 ..................................... [rd_mcp/ramdisk17 (ramdisk)]
| o- lun2 ..................................... [rd_mcp/ramdisk18 (ramdisk)]
| o- lun3 ..................................... [rd_mcp/ramdisk19 (ramdisk)]
o- naa.6001405e6b23dd27 ............................... [naa.600140503c996b52]
| o- luns ........................................................... [4 LUNs]
| o- lun0 ...................................... [rd_mcp/ramdisk7 (ramdisk)]
| o- lun1 ..................................... [rd_mcp/ramdisk29 (ramdisk)]
| o- lun2 ..................................... [rd_mcp/ramdisk30 (ramdisk)]
| o- lun3 ..................................... [rd_mcp/ramdisk31 (ramdisk)]
o- naa.6001405e86ec6fbc ............................... [naa.6001405948737af9]
| o- luns ........................................................... [4 LUNs]
| o- lun0 ...................................... [rd_mcp/ramdisk6 (ramdisk)]
| o- lun1 ..................................... [rd_mcp/ramdisk26 (ramdisk)]
| o- lun2 ..................................... [rd_mcp/ramdisk27 (ramdisk)]
| o- lun3 ..................................... [rd_mcp/ramdisk28 (ramdisk)]
o- naa.6001405f2a1036cf ............................... [naa.6001405a44e2740d]
o- luns ........................................................... [4 LUNs]
o- lun0 ...................................... [rd_mcp/ramdisk1 (ramdisk)]
o- lun1 ..................................... [rd_mcp/ramdisk11 (ramdisk)]
o- lun2 ..................................... [rd_mcp/ramdisk12 (ramdisk)]
o- lun3 ..................................... [rd_mcp/ramdisk13 (ramdisk)]
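The LUN-to-ramdisk assignment above follows a regular pattern when the targets are taken in the order of the QEMU -vhost-scsi options (vhost-scsi0..7) rather than the sorted targetcli listing: lun0 of target t is ramdisk t, and lun1..lun3 are the next three unused ramdisks starting at 8 + 3*t. A minimal sketch that reproduces it, with a made-up function name:

```python
def lun_layout(num_targets=8, luns_per_target=4):
    """Return one list of ramdisk names per target, indexed by LUN number."""
    extra = luns_per_target - 1  # LUNs per target beyond lun0
    layout = []
    for t in range(num_targets):
        first = num_targets + t * extra  # first ramdisk used for lun1
        indices = [t] + list(range(first, first + extra))
        layout.append([f"ramdisk{i}" for i in indices])
    return layout


if __name__ == "__main__":
    for t, luns in enumerate(lun_layout()):
        print(f"vhost-scsi{t}: " + ", ".join(luns))
```

For example, target 5 (naa.60014053fd613910) comes out as ramdisk5, ramdisk23, ramdisk24, ramdisk25, matching the listing.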
- guest kernel:
*) 3.5.0-rc2 from target-pending/for-next-merge w/ virtio-scsi LUN scan
bugfix applied
*) Use noop scheduler for all virtio-scsi LUNs
*) Set virtio-queue IRQ affinity to dedicated vCPUs
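The two guest-side tunings can be expressed as a list of sysfs/procfs writes. The sketch below only builds the (path, value) pairs; the IRQ numbers passed in are placeholders, since the real ones have to be read from /proc/interrupts in the guest, and the helper names are hypothetical. The single hex mask format used here is only valid for up to 32 CPUs, which covers this 32 vCPU guest.

```python
def scheduler_writes(devices, scheduler="noop"):
    """One write per block device to select its I/O scheduler."""
    return [(f"/sys/block/{dev}/queue/scheduler", scheduler) for dev in devices]


def irq_affinity_writes(irqs, first_vcpu=0):
    """Give each IRQ its own vCPU, as a hex CPU bitmask for smp_affinity."""
    return [
        (f"/proc/irq/{irq}/smp_affinity", format(1 << (first_vcpu + i), "x"))
        for i, irq in enumerate(irqs)
    ]


def apply_writes(writes):
    """Perform the writes (needs root inside the guest)."""
    for path, value in writes:
        with open(path, "w") as f:
            f.write(value)
```

For instance, irq_affinity_writes([40, 41], first_vcpu=4) yields masks "10" and "20", i.e. vCPU 4 and vCPU 5.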
QEMU cli opts:
./x86_64-softmmu/qemu-system-x86_64 -enable-kvm -smp 32 -m 4096 \
  -serial file:/tmp/vhost-serial.txt \
  -hda /usr/src/debian_squeeze_amd64_standard-old.qcow2 \
  -vhost-scsi id=vhost-scsi0,wwpn=naa.60014059af47b6a0,tpgt=1 \
  -device virtio-scsi-pci,vhost-scsi=vhost-scsi0,event_idx=off \
  -vhost-scsi id=vhost-scsi1,wwpn=naa.6001405f2a1036cf,tpgt=1 \
  -device virtio-scsi-pci,vhost-scsi=vhost-scsi1,event_idx=off \
  -vhost-scsi id=vhost-scsi2,wwpn=naa.60014058fd33725f,tpgt=1 \
  -device virtio-scsi-pci,vhost-scsi=vhost-scsi2,event_idx=off \
  -vhost-scsi id=vhost-scsi3,wwpn=naa.6001405e0c55744e,tpgt=1 \
  -device virtio-scsi-pci,vhost-scsi=vhost-scsi3,event_idx=off \
  -vhost-scsi id=vhost-scsi4,wwpn=naa.6001405dfae0c05b,tpgt=1 \
  -device virtio-scsi-pci,vhost-scsi=vhost-scsi4,event_idx=off \
  -vhost-scsi id=vhost-scsi5,wwpn=naa.60014053fd613910,tpgt=1 \
  -device virtio-scsi-pci,vhost-scsi=vhost-scsi5,event_idx=off \
  -vhost-scsi id=vhost-scsi6,wwpn=naa.6001405e86ec6fbc,tpgt=1 \
  -device virtio-scsi-pci,vhost-scsi=vhost-scsi6,event_idx=off \
  -vhost-scsi id=vhost-scsi7,wwpn=naa.6001405e6b23dd27,tpgt=1 \
  -device virtio-scsi-pci,vhost-scsi=vhost-scsi7,event_idx=off
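Since the per-adapter options repeat with only the index and WWPN changing, they can be generated instead of typed by hand. A sketch, assuming the -vhost-scsi option syntax from the qemu-kvm.git/vhost-scsi tree used here (it is not a general QEMU option), with a made-up helper name:

```python
# WWPNs in the order they appear on the command line above.
WWPNS = [
    "naa.60014059af47b6a0", "naa.6001405f2a1036cf",
    "naa.60014058fd33725f", "naa.6001405e0c55744e",
    "naa.6001405dfae0c05b", "naa.60014053fd613910",
    "naa.6001405e86ec6fbc", "naa.6001405e6b23dd27",
]


def vhost_scsi_args(wwpns, tpgt=1):
    """Build the -vhost-scsi/-device argument pairs, one pair per WWPN."""
    args = []
    for i, wwpn in enumerate(wwpns):
        args += [
            "-vhost-scsi", f"id=vhost-scsi{i},wwpn={wwpn},tpgt={tpgt}",
            "-device", f"virtio-scsi-pci,vhost-scsi=vhost-scsi{i},event_idx=off",
        ]
    return args


if __name__ == "__main__":
    print(" \\\n  ".join(" ".join(pair) for pair in
                         zip(*[iter(vhost_scsi_args(WWPNS))] * 2)))
```

The returned list can be appended to the base qemu-system-x86_64 arguments and handed to subprocess.run() as-is.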
Using the following fio base options:
[randrw]
rw=randrw
rwmixwrite=25
rwmixread=75
ioengine=libaio
direct=1
size=100G
iodepth=64
iodepth_batch=4
iodepth_batch_complete=32
numjobs=32
blocksize=4k
filename=/dev/sdb
filename=/dev/sdc
filename=/dev/sdd
filename=/dev/sde
filename=/dev/sdf
filename=/dev/sdg
filename=/dev/sdh
filename=/dev/sdi
filename=/dev/sdj
filename=/dev/sdk
filename=/dev/sdl
filename=/dev/sdm
filename=/dev/sdn
filename=/dev/sdo
filename=/dev/sdp
filename=/dev/sdq
filename=/dev/sdr
filename=/dev/sds
filename=/dev/sdt
filename=/dev/sdu
filename=/dev/sdv
filename=/dev/sdw
filename=/dev/sdx
filename=/dev/sdy
filename=/dev/sdz
filename=/dev/sdaa
filename=/dev/sdab
filename=/dev/sdac
filename=/dev/sdad
filename=/dev/sdae
filename=/dev/sdaf
filename=/dev/sdag
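The job file above can be regenerated for a different LUN count, job count or read/write mix with something like the following sketch. The option names and values are the ones listed above; the function names and defaults are only illustrative, and the output keeps the same one-filename-per-line layout.

```python
import string


def sd_name(index):
    """0 -> 'sda', 25 -> 'sdz', 26 -> 'sdaa' (bijective base-26 suffix)."""
    suffix = ""
    n = index + 1
    while n > 0:
        n, rem = divmod(n - 1, 26)
        suffix = string.ascii_lowercase[rem] + suffix
    return "sd" + suffix


def fio_job(num_luns=32, write_pct=25, numjobs=32, first_index=1):
    """Render a [randrw] job covering num_luns devices starting at sdb."""
    lines = [
        "[randrw]", "rw=randrw",
        f"rwmixwrite={write_pct}", f"rwmixread={100 - write_pct}",
        "ioengine=libaio", "direct=1", "size=100G", "iodepth=64",
        "iodepth_batch=4", "iodepth_batch_complete=32",
        f"numjobs={numjobs}", "blocksize=4k",
    ]
    lines += [f"filename=/dev/{sd_name(first_index + i)}" for i in range(num_luns)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(fio_job(num_luns=16, write_pct=75, numjobs=16), end="")
```

Calling fio_job() with the defaults gives the 32-device list from /dev/sdb through /dev/sdag.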
Guest lsscsi output:
[0:0:0:0] disk ATA QEMU HARDDISK 1.1. /dev/sda
[1:0:0:0] cd/dvd QEMU QEMU DVD-ROM 1.1. /dev/sr0
[2:0:0:0] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdb
[2:0:0:1] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdc
[2:0:0:2] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdd
[2:0:0:3] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sde
[3:0:0:0] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdf
[3:0:0:1] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdg
[3:0:0:2] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdh
[3:0:0:3] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdi
[4:0:0:0] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdj
[4:0:0:1] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdk
[4:0:0:2] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdl
[4:0:0:3] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdm
[5:0:0:0] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdn
[5:0:0:1] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdo
[5:0:0:2] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdp
[5:0:0:3] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdq
[6:0:0:0] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdr
[6:0:0:1] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sds
[6:0:0:2] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdt
[6:0:0:3] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdu
[7:0:0:0] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdv
[7:0:0:1] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdw
[7:0:0:2] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdx
[7:0:0:3] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdy
[8:0:0:0] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdz
[8:0:0:1] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdaa
[8:0:0:2] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdab
[8:0:0:3] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdac
[9:0:0:0] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdad
[9:0:0:1] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdae
[9:0:0:2] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdaf
[9:0:0:3] disk LIO-ORG RAMDISK-MCP 4.0 /dev/sdag
and the relevant virtio PCI layout:
00:04.0 SCSI storage controller: Red Hat, Inc Device 1004
Subsystem: Red Hat, Inc Device 0008
Flags: bus master, fast devsel, latency 0, IRQ 11
I/O ports at c040 [size=64]
Memory at febf1000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [40] MSI-X: Enable+ Count=2 Masked-
Kernel driver in use: virtio-pci
00:05.0 SCSI storage controller: Red Hat, Inc Device 1004
Subsystem: Red Hat, Inc Device 0008
Flags: bus master, fast devsel, latency 0, IRQ 10
I/O ports at c080 [size=64]
Memory at febf2000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [40] MSI-X: Enable+ Count=2 Masked-
Kernel driver in use: virtio-pci
00:06.0 SCSI storage controller: Red Hat, Inc Device 1004
Subsystem: Red Hat, Inc Device 0008
Flags: bus master, fast devsel, latency 0, IRQ 10
I/O ports at c0c0 [size=64]
Memory at febf3000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [40] MSI-X: Enable+ Count=2 Masked-
Kernel driver in use: virtio-pci
00:07.0 SCSI storage controller: Red Hat, Inc Device 1004
Subsystem: Red Hat, Inc Device 0008
Flags: bus master, fast devsel, latency 0, IRQ 11
I/O ports at c100 [size=64]
Memory at febf4000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [40] MSI-X: Enable+ Count=2 Masked-
Kernel driver in use: virtio-pci
00:08.0 SCSI storage controller: Red Hat, Inc Device 1004
Subsystem: Red Hat, Inc Device 0008
Flags: bus master, fast devsel, latency 0, IRQ 11
I/O ports at c140 [size=64]
Memory at febf5000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [40] MSI-X: Enable+ Count=2 Masked-
Kernel driver in use: virtio-pci
00:09.0 SCSI storage controller: Red Hat, Inc Device 1004
Subsystem: Red Hat, Inc Device 0008
Flags: bus master, fast devsel, latency 0, IRQ 10
I/O ports at c180 [size=64]
Memory at febf6000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [40] MSI-X: Enable+ Count=2 Masked-
Kernel driver in use: virtio-pci
00:0a.0 SCSI storage controller: Red Hat, Inc Device 1004
Subsystem: Red Hat, Inc Device 0008
Flags: bus master, fast devsel, latency 0, IRQ 10
I/O ports at c1c0 [size=64]
Memory at febf7000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [40] MSI-X: Enable+ Count=2 Masked-
Kernel driver in use: virtio-pci
00:0b.0 SCSI storage controller: Red Hat, Inc Device 1004
Subsystem: Red Hat, Inc Device 0008
Flags: bus master, fast devsel, latency 0, IRQ 11
I/O ports at c200 [size=64]
Memory at febf8000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [40] MSI-X: Enable+ Count=2 Masked-
Kernel driver in use: virtio-pci
View attachment "fio-1x-rd_mcp-25-75-4k.txt" of type "text/plain" (9343 bytes)
View attachment "fio-1x-rd_mcp-75-25-4k.txt" of type "text/plain" (9383 bytes)
View attachment "fio-16x-rd_mcp-25-75-4k.txt" of type "text/plain" (20703 bytes)
View attachment "fio-16x-rd_mcp-75-25-4k.txt" of type "text/plain" (20676 bytes)
View attachment "fio-32x-rd_mcp-25-75-4k.txt" of type "text/plain" (22525 bytes)
View attachment "fio-32x-rd_mcp-75-25-4k.txt" of type "text/plain" (22068 bytes)