Message-Id: <20111005195403.407628164@bombadil.infradead.org>
Date: Wed, 05 Oct 2011 15:54:03 -0400
From: Christoph Hellwig <hch@...radead.org>
To: Rusty Russell <rusty@...tcorp.com.au>
Cc: Chris Wright <chrisw@...s-sol.org>, Jens Axboe <axboe@...nel.dk>,
Stefan Hajnoczi <stefanha@...ux.vnet.ibm.com>,
kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: [PATCH 0/5] RFC: ->make_request support for virtio-blk
This patchset allows the virtio-blk driver to support the much higher
IOPS rates that can be driven out of modern PCI-e flash devices. At this
point it really is just an RFC due to various issues.

The first four patches are infrastructure that could go in fairly
soon as far as I'm concerned. Patch 5 implements the actual ->make_request
support and still has a few issues, see there for more details. With
it I can drive my PCI-e test devices to 85-90% of the native IOPS
and bandwidth, but be warned that this is still a fairly low end setup
as far as expensive flash storage is concerned.

One big downside is that it currently exposes a nasty race
in the qemu virtqueue code - just running xfstests inside a guest
using the new virtio-blk driver (even on a slow device) will trigger
it and lead to a filesystem shutdown. I've tracked it down to
data I/O segments getting overwritten with status s/g list entries, but
got lost at that point. I can start a separate thread on it.

Besides that it is missing a few features, and we have to decide
how to select which mode to use in virtio-blk - either a module option,
a sysfs attribute or something that the host communicates. Or maybe
decide that just going with ->make_request alone is fine: even on
my cheap laptop SSD it is just as fast as, if not slightly faster
than, the request based variant.

There are a few other bottlenecks in virtio that this exposes. The
first one is the low queue length of just 128 entries in the virtio-blk
queue - to drive higher IOPS with a deep queue we absolutely need
to increase that.

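To illustrate why the queue length caps the achievable rate, here is a
rough back-of-the-envelope sketch using Little's law (requests in flight
= rate x latency). It is only a steady-state approximation and is not part
of the patchset; the function names and any latency figure plugged into
them are hypothetical, chosen purely for illustration.

```c
#include <stdint.h>

/*
 * Upper bound on IOPS for a queue that holds at most `queue_depth`
 * requests in flight, assuming each request takes `latency_us`
 * microseconds to complete (Little's law: depth = rate * latency).
 * `latency_us` is an assumed input, not a measured value.
 */
static uint64_t max_iops(uint64_t queue_depth, uint64_t latency_us)
{
	if (latency_us == 0)
		return 0; /* avoid division by zero; no meaningful bound */
	return queue_depth * 1000000ULL / latency_us;
}

/*
 * Smallest queue depth needed to sustain `target_iops` at the same
 * assumed per-request latency, rounded up to a whole entry.
 */
static uint64_t required_queue_depth(uint64_t target_iops, uint64_t latency_us)
{
	return (target_iops * latency_us + 999999ULL) / 1000000ULL;
}
```

For example, with an assumed 100 microsecond round trip per request,
max_iops(128, 100) gives a ceiling of 1280000 IOPS, while
required_queue_depth(2000000, 100) says 200 entries would be needed to
sustain two million IOPS, more than the current 128.
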
Comments welcome!