Message-ID: <540058CB.2030704@parallels.com>
Date: Fri, 29 Aug 2014 14:41:15 +0400
From: Maxim Patlasov <mpatlasov@...allels.com>
To: Zach Brown <zab@...bo.net>
CC: Ming Lei <ming.lei@...onical.com>,
Benjamin LaHaise <bcrl@...ck.org>,
"axboe@...nel.dk" <axboe@...nel.dk>,
Christoph Hellwig <hch@...radead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Dave Kleikamp <dave.kleikamp@...cle.com>,
Kent Overstreet <kmo@...erainc.com>,
open list: AIO <linux-aio@...ck.org>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
Dave Chinner <david@...morbit.com>
Subject: Re: [PATCH v1 5/9] block: loop: convert to blk-mq
On 8/28/14, Zach Brown <zab@...bo.net> wrote:
> On Wed, Aug 27, 2014 at 09:19:36PM +0400, Maxim Patlasov wrote:
>> On 08/27/2014 08:29 PM, Benjamin LaHaise wrote:
>>> On Wed, Aug 27, 2014 at 08:08:59PM +0400, Maxim Patlasov wrote:
>>> ...
>>>> 1) /dev/loop0 of 3.17.0-rc1 with Ming's patches applied -- 11K iops
>>>> 2) the same as above, but call loop_queue_work() directly from
>>>> loop_queue_rq() -- 270K iops
>>>> 3) /dev/nullb0 of 3.17.0-rc1 -- 380K iops
>>>>
>>>> Taking into account such a big difference (11K vs. 270K), would it be
>>>> worthwhile to implement a purely non-blocking version of
>>>> aio_kernel_submit() that returns an error if it would have to block?
>>>> Then the loop driver (or any other in-kernel user) could first try
>>>> that non-blocking submit as a fast path and, only if it fails, fall
>>>> back to queueing.
>>> What filesystem is the backing file for loop0 on? O_DIRECT access as
>>> Ming's patches use should be non-blocking, and if not, that's something
>>> to fix.
>> I used loop0 directly on top of the null_blk driver (because my goal was
>> to measure the overhead of processing requests in a separate thread).
> The relative overhead while doing nothing else. While zooming way down
> into microbenchmarks is fun and all, testing on an fs on brd might be
> more representative and so more compelling.
The measurements on an fs on brd are even more outrageous (the same fio
script I posted a few messages above):
1) Baseline: no loopback device involved.
fio on /dev/ram0: 467K iops
fio on ext4 over /dev/ram0: 378K iops
2) Loopback device from 3.17.0-rc1 with Ming's patches (v1) applied:
fio on /dev/loop0 over /dev/ram0: 10K iops
fio on ext4 over /dev/loop0 over /dev/ram0: 9K iops
3) The same as above, but avoiding the extra context switch (calling
loop_queue_work() directly from loop_queue_rq()):
fio on /dev/loop0 over /dev/ram0: 267K iops
fio on ext4 over /dev/loop0 over /dev/ram0: 223K iops
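To be concrete about what (3) changes, it is roughly the following shape
in the blk-mq path (an illustration only, not Ming's actual diff; the
names loop_wq, struct loop_cmd and loop_queue_work() are approximations
of what his patches use):

#include <linux/blk-mq.h>
#include <linux/workqueue.h>

static int loop_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *rq)
{
	struct loop_cmd *cmd = blk_mq_rq_to_pdu(rq);

	/* Variant (2): hand every request to a worker thread. */
	queue_work(loop_wq, &cmd->work);

	/*
	 * Variant (3): handle the request right here instead, saving
	 * the wakeup and the context switch:
	 *
	 *	loop_queue_work(&cmd->work);
	 */
	return BLK_MQ_RQ_QUEUE_OK;
}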
The problem is not the huge relative overhead while doing nothing else.
It's rather the extra latency the queueing introduces (~100 microseconds
on the commodity h/w I used, roughly the per-request difference implied
by 10K vs. 267K iops), which might be noticeable on modern SSDs (and h/w
RAIDs with caching).
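And the non-blocking fast path I was asking about would look something
like this (same caveats as the sketch above; loop_submit_nonblock() does
not exist, it just stands for a flavour of aio_kernel_submit() that
returns -EAGAIN instead of blocking):

static int loop_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *rq)
{
	struct loop_cmd *cmd = blk_mq_rq_to_pdu(rq);

	/* Fast path: submit the aio from this context if nothing blocks. */
	if (loop_submit_nonblock(cmd) != -EAGAIN)
		return BLK_MQ_RQ_QUEUE_OK;

	/* Slow path: the submit would have to sleep, defer to the worker. */
	queue_work(loop_wq, &cmd->work);
	return BLK_MQ_RQ_QUEUE_OK;
}

That way the common case would cost about what the direct-call numbers
above show, and the worker would only run when the backing file really
has to sleep.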
Thanks,
Maxim