Message-ID: <x49d3bomv7w.fsf@segfault.boston.devel.redhat.com>
Date: Fri, 16 Dec 2011 09:45:07 -0500
From: Jeff Moyer <jmoyer@...hat.com>
To: Chris Mason <chris.mason@...cle.com>
Cc: Dave Kleikamp <dave.kleikamp@...cle.com>, linux-aio@...ck.org,
linux-kernel@...r.kernel.org, Jens Axboe <axboe@...nel.dk>,
Andi Kleen <ak@...ux.intel.com>
Subject: Re: [PATCH] AIO: Don't plug the I/O queue in do_io_submit()

Chris Mason <chris.mason@...cle.com> writes:

> On Tue, Dec 13, 2011 at 05:26:07PM -0600, Dave Kleikamp wrote:
>> On 12/13/2011 04:18 PM, Jeff Moyer wrote:
>> > Dave Kleikamp <dave.kleikamp@...cle.com> writes:
>> >
>> >> Asynchronous I/O latency to a solid-state disk greatly increased
>> >> between the 2.6.32 and 3.0 kernels. By removing the plug from
>> >> do_io_submit(), we observed a 34% improvement in the I/O latency.
>> >>
>> >> Unfortunately, at this level, we don't know if the request is to
>> >> a rotating disk or not.
>> >
>> > I'm guessing I know the answer to this, but what workload were you
>> > testing, and can you provide more concrete evidence than "latency
>> > greatly increased?"
>>
>> It is a piece of a larger industry-standard benchmark and you're
>> probably guessing correctly. The "greatly increased" latency was
>> actually slightly higher than the improvement I get with this patch. So the
>> patch brought the latency nearly down to where it was before.
>>
>> I will try a microbenchmark to see if I get similar behavior, but I
>> wanted to throw this out here to get input.
>
> The better IO latency did bump the overall benchmark score by 3%, and it
> did end up bringing our latencies on par with Solaris runs on similar
> hardware.
>
> We didn't find this one through exhaustive tracing...instead we used a more
> traditional approach involving a list of Jens' commits and a dart board.
> So, we don't have a lot of data yet on exactly why the plug is hurting.
>
> But, I'm starting to wonder if the plug makes sense here at all. We're
> queueing up IO in the main submit loop, and the aio submit might be
> spanning any number of devices on a large variety of filesystems. The
> actual direct IO call may be pretty expensive.

I believe the original plugging here was done on a per fd basis. So, I
concede that the behaviour may have changed a bit since the initial
patch for this was merged.

> My guess for why this helps is contention on the aio context lock
> between the submission code and the end_io softirq code. We bash on
> that lock a number of times during the IO submit, and the whole time
> we're holding on to our list of plugged IOs instead of giving the
> hardware the chance to process them.

I have a patch slated for 3.2 that should help that. It batches the
allocation of the aio requests, which showed a good improvement in
microbenchmarks there.

commit 080d676de095a14ecba14c0b9a91acb5bbb634df
Author: Jeff Moyer <jmoyer@...hat.com>
Date:   Wed Nov 2 13:40:10 2011 -0700

    aio: allocate kiocbs in batches
Cheers,
Jeff