linux-kernel - Re: [PATCH] AIO: Don't plug the I/O queue in do_io

Date:	Fri, 16 Dec 2011 09:45:07 -0500
From:	Jeff Moyer <jmoyer@...hat.com>
To:	Chris Mason <chris.mason@...cle.com>
Cc:	Dave Kleikamp <dave.kleikamp@...cle.com>, linux-aio@...ck.org,
	linux-kernel@...r.kernel.org, Jens Axboe <axboe@...nel.dk>,
	Andi Kleen <ak@...ux.intel.com>
Subject: Re: [PATCH] AIO: Don't plug the I/O queue in do_io_submit()

Chris Mason <chris.mason@...cle.com> writes:

> On Tue, Dec 13, 2011 at 05:26:07PM -0600, Dave Kleikamp wrote:
>> On 12/13/2011 04:18 PM, Jeff Moyer wrote:
>> > Dave Kleikamp <dave.kleikamp@...cle.com> writes:
>> > 
>> >> Asynchronous I/O latency to a solid-state disk greatly increased
>> >> between the 2.6.32 and 3.0 kernels. By removing the plug from
>> >> do_io_submit(), we observed a 34% improvement in the I/O latency.
>> >>
>> >> Unfortunately, at this level, we don't know if the request is to
>> >> a rotating disk or not.
>> > 
>> > I'm guessing I know the answer to this, but what workload were you
>> > testing, and can you provide more concrete evidence than "latency
>> > greatly increased?"
>> 
>> It is a piece of a larger industry-standard benchmark and you're
>> probably guessing correctly. The "greatly increased" latency was
>> actually slightly higher than the improvement I get with this patch. So the
>> patch brought the latency nearly down to where it was before.
>> 
>> I will try a microbenchmark to see if I get similar behavior, but I
>> wanted to throw this out here to get input.
>
> The better IO latency did bump the overall benchmark score by 3%, and it
> did end up bringing our latencies on par with Solaris runs on similar
> hardware.
>
> We didn't find this one through exhaustive tracing...instead we used a more
> traditional approach involving a list of Jens' commits and a dart board.
> So, we don't have a lot of data yet on exactly why the plug is hurting.
>
> But, I'm starting to wonder if the plug makes sense here at all.  We're
> queueing up IO in the main submit loop, and the aio submit might be
> spanning any number of devices on a large variety of filesystems.  The
> actual direct IO call may be pretty expensive.

I believe the original plugging here was done on a per fd basis.  So, I
concede that the behaviour may have changed a bit since the initial
patch for this was merged.

> My guess for why this helps is contention on the aio context lock
> between the submission code and the end_io softirq code.  We bash on
> that lock a number of times during the IO submit, and the whole time
> we're holding on to our list of plugged IOs instead of giving the
> hardware the chance to process them.

I have a patch slated for 3.2 that should help that.  It batches the
allocation of the aio requests, which showed a good improvement in
microbenchmarks there.

commit 080d676de095a14ecba14c0b9a91acb5bbb634df
Author: Jeff Moyer <jmoyer@...hat.com>
Date:   Wed Nov 2 13:40:10 2011 -0700

    aio: allocate kiocbs in batches
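
As a rough illustration of why batching the allocation can reduce traffic on a shared lock, here is a toy userspace model. It is not the kernel code from the commit above: the names toy_ctx, toy_alloc_one_by_one and toy_alloc_batched are hypothetical, the counter merely stands in for taking the aio context lock, and the batch size is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: counts how often a shared context lock would be
 * taken while setting up `nr` request structures. */
struct toy_ctx {
	size_t lock_acquisitions;	/* stand-in for lock/unlock round trips */
	size_t allocated;		/* requests set up so far */
};

/* One lock round trip per request. */
static void toy_alloc_one_by_one(struct toy_ctx *ctx, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		ctx->lock_acquisitions++;	/* lock would be taken here */
		ctx->allocated++;		/* set up a single request */
	}
}

/* One lock round trip per batch of up to `batch` requests. */
static void toy_alloc_batched(struct toy_ctx *ctx, size_t nr, size_t batch)
{
	assert(batch > 0);
	while (nr > 0) {
		size_t n = nr < batch ? nr : batch;

		ctx->lock_acquisitions++;	/* lock taken once per batch */
		ctx->allocated += n;		/* set up n requests under it */
		nr -= n;
	}
}
```

In this model, 100 requests cost 100 lock round trips one by one, but only 4 with a batch size of 32. Whether the real workload sees a comparable benefit depends on how contended the lock actually is.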

Cheers,
Jeff