Message-ID: <20120724223110.GQ23387@dastard>
Date:	Wed, 25 Jul 2012 08:31:10 +1000
From:	Dave Chinner <david@...morbit.com>
To:	Ankit Jain <jankit@...e.de>
Cc:	Al Viro <viro@...iv.linux.org.uk>, bcrl@...ck.org,
	linux-fsdevel@...r.kernel.org, linux-aio@...ck.org,
	linux-kernel@...r.kernel.org, Jan Kara <jack@...e.cz>
Subject: Re: [RFC][PATCH] Make io_submit non-blocking

On Tue, Jul 24, 2012 at 05:11:05PM +0530, Ankit Jain wrote:
> 
> Currently, io_submit tries to execute the io requests on the
> same thread, which could block because of various reasons (e.g.
> allocation of disk blocks). So, essentially, io_submit ends
> up being a blocking call.
> 
> With this patch, io_submit prepares all the kiocbs and then
> adds (kicks) them to ctx->run_list (kicked) in one go and then
> schedules the workqueue. The actual operations are not executed
> on io_submit's process context, so it can return very quickly.
> 
> This run_list is processed either on a workqueue or in response to
> an io_getevents call. This utilizes the existing retry infrastructure.
> 
> It uses override_creds/revert_creds to use the submitting process'
> credentials when processing the iocb request from the workqueue. This
> is required for proper support of quota and reserved block access.
> 
> Currently, we use block plugging in io_submit, since most of the IO
> was being done there itself. This patch moves it to aio_kick_handler
> and aio_run_all_iocbs, where the IO gets submitted.
> 
> All the tests were run with ext4.
> 
> I tested the patch with fio
>  (fio rand-rw-disk.fio --max-jobs=2 --latency-log
>  --bandwidth-log)
> 
> **Unpatched**
> read : io=102120KB, bw=618740 B/s, iops=151 , runt=169006msec
> slat (usec): min=275 , max=87560 , avg=6571.88, stdev=2799.57

Hmmm, I had to check the numbers twice - that's only 600KB/s.

Perhaps you need to test on something more than a single piece of
spinning rust. Optimising AIO for SSD rates (say 100k 4k write IOPS)
is probably more relevant to the majority of AIO users....
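
As a rough sanity check on the figures above, the short sketch below
recomputes the request size implied by the unpatched fio read line and
compares the measured bandwidth with the 100k IOPS target. The 4 KiB
block size is taken from the "4k" figure in this mail; the variable
names are only illustrative.

```python
# Figures copied from the unpatched fio read line quoted in this mail.
read_bw_bytes_per_s = 618740
read_iops = 151

# Average request size implied by the report; should be close to 4096.
bytes_per_io = read_bw_bytes_per_s / read_iops

# Bandwidth of an SSD-class target: 100k IOPS of 4 KiB requests.
BLOCK_SIZE = 4096  # assumed 4 KiB requests
ssd_iops = 100_000
ssd_bw_mib_per_s = ssd_iops * BLOCK_SIZE / 2**20

print(f"implied request size: {bytes_per_io:.0f} bytes")
print(f"measured bandwidth:   {read_bw_bytes_per_s / 1024:.0f} KiB/s")
print(f"100k IOPS at 4 KiB:   {ssd_bw_mib_per_s:.1f} MiB/s")
```

The gap between the two bandwidth figures is the point: the test device
is several hundred times slower than the rates most AIO users care about.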

> write: io=102680KB, bw=622133 B/s, iops=151 , runt=169006msec
> slat (usec): min=2 , max=196 , avg=24.66, stdev=20.35
> 
> **Patched**
> read : io=102864KB, bw=504885 B/s, iops=123 , runt=208627msec
> slat (usec): min=0 , max=120 , avg= 1.65, stdev= 3.46 
> 
> write: io=101936KB, bw=500330 B/s, iops=122 , runt=208627msec
> slat (usec): min=0 , max=131 , avg= 1.85, stdev= 3.27 

So you made ext4 20% slower at random 4k writes with worst case
latencies only improving by about 30%. That, I think, is a
non-starter....
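
For reference, the percentages can be reproduced from the quoted fio
write lines with a few lines of arithmetic. This is only a sketch: it
looks at write bandwidth and maximum write slat, and the helper name is
made up for illustration.

```python
def percent_drop(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0

# Write figures copied from the quoted fio output.
write_bw_unpatched = 622133       # B/s
write_bw_patched = 500330         # B/s
write_slat_max_unpatched = 196    # usec
write_slat_max_patched = 131      # usec

throughput_loss = percent_drop(write_bw_unpatched, write_bw_patched)
worst_case_gain = percent_drop(write_slat_max_unpatched, write_slat_max_patched)

print(f"write bandwidth drop:       {throughput_loss:.1f}%")
print(f"max write slat improvement: {worst_case_gain:.1f}%")
```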

Also, you added a memory allocation in the io submit code. Worst
case latency will still be effectively undefined - what happens to
latencies if you generate memory pressure while the test is running?
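
One crude way to generate memory pressure alongside the fio run is to
have a separate process allocate and touch a large buffer. The sketch
below is an assumption about how such a test could be set up, not part
of the patch or the original test; the page size, buffer size and hold
time are placeholders, and a dedicated stress tool would normally be a
better choice.

```python
import time

PAGE_SIZE = 4096  # assumed page size; common on x86, not universal

def hog_memory(mib):
    """Allocate `mib` MiB and write one byte per page so the pages are really backed."""
    buf = bytearray(mib * 1024 * 1024)
    for offset in range(0, len(buf), PAGE_SIZE):
        buf[offset] = 1
    return buf

if __name__ == "__main__":
    # Placeholder size and duration; a real test would use a size close
    # to the machine's free RAM and hold it for the whole fio run.
    held = hog_memory(64)
    time.sleep(1)
    print(f"held {len(held) // 2**20} MiB")
```

Running the benchmark with and without such a process in the background
would show whether the new allocation in the submit path shows up in the
slat maximum.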

FWIW, if you are going to change generic code, you need to present
results for other filesystems as well (xfs, btrfs are typical), as
they may not have the same problems as ext4 or react the same way to
your change. The result might simply be "it is 20% slower"....

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com