Date:	Mon, 8 Jun 2009 19:40:05 +0200
From:	Corrado Zoccolo <czoccolo@...il.com>
To:	Linux-Kernel <linux-kernel@...r.kernel.org>,
	Jens Axboe <jens.axboe@...cle.com>
Subject: Low latency I/O scheduling

Hi,
some time ago, the issue of the high latencies introduced by I/O
scheduling was raised on this list.
I've done some study and identified a way to improve on workloads
that have more than one random reader.

The problem:
CFQ partitions requests into queues, so that requests coming from
different processes land in different queues, and each queue gets its
own timeslice. With many processes, the number of queues grows
linearly, and so does the worst-case I/O latency.
This works well when the processes do sequential I/O: within a
timeslice, subsequent requests complete much faster. But for random
I/O, it introduces too much latency without any bandwidth improvement
in return.

The solution:
Divide requests into two classes. Sequential requests are put in
per-process queues, while random requests go into a single shared data
structure (e.g. an RB-tree), and the scheduler alternates between
serving the sequential queues and the shared random pool.

A proof-of-concept implementation of the idea, based on AS (called
'fas' = Fair Anticipatory Scheduling in what follows), is provided in
the attached file. I ran some tests (the fio configuration to
replicate them is also attached) and got the following improvements:

2 parallel readers (both random) test:
* fas has lowest max and avg latencies, even lower than noop.

4 random readers/1 seq writer test:
* max latency is lowest for fas (218ms, compared with 617ms of CFQ).
CFQ has slightly lower avg latency, probably due to different
bandwidth allocation between readers and writer.

32 random readers/1 seq writer test:
* fas has lowest max and avg latency: 1.314s and 304ms resp., compared
with CFQ: 7.63s, 391ms resp. (while still providing higher bandwidth
both to the readers and the writer than CFQ).

When fewer than 2 random readers are present, the heuristic doesn't
come into play, but the tests still show slight performance
differences from CFQ, mainly due to different timeslice allocations.
In particular, one relevant difference is that the time allocated to
each process performing sequential I/O is inversely proportional to
the number of such processes (with a guaranteed minimum), and the
balance between sequential and random I/O is struck by varying the
total time allowed for sequential processes.


All test results data follows (all tables are sorted by bandwidth):

Sequential read test:
BW (KiB/s), max latency (ms), avg latency (ms)
noop: 32457, 18, 0.124
deadline: 31746, 21, 0.127
as: 31014, 18, 0.13
fas: 30229, 18, 0.134
cfq: 30138, 18, 0.134

Sequential write test:
BW (KiB/s), max latency (ms), avg latency (ms)
as: 28082, 473, 0.132
noop: 27851, 283, 0.134
fas: 27417, 309, 0.136
deadline: 26971, 1924, 0.138
cfq: 26626, 414, 0.132

2 parallel readers (both sequential) test:
Aggr BW (KiB/s), max latency (ms), avg latency (ms)
fas: 31924, 311, 0.255
cfq: 31916, 150, 0.255
as: 31863, 260, 0.255
noop: 20690, 64, 0.394
deadline: 20196, 64, 0.404

2 parallel readers (1 seq, 1 random):
Random BW (KiB/s), Seq BW (KiB/s), max latency (ms), avg latency (ms)
fas: 195, 16681, 186, 10
cfq: 191, 16129, 152, 10
as: 95, 24611, 269, 21
deadline: 75, 13581, 101, 27
noop: 72, 13240, 102, 28

2 parallel writers (both sequential) test:
Aggr BW (KiB/s), max latency (ms), avg latency (ms)
noop: 25979, 573, 0.286
fas: 25972, 1264, 0.266
as: 25131, 1025, 0.2745
deadline: 25086, 899, 0.276
cfq: 23469, 1066, 0.311

2 parallel readers (both random) test:
Aggr BW (KiB/s), max latency (ms), avg latency (ms)
as: 477, 173, 17
fas: 458, 39, 17
deadline: 456, 48, 17
cfq: 451, 136, 18
noop: 450, 44, 18

4 random readers/1 seq writer test:
READ: Aggr BW (KiB/s), max latency (ms), avg latency (ms)
cfq: 425, 617, 30
fas: 402, 218, 32
deadline: 389, 235, 33
as: 361, 340, 36
noop: 96, 400, 135
WRITE: BW (KiB/s), max latency (ms), avg latency (ms)
noop: 21920, 354, 0.037
as: 10014, 3493, 0.0814
deadline: 7665, 5383, 0.1064
fas: 5669, 10599, 0.144
cfq: 4009, 15076, 0.2038

32 random readers/1 seq writer test:
READ: Aggr BW (KiB/s), max latency (ms), avg latency (ms)
fas: 414, 1314, 304
deadline: 363, 1704, 345
cfq: 326, 7630, 391
as: 319, 1919, 395
noop: 66, 2132, 1526
WRITE: BW (KiB/s), max latency (ms), avg latency (ms)
noop: 19839, 784, 0.0053
as: 18302, 8534, 0.0059
fas: 4416, 23507, 0.025
cfq: 3792, 16797, 0.026
deadline: 2484, 5629, 0.042

-- 
__________________________________________________________________________

dott. Corrado Zoccolo                          mailto:czoccolo@...il.com
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------

View attachment "fas-iosched.c" of type "text/x-csrc" (38599 bytes)

Download attachment "test2.fio" of type "application/octet-stream" (6067 bytes)
