linux-kernel - [PATCH 0/8] cfq-iosched: Use vdisktime based scheduling logic for cfq queues [V2]

Message-Id: <1349732700-2694-1-git-send-email-vgoyal@redhat.com>
Date:	Mon,  8 Oct 2012 17:44:52 -0400
From:	Vivek Goyal <vgoyal@...hat.com>
To:	linux-kernel@...r.kernel.org, axboe@...nel.dk
Cc:	vgoyal@...hat.com, jmoyer@...hat.com, tj@...nel.org
Subject: [PATCH 0/8] cfq-iosched: Use vdisktime based scheduling logic for cfq queues [V2]

Hi,

This is V2 of the patch series to use the same scheduling logic for cfq
queues as we use for cfq groups. It applies on top of the cfq cleanup
changes I posted here:

http://lkml.indiana.edu/hypermail/linux/kernel/1210.0/01966.html

Both patch series have been generated on top of 3.6 in Linus' tree.

Why change the scheduling algorithm
===================================
Currently we use two scheduling algorithms at two different layers:
a vdisktime based algorithm for groups and round robin for cfq queues.
We are planning to do more development in cfqq so that it can handle
group hierarchies, and I think before we do that we first need to
change the code so that both queues and groups are treated the same
way when it comes to scheduling. Otherwise the whole thing is a mess.

This patch series does not merge the queue and group scheduling code.
It just tries to make these similar enough so that merging of code
becomes easier in future patches.
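
To illustrate the idea behind vdisktime based scheduling, here is a
minimal sketch in Python. It is a toy model and not the code in
block/cfq-iosched.c: the entity names, the weights, the fixed slice
length and the use of a heap (the kernel keeps entities on an rb-tree)
are all assumptions made for illustration. Each entity is charged
virtual time inversely proportional to its weight, and the entity with
the smallest vdisktime runs next.

```python
import heapq


def simulate(weights, slice_ms=100, rounds=10000):
    """Return the total service time (ms) each entity receives.

    weights: dict mapping a hypothetical entity name to its weight.
    Every round, the entity with the smallest vdisktime runs for one
    fixed slice and is then charged slice_ms / weight of virtual time.
    """
    # Min-heap of (vdisktime, name); the name breaks ties deterministically.
    heap = [(0.0, name) for name in sorted(weights)]
    heapq.heapify(heap)
    served = {name: 0 for name in weights}
    for _ in range(rounds):
        vdisktime, name = heapq.heappop(heap)
        served[name] += slice_ms
        # A heavier entity accumulates virtual time more slowly,
        # so it reaches the front of the heap more often.
        heapq.heappush(heap, (vdisktime + slice_ms / weights[name], name))
    return served


# Hypothetical weights, chosen only to make the proportions easy to see.
weights = {"A": 400, "B": 200, "C": 100}
served = simulate(weights)
total = sum(served.values())
for name in sorted(weights):
    print(name, round(served[name] / total, 3))
```

In this model the share of disk time each entity receives converges to
its weight divided by the sum of all weights, which is the kind of
predictability round robin alone does not provide.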

What's the functional impact
============================
Total disk share (time slices) allocated to each prio queue should
become predictable and every queue gets its fair share of disk
in proportion to its prio/weight.

This works only if we idle on the cfq queue (rotational disks and
low-end SSDs). On SSDs whose queue depth exceeds a certain number of
requests we don't idle on queues, so there will be no priority
differentiation between queues.

I did my testing on a SATA rotational disk and launched 8 processes
with prio 0-7, all doing sequential reads. Here are the results.

prio               0      1      2      3      4      5      6      7
vanilla(MB/s)   14.0    9.8    7.6    6.4    5.0    3.4    2.2    1.6
patched(MB/s)   27.5   15.2    8.0    4.8    3.1    2.1    1.3    0.8

Notice that service differentiation of IO between different prio
levels has increased significantly on this disk. I guess that's a good
thing. Roughly, each prio level should get 1.6 times the time slice of
the next lower prio level.
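
As a rough sanity check of the 1.6 factor, the following Python sketch
compares the patched numbers from the table above against an ideal
geometric distribution. It assumes that throughput is a reasonable
proxy for disk time with sequential readers, which is only
approximately true; the variable names are made up for this example.

```python
# Patched throughput in MB/s for prio 0..7, taken from the results table.
patched = [27.5, 15.2, 8.0, 4.8, 3.1, 2.1, 1.3, 0.8]

RATIO = 1.6  # intended time slice ratio between adjacent prio levels

# Ideal relative allocation: prio p gets RATIO times what prio p+1 gets.
ideal = [RATIO ** (7 - p) for p in range(8)]
ideal_share = [x / sum(ideal) for x in ideal]
measured_share = [x / sum(patched) for x in patched]

# Measured ratio between each prio level and the next lower one.
step = [patched[p] / patched[p + 1] for p in range(7)]

for p in range(8):
    print(f"prio {p}: ideal {ideal_share[p]:.3f}  measured {measured_share[p]:.3f}")
print("adjacent ratios:", [round(s, 2) for s in step])
```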

This is easily modifiable in code if people find this kind of
service differentiation too much.

Also note that total throughput of the disk has increased. I think
this is because low prio queues get scheduled fewer times, resulting
in fewer seeks.
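
The aggregate numbers can be recomputed from the table above with a few
lines of Python. This is only arithmetic on the posted figures; the
variable names are illustrative.

```python
# Per-process throughput in MB/s for prio 0..7, from the results table.
vanilla = [14.0, 9.8, 7.6, 6.4, 5.0, 3.4, 2.2, 1.6]
patched = [27.5, 15.2, 8.0, 4.8, 3.1, 2.1, 1.3, 0.8]

total_vanilla = sum(vanilla)
total_patched = sum(patched)
gain_pct = (total_patched - total_vanilla) / total_vanilla * 100

print(f"vanilla total: {total_vanilla:.1f} MB/s")
print(f"patched total: {total_patched:.1f} MB/s")
print(f"change: {gain_pct:+.1f}%")
```

The totals come to about 50.0 MB/s for vanilla and 62.8 MB/s for
patched.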

Thanks
Vivek

Vivek Goyal (8):
  cfq-iosched: Make cfq_scale_slice() usable for both queues and groups
  cfq-iosched: make new_cfqq variable bool
  cfq-iosced: Do the round robin selection of workload type
  cfq-iosched: Put new queue at the end of servie tree always
  cfq-iosched: Remove residual slice logic
  cfq-iosched: put cooperating queue at the front of service tree
  cfq-iosched: Use same scheduling algorithm for groups and queues
  cfq-iosched: Wait for queue to get busy even if this is not last
    queue in group

 block/blk-cgroup.h  |    2 +-
 block/cfq-iosched.c |  313 ++++++++++++++++++++++++++++++---------------------
 2 files changed, 187 insertions(+), 128 deletions(-)

-- 
1.7.7.6
