Date:	Wed, 4 Nov 2009 10:41:35 -0500
From:	Vivek Goyal <vgoyal@...hat.com>
To:	Jeff Moyer <jmoyer@...hat.com>
Cc:	linux-kernel@...r.kernel.org, jens.axboe@...cle.com,
	nauman@...gle.com, dpshah@...gle.com, lizf@...fujitsu.com,
	ryov@...inux.co.jp, fernando@....ntt.co.jp, s-uchida@...jp.nec.com,
	taka@...inux.co.jp, guijianfeng@...fujitsu.com,
	balbir@...ux.vnet.ibm.com, righi.andrea@...il.com,
	m-ikeda@...jp.nec.com, akpm@...ux-foundation.org, riel@...hat.com,
	kamezawa.hiroyu@...fujitsu.com
Subject: Re: [PATCH 03/20] blkio: Introduce the notion of weights

On Wed, Nov 04, 2009 at 10:06:16AM -0500, Jeff Moyer wrote:
> Vivek Goyal <vgoyal@...hat.com> writes:
> 
> > o Introduce the notion of weights. Priorities are mapped to weights internally.
> >   These weights will be useful once IO groups are introduced and group's share
> >   will be decided by the group weight.
> 
> I'm sorry, but I need more background to review this patch.  Where do
> the min and max come from?  Why do you scale 7-0 from 200-900?  How does
> this map to what was there before (exactly, approximately)?
> 

Well, so far we only have the notion of ioprio for a process, and based
on that we determine the time slice length.

Soon we will throw cfq groups into the mix as well. Because cpu IO controller
is weight driven, people have expressed a preference that a group's share
should be decided based on its weight, rather than introducing the notion of
ioprio for groups.

So now the core scheduling algorithm only recognizes weights for entities (be
they cfq queues or cfq groups), and we are required to convert the ioprio
of a cfqq into a weight.

Now it is a matter of deciding what weight range we support and
how ioprio should be mapped onto these weights. We can always change the
mappings, but to begin with I have done the following.

Allow a weight range from 100 to 1000. Allowing too small a weight, like
"1", can lead to very interesting corner cases and I wanted to avoid that
in the first implementation. For example, if some group with weight "1" gets
a time slice of 100ms, its vtime will be really high and after that it
will not get scheduled in for a very long time.

Secondly, allowing too small a weight can make the vtime of the tree move very
fast, with faster wrap around of min_vdisktime. (Especially on SSDs, where
idling might not be enabled and for every queue expiry we will attribute a
minimum of 1ms of slice: if the weight of the group is "1", it will get a
higher vtime and min_vdisktime will move very fast.) We don't want too fast a
wrap around of min_vdisktime (especially in the case of the idle tree; that
infrastructure is not part of the current patches).
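
To illustrate the weight "1" concern, here is a minimal sketch. It assumes a
CFS-style charge in which vtime advances by the service time scaled by a
reference weight over the entity's weight. That formula, the reference weight
of 500, and the names sketch_vtime_delta and SKETCH_BASE_WEIGHT are
illustrative assumptions, not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical reference weight: an entity with this weight is charged
 * vtime equal to the real service time it received. */
#define SKETCH_BASE_WEIGHT 500u

/*
 * Assumed CFS-style scaling (not taken from the patch): the smaller the
 * weight, the faster the entity's vtime advances for the same service.
 * The caller must pass a non-zero weight.
 */
static inline uint64_t sketch_vtime_delta(uint64_t service_ms,
					  unsigned int weight)
{
	return service_ms * SKETCH_BASE_WEIGHT / weight;
}
```

Under these assumptions a 100ms slice advances vtime by 50000 units at weight
1, but only by 500 at weight 100 and by 50 at weight 1000, which is why a very
small weight pushes an entity far into the future and drags min_vdisktime
along quickly.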

Hence, to begin with I wanted to limit the range of weights allowed, because
a wider range opens up a lot of interesting corner cases. That's why I limited
the minimum weight to 100. So at most a user can expect 1000/100 = 10 times
service differentiation between the highest and lowest weight groups. If folks
need more than that, we can look into it once things stabilize.

Prio numbers and weights run in reverse order. A higher prio number (that is,
a lower priority) means a lower weight, and vice versa.

Currently we support 8 priority levels and prio "4" is the middle point.
Each prio number above 4 gets a slice that is shorter by 20% of the prio 4
slice per level, and each prio number below 4 gets a slice that is longer by
20% of the prio 4 slice per level.
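
The 20% rule can be sketched as follows. The 100ms base slice for prio 4 and
the names sketch_prio_to_slice_ms and SKETCH_BASE_SLICE_MS are assumptions
made for illustration. Only the 20%-per-level step around prio 4 comes from
the text above.

```c
/* Hypothetical base slice, in milliseconds, for the middle prio (4). */
#define SKETCH_BASE_SLICE_MS 100
#define SKETCH_MID_PRIO      4
#define SKETCH_NR_PRIO       8

/*
 * Slice length for a given prio number: each level away from prio 4
 * adds or removes 20% (one fifth) of the base slice.
 * Returns -1 for a prio number outside 0..7.
 */
static inline int sketch_prio_to_slice_ms(int prio)
{
	if (prio < 0 || prio >= SKETCH_NR_PRIO)
		return -1;
	return SKETCH_BASE_SLICE_MS +
	       (SKETCH_BASE_SLICE_MS / 5) * (SKETCH_MID_PRIO - prio);
}
```

With these assumed values, prio 0 gets 180ms, prio 4 gets 100ms and prio 7
gets 40ms.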

For the weight range 100 - 1000, 500 can be considered the mid point. This
is what the priority mapping looks like.

	Weight:   100  200  300  400  500  600  700  800  900  1000
	ioprio:          7    6    5    4    3    2    1    0

Once priorities are converted to weights, we are able to retain the notion
of 20% difference between prio levels by choosing 500 as the mid point and
mapping prio 0-7 to weights 900-200, hence this mapping. 
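
A minimal sketch of this mapping, for illustration only. The function and
macro names are hypothetical and the actual patch may express it differently.
What is taken from the text is that prio 0-7 maps linearly to weights 900-200
with prio 4 at 500.

```c
/* Weight assigned to prio 0 in the mapping described above. */
#define SKETCH_PRIO0_WEIGHT   900
/* Weight step per prio level: 20% of the mid-point weight 500. */
#define SKETCH_WEIGHT_STEP    100
#define SKETCH_NR_IOPRIO      8

/*
 * Map an ioprio number (0 = highest priority, 7 = lowest) to a weight.
 * Returns -1 for a prio number outside 0..7.
 */
static inline int sketch_ioprio_to_weight(int ioprio)
{
	if (ioprio < 0 || ioprio >= SKETCH_NR_IOPRIO)
		return -1;
	return SKETCH_PRIO0_WEIGHT - SKETCH_WEIGHT_STEP * ioprio;
}
```

Here sketch_ioprio_to_weight(4) gives 500, and each step of one prio level
changes the weight by 100, which is 20% of that mid point.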

I am all ears if you have any suggestions on how this can be handled
better.

Thanks
Vivek
