Message-ID: <20090417134924.GC29086@redhat.com>
Date:	Fri, 17 Apr 2009 09:49:24 -0400
From:	Vivek Goyal <vgoyal@...hat.com>
To:	Dhaval Giani <dhaval@...ux.vnet.ibm.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>, nauman@...gle.com,
	dpshah@...gle.com, lizf@...fujitsu.com, mikew@...gle.com,
	fchecconi@...il.com, paolo.valente@...more.it,
	jens.axboe@...cle.com, ryov@...inux.co.jp,
	fernando@...ellilink.co.jp, s-uchida@...jp.nec.com,
	taka@...inux.co.jp, guijianfeng@...fujitsu.com,
	arozansk@...hat.com, jmoyer@...hat.com, oz-kernel@...hat.com,
	balbir@...ux.vnet.ibm.com, linux-kernel@...r.kernel.org,
	containers@...ts.linux-foundation.org, menage@...gle.com,
	peterz@...radead.org
Subject: IO Controller discussion (Was: Re: [PATCH 01/10] Documentation)

On Fri, Apr 17, 2009 at 11:05:17AM +0530, Dhaval Giani wrote:
> On Thu, Apr 16, 2009 at 02:37:53PM -0400, Vivek Goyal wrote:
> > On Wed, Apr 08, 2009 at 10:37:59PM +0200, Andrea Righi wrote:
> > 
> > [..]
> > > > 
> > > > - I can think of at least one use of an upper limit controller where we
> > > >   might have spare IO resources but still don't want to give them to a
> > > >   cgroup because the customer has not paid for that kind of service level.
> > > >   In those cases we need to implement an upper limit as well.
> > > > 
> > > >   Maybe proportional weight and max bw controllers can co-exist, depending
> > > >   on what the user's requirements are.
> > > >  
> > > >   If yes, then can't this control be done at the same layer/level where
> > > >   proportional weight control is being done? IOW, this set of patches is
> > > >   trying to do proportional weight control at the IO scheduler level. I
> > > >   think we should be able to store a max rate as another feature in the
> > > >   cgroup (apart from weight) and not dispatch requests from the queue if
> > > >   we have exceeded the max BW specified by the user?
> > > 
> > > The more I think about a "perfect" solution (at least for my
> > > requirements), the more I'm convinced that we need both functionalities.
> > > 
> 
> hard limits vs work conserving argument again :). I agree, we need
> both of the functionalities. I think first the aim should be to get the
> proportional weight functionality and then look at doing hard limits.
> 

Agreed.
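
For illustration only (this is not taken from any of the posted patches), here
is a rough sketch of the idea quoted above: a group stores a max rate next to
its weight, and a request is not dispatched once the cap is exceeded, even if
the disk has spare capacity. IoGroup, max_bps and may_dispatch are hypothetical
names, and the token-bucket accounting is just one assumed way to enforce the
cap.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IoGroup:
    """Hypothetical per-cgroup state: a proportional weight plus an optional cap."""
    weight: int
    max_bps: Optional[float] = None  # None means "no upper limit"
    tokens: float = 0.0              # bytes the group may still dispatch
    last_refill: float = 0.0         # time of the last refill, in seconds

    def refill(self, now: float) -> None:
        """Credit the group for the time elapsed since the last refill."""
        if self.max_bps is None:
            return
        elapsed = now - self.last_refill
        # Assumption: bursts are capped at one second's worth of bandwidth.
        self.tokens = min(self.max_bps, self.tokens + elapsed * self.max_bps)
        self.last_refill = now

    def may_dispatch(self, nbytes: int, now: float) -> bool:
        """Return True and charge the group if the request fits under the cap."""
        if self.max_bps is None:
            return True              # unlimited group is never throttled
        self.refill(now)
        if self.tokens < nbytes:
            return False             # keep the request queued
        self.tokens -= nbytes
        return True
```

In this sketch the weight is only stored; proportional scheduling among groups
that pass the cap check would still happen in the scheduler itself. A group
with max_bps left unset is never throttled, which is the behaviour discussed
further down for a separate unlimited group holding RT tasks.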

> [..]
> 
> > > > 
> > > > - Have you thought of doing hierarchical control? 
> > > > 
> > > 
> > > Providing hierarchies in cgroups is in general expensive: deeper
> > > hierarchies imply checking all the way up to the root cgroup, so I think
> > > we need to be very careful and be aware of the trade-offs before
> > > providing such a feature. For this particular case (IO controller)
> > > wouldn't it be simpler and more efficient to just ignore hierarchies in
> > > the kernel and handle them appropriately in userspace? For absolute
> > > limiting rules this isn't difficult at all: just imagine a config file
> > > and a script or a daemon that dynamically creates the appropriate cgroups
> > > and configures them according to what is defined in the configuration
> > > file.
> > > 
> > > I think we can simply define hierarchical dependencies in the
> > > configuration file, translate them into absolute values and use the
> > > absolute values to configure the cgroups' properties.
> > > 
> > > For example, we can just check that the BW allocated for a particular
> > > parent cgroup is not greater than the total BW allocated for the
> > > children. And for each child just use the min(parent_BW, BW) or equally
> > > divide the parent's BW among the children, etc.
> > 
> > IIUC, you are saying: allow hierarchy in user space, then flatten it
> > out and pass it to the kernel?
> > 
> > Hmm.., I agree that handling hierarchies is hard and expensive. But at the
> > same time the rest of the controllers, like cpu and memory, are handling it
> > in the kernel, so it probably makes sense to keep the IO controller in line.
> > 
> > In practice I am not expecting deep hierarchies. Maybe 2-3 levels would
> > be good for most people.
> > 
> 
> FWIW, even in the CPU controller having deep hierarchies is not a good idea.
> I think this can be documented for IO Controller as well. Beyond that,
> we realized that having a proportional system and doing it in userspace
> is not a good idea. It would require a lot of calculations depending
> on the system load. (Because, the sub-group should be just the same as a
> process in the parent group). Having hierarchy in the kernel just makes it
> much easier and much more accurate.

Agreed. I would prefer to keep hierarchical support in the kernel, in line
with the other controllers.
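
To illustrate why doing this in userspace gets awkward for proportional
weights, here is a rough sketch (mine, not from any patch set) of flattening a
weight hierarchy into absolute shares. The flatten() helper, the nested-dict
layout and the group names are all hypothetical, and the sketch assumes that
only leaf groups contain tasks.

```python
def flatten(tree, share=1.0, path=""):
    """Turn nested {name: (weight, children)} into {path: absolute share}.

    Each level divides its parent's share among siblings in proportion to
    their weights. Assumption: only leaf groups hold tasks.
    """
    total = sum(weight for weight, _ in tree.values())
    flat = {}
    for name, (weight, children) in tree.items():
        node_share = share * weight / total
        node_path = f"{path}/{name}"
        if children:
            flat.update(flatten(children, node_share, node_path))
        else:
            flat[node_path] = node_share
    return flat


# All groups busy: "a" and "b" split the disk equally, "a" splits 1:3 below.
busy = {"a": (1, {"a1": (1, {}), "a2": (3, {})}), "b": (1, {})}
print(flatten(busy))   # {'/a/a1': 0.125, '/a/a2': 0.375, '/b': 0.5}

# If a2 goes idle, a1 should inherit all of a's share, so the flat values
# have to be recomputed and pushed to the kernel again.
a2_idle = {"a": (1, {"a1": (1, {})}), "b": (1, {})}
print(flatten(a2_idle))  # {'/a/a1': 0.5, '/b': 0.5}
```

The flat shares are only valid for one particular set of busy groups, so a
userspace daemon would have to track load and redo this calculation every time
a group becomes idle or active. With the hierarchy in the kernel, that falls
out of the scheduling itself.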

> 
> > > 
> > > > - What happens to the notion of CFQ task classes and task priority? Looks
> > > >   like the max bw rule supersedes everything. There is no way that an RT
> > > >   task gets an unlimited amount of disk BW even if it wants to? (There is
> > > >   no notion of an RT cgroup etc.)
> > > 
> > > What about moving all the RT tasks in a separate cgroup with unlimited
> > > BW?
> > 
> > Hmm.., I think that should work. I have yet to look at your patches in
> > detail but it looks like unlimited BW group will not be throttled at all
> > hence RT tasks can just go right through without getting impacted.
> > 
> 
> This is where the cpu scheduler design helped a lot :). Having different
> classes for different types of processes allowed us to handle them
> separately.

In the common layer scheduling approach, we do have separate classes (RT, BE
and IDLE) and scheduling is done accordingly. The code is primarily taken from
bfq and cfq.

dm-ioband has no notion of separate classes and everything is treated at the
same level, which is a problem: the end-level IO scheduler will lose its
ability to differentiate if we mix things up above it.

Time to play with the max bw controller patches; then I can probably have
more insight into them.

Thanks
Vivek