Date:	Thu, 24 Feb 2011 18:20:52 +0100
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	bharata@...ux.vnet.ibm.com
Cc:	Paul Turner <pjt@...gle.com>, linux-kernel@...r.kernel.org,
	Dhaval Giani <dhaval.giani@...il.com>,
	Balbir Singh <balbir@...ux.vnet.ibm.com>,
	Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>,
	Srivatsa Vaddagiri <vatsa@...ibm.com>,
	Kamalesh Babulal <kamalesh@...ux.vnet.ibm.com>,
	Ingo Molnar <mingo@...e.hu>,
	Pavel Emelyanov <xemul@...nvz.org>,
	Herbert Poetzl <herbert@...hfloor.at>,
	Avi Kivity <avi@...hat.com>,
	Chris Friesen <cfriesen@...tel.com>,
	Nikhil Rao <ncrao@...gle.com>
Subject: Re: [CFS Bandwidth Control v4 3/7] sched: throttle cfs_rq entities
 which exceed their local quota

On Thu, 2011-02-24 at 22:09 +0530, Bharata B Rao wrote:
> On Thu, Feb 24, 2011 at 04:52:53PM +0100, Peter Zijlstra wrote:
> > On Thu, 2011-02-24 at 21:15 +0530, Bharata B Rao wrote:
> > > While I admit that our load balancing semantics wrt throttled entities are
> > > not consistent (we don't allow pulling of tasks directly from throttled
> > > cfs_rqs, while we do allow pulling of tasks from a throttled hierarchy as in
> > > the above case), I am beginning to wonder whether it works out to be advantageous.
> > > Is there a chance that the task gets to run on other CPU where the hierarchy
> > > isn't throttled since runtime is still available ? 
> > 
> > Possible, yes, but the load-balancer doesn't know about that, nor should
> > it (it's complicated, and broken, enough; no need to add more cruft to
> > it).
> > 
> > I'm starting to think you all should just toss all this and start over,
> > it's just too smelly.
> 
> Hmm... You have brought up 3 concerns:
> 
> 1. Hierarchy semantics
> 
> If you look at the hierarchy semantics we currently have while ignoring the
> load balancer interactions for a moment, I guess what we have is a reasonable
> one.
> 
> - Only group entities are throttled
> - Throttled entities are taken off the runqueue and hence they never
>   get picked up for scheduling.
> - New or child entities are queued up to the throttled entities and not
>   further up. As I said in another thread, having the tree intact and correct
>   underneath the throttled entity allows us to rebuild the hierarchy during
>   unthrottling with least amount of effort.

It also gets you into all that load-balancer mess, and I'm not going to
let you off lightly there.

> - Group entities in a hierarchy are throttled independent of each other based
>   on their bandwidth specification.

That's missing quite a few details. For one, there is no mention of the
hierarchical implications of, or constraints on, bandwidth: can children
have more bandwidth than their parent? (I hope not.)
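
One way such a hierarchical constraint could be checked is to compare
bandwidth rates (quota divided by period) between a child and its parent.
The sketch below is only an illustration: the struct bw_spec type and the
bw_child_fits() helper are hypothetical names, not part of the patch set,
and comparing rates alone says nothing about mismatched periods.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bandwidth specification: quota of CPU time per period. */
struct bw_spec {
	uint64_t quota_us;
	uint64_t period_us;
};

/*
 * Return true if the child's rate (quota/period) does not exceed the
 * parent's. Cross-multiplying avoids division; this assumes the
 * products fit in 64 bits.
 */
static bool bw_child_fits(const struct bw_spec *parent,
			  const struct bw_spec *child)
{
	return child->quota_us * parent->period_us <=
	       parent->quota_us * child->period_us;
}
```

With a parent of 500ms per 1s, a child of 50ms per 100ms would pass this
check, while a child of 60ms per 100ms would be rejected.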

> 2. Handling of throttled entities by load balancer
> 
> This definitely needs to improve and be more consistent. We can work on this.

Feh, 'improve' is being nice about it; it needs a complete overhaul. The
current situation is a cobbled-together, leaky mess.

> 3. per-cgroup vs global period specification
> 
> I thought per-cgroup specification would be most flexible and hence started
> out with that. This would allow groups/workloads/VMs to define their
> own bandwidth rate.

Most flexible, yes; most 'interesting' too. If you consider that running a
child task is also running the parent entity, and that you're therefore
consuming bandwidth up the entire hierarchy, what happens when the
parent has a much larger period than the child?

In that case your child doesn't get run while the parent is throttled,
and the child's period is violated.
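
To put rough numbers on that, the sketch below counts how many whole
child periods fall inside the window during which the parent is
throttled. It is a simplified model: child_periods_starved() is a
hypothetical helper, all times are in milliseconds from the start of a
parent period, and child periods are assumed to be aligned to that start.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of complete child periods that lie entirely inside the
 * parent's throttled window, i.e. child periods in which the child
 * cannot run at all.
 */
static uint64_t child_periods_starved(uint64_t parent_period_ms,
				      uint64_t parent_exhausted_at_ms,
				      uint64_t child_period_ms)
{
	/* First child period boundary at or after the throttle point. */
	uint64_t first = (parent_exhausted_at_ms + child_period_ms - 1) /
			 child_period_ms * child_period_ms;

	if (first >= parent_period_ms)
		return 0;
	return (parent_period_ms - first) / child_period_ms;
}
```

For example, with a parent period of 1000ms whose quota is exhausted at
250ms and a child period of 100ms, the child gets no runtime in 7
consecutive child periods, whatever its own quota says.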


> Let us know if you have other design concerns besides these.

Yeah, that weird time accounting muck. Bandwidth should be decremented on
usage and incremented on replenishment; this gets you 0 as the natural
boundary between credit and debt, with no need to keep two variables.
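
A minimal sketch of that single-variable accounting follows. The struct
bw_pool type and the bw_*() helpers are hypothetical names used for
illustration, not the patch set's code, and capping the balance at one
quota on replenishment is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-group bandwidth state with one signed balance. */
struct bw_pool {
	int64_t runtime_ns;	/* > 0: credit left, <= 0: exhausted or in debt */
	int64_t quota_ns;	/* amount added back at each period boundary */
};

/* Charge CPU time actually used; the balance may go negative (debt). */
static void bw_charge(struct bw_pool *p, int64_t delta_ns)
{
	p->runtime_ns -= delta_ns;
}

/* Zero is the boundary: at or below it the group is throttled. */
static bool bw_throttled(const struct bw_pool *p)
{
	return p->runtime_ns <= 0;
}

/*
 * Period boundary: add one quota, which pays off any debt first.
 * Assumption: unused credit is not banked beyond one quota.
 */
static void bw_replenish(struct bw_pool *p)
{
	p->runtime_ns += p->quota_ns;
	if (p->runtime_ns > p->quota_ns)
		p->runtime_ns = p->quota_ns;
}
```

An overrun in one period simply shows up as a smaller balance after the
next replenishment, so no separate "used" and "assigned" counters are
needed in this model.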

Also, the above just about covers everything the patch set does; isn't that
enough justification to throw the thing out and start over?