Message-ID: <20101013054412.GL30810@MAIL.13thfloor.at>
Date:	Wed, 13 Oct 2010 07:44:12 +0200
From:	Herbert Poetzl <herbert@...hfloor.at>
To:	Bharata B Rao <bharata@...ux.vnet.ibm.com>
Cc:	linux-kernel@...r.kernel.org,
	Dhaval Giani <dhaval.giani@...il.com>,
	Balbir Singh <balbir@...ux.vnet.ibm.com>,
	Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>,
	Srivatsa Vaddagiri <vatsa@...ibm.com>,
	Kamalesh Babulal <kamalesh@...ux.vnet.ibm.com>,
	Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Pavel Emelyanov <xemul@...nvz.org>,
	Avi Kivity <avi@...hat.com>,
	Chris Friesen <cfriesen@...tel.com>,
	Paul Menage <menage@...gle.com>,
	Mike Waychison <mikew@...gle.com>,
	Paul Turner <pjt@...gle.com>, Nikhil Rao <ncrao@...gle.com>
Subject: Re: [PATCH v3 0/7] CFS Bandwidth Control

On Tue, Oct 12, 2010 at 01:19:10PM +0530, Bharata B Rao wrote:
> Hi,

> It's been a while since we posted CFS hard limits (aka CFS bandwidth

Indeed. I will see whether I can incorporate those into
the next experimental Linux-VServer patch for testing ...

btw, is it planned to allow for hard limits which
are temporarily disabled when the machine/cpu would
otherwise be idle (i.e. running the idle thread)? Or,
as we solved it, can we artificially advance the
time (for the hard limits) when idle, so that contexts
which have work to do can run without sacrificing
the prioritization or the actual limits?

best,
Herbert

PS: from a quick glance I take it that using large
values for period and quota, while keeping the ratio
the same, allows for 'burst loads'?
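
The question in the PS can be made concrete with a little arithmetic.
What follows is only a minimal sketch, assuming the semantics described
in the quoted text below (a group may consume up to quota microseconds of
CPU time per period) and ignoring slice granularity. The helper name and
the sample values are illustrative and do not come from the patches.

```python
def max_burst_us(quota_us, ncpus):
    """Longest wall-clock stretch (in us) a group can run flat out on
    ncpus CPUs before the quota for the current period is used up.
    Hypothetical helper: assumes the burst starts with a full quota."""
    return quota_us / ncpus


# Two settings with the same quota/period ratio (0.5 CPU) but very
# different period lengths. Sample values are assumptions.
for quota_us, period_us in [(50_000, 100_000), (5_000_000, 10_000_000)]:
    ratio = quota_us / period_us
    burst = max_burst_us(quota_us, ncpus=1)
    throttled = period_us - burst
    print(f"ratio={ratio} burst={burst:.0f}us then throttled={throttled:.0f}us")
```

Under these assumptions the long-run share is identical, but the larger
period lets the group run uninterrupted for 5s instead of 50ms, followed
by a correspondingly longer throttled stretch. Whether the patches intend
this is exactly what the PS asks.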

> control now) patches, hence a quick recap first:

> - I have been working on CFS hard limits since last year and have posted
>   a few versions of the same (last post: http://lkml.org/lkml/2010/1/5/44)
> - Paul Turner and Nikhil Rao meanwhile started working on CFS bandwidth
>   control and have posted a couple of versions.
>   (last post v2: http://lwn.net/Articles/385055/)
> 
> Paul's approach mainly changed the way the CPU hard limit was represented. After
> his post, I decided to work with them and discontinue my patch series since
> their global bandwidth specification for a group appears more flexible than
> the RT-type per-cpu bandwidth specification I had in my series.
> 
> Since Paul seems to be busy, I am taking the liberty of posting the next
> version of his patches with a few enhancements to the slack time handling.
> (more on this later)
> 
> Main changes in this post:
> 
> - Return the unused and remaining local quota at each CPU to the global
>   runtime pool.
> - A few fixes:
> 	- Explicitly wake up the idle cpu during unthrottle.
> 	- Optimally handle throttling of current task within enqueue_task.
> 	- Fix compilation break with CONFIG_PREEMPT on.
> 	- Fix a compilation break at intermediate patch level.
> - Applies on 2.6.36-rc7.
> 
> More about slack time issue
> ---------------------------
> Bandwidth available to a group is specified by two parameters: cpu.cfs_quota_us
> and cpu.cfs_period_us. cpu.cfs_quota_us is the max CPU time a group can
> consume within the time period of cpu.cfs_period_us. The quota available
> to a group within a period is maintained in a per-group global pool. In each
> CPU, the consumption happens by obtaining a slice of this global pool.
> 
> If the local quota (obtained as slices of the global pool) isn't fully consumed
> within a given period, a group can potentially get more CPU time than
> it is allowed in the next interval. This is due to the slack time that may
> be left over from the previous interval. More details about how this is fixed
> are present in the description part of patch 7/7. Here I will only show the
> benefit of handling the slack time correctly through this experiment:
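
The slack-time effect described above can be reproduced in a toy model.
This is a sketch under stated assumptions, not the algorithm in patch 7/7:
the function name, the slice size, the demand pattern, and the choice to
hand leftovers back exactly at the period boundary are all made up for
illustration, and CPUs are served one after another rather than
concurrently.

```python
def run(quota, slice_, demand, return_slack):
    """Toy model of a per-group global pool with per-CPU local slices.

    demand[p][cpu] is the CPU time the group wants on that CPU in
    period p. Returns the CPU time actually consumed in each period.
    """
    ncpus = len(demand[0])
    local = [0] * ncpus          # unconsumed local quota per CPU
    used_per_period = []
    for wants in demand:
        if return_slack:
            # Assumption: leftovers go back to the pool before the
            # refresh, so they cannot be spent on top of a fresh quota.
            local = [0] * ncpus
        pool = quota             # global pool refreshed every period
        used = 0
        for cpu, want in enumerate(wants):
            while want > 0:
                if local[cpu] == 0:
                    grant = min(slice_, pool)
                    if grant == 0:
                        break    # pool exhausted: throttled
                    pool -= grant
                    local[cpu] = grant
                ran = min(want, local[cpu])
                local[cpu] -= ran
                want -= ran
                used += ran
        used_per_period.append(used)
    return used_per_period


# Period 0: both CPUs barely run, stranding 5 units of slack each.
# Period 1: both CPUs want far more than the quota allows.
demand = [[5, 5], [100, 100]]
print(run(quota=100, slice_=10, demand=demand, return_slack=False))
print(run(quota=100, slice_=10, demand=demand, return_slack=True))
```

In this model the group consumes 105 units in the second period when the
slack stays on the CPUs, and exactly the quota of 100 when it is handed
back.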
> 
> On a 16 CPU system, two different kinds of tasks were run as part of a group
> which had quota/period as 500000/500000 [=> 500ms/500ms], which means that
> the group was capped at 1 CPU worth of time every period.
> 
> Type A task: Complete CPU hog.
> Type B task: Sleeps for 500ms, then runs as a CPU hog for the next 500ms;
> 		this cycle repeats.
> 
> 1 task of type A and 15 tasks of type B were run for 20s, each bound to a
> different CPU. At the end of 20s, the CPU time obtained by each of them
> looked like this:
> 
> -----------------------------------------------------------------------
> 			Without returning	Returning slack time
> 			slack time to global	to global pool
> 			pool			(with patch 7/7)
> -----------------------------------------------------------------------
> 1 type A task		7.96s			10.71s
> 15 type B tasks		12.36s			9.79s
> -----------------------------------------------------------------------
> 
> This shows the effects of slack time and the benefit of handling it correctly.
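
The figures in the table can be cross-checked against the configured cap
with a few lines of arithmetic. This is only a sketch: the variable and
function names are illustrative, and the numbers are copied from the
table above.

```python
RUN_SECONDS = 20
CAP_CPUS = 500_000 / 500_000      # quota / period from the experiment

# CPU seconds per task type, copied from the table.
results = {
    "without return": {"A": 7.96, "B": 12.36},
    "with patch 7/7": {"A": 10.71, "B": 9.79},
}


def summarize(times):
    """Return (group total in seconds, type A share in percent)."""
    total = times["A"] + times["B"]
    return round(total, 2), round(100 * times["A"] / total, 1)


print(f"budget at the cap: {RUN_SECONDS * CAP_CPUS:.1f}s")
for label, times in results.items():
    total, a_share = summarize(times)
    print(f"{label}: total={total}s, type A share={a_share}%")
```

The group total sits close to the 20s budget in both columns (20.32s and
20.5s); what changes is the split, with the single type A task going from
about 39.2% to about 52.2% of the group's CPU time.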
> 
> I request comments on these patches from the scheduler maintainers and others.
> 
> Regards,
> Bharata.