Message-ID: <20100216053946.GA3492@in.ibm.com>
Date: Tue, 16 Feb 2010 11:09:46 +0530
From: Bharata B Rao <bharata@...ux.vnet.ibm.com>
To: Paul Turner <pjt@...gle.com>
Cc: linux-kernel@...r.kernel.org, Paul Menage <menage@...gle.com>,
Srivatsa Vaddagiri <vatsa@...ibm.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Gautham R Shenoy <ego@...ibm.com>,
Dhaval Giani <dhaval.giani@...il.com>,
Balbir Singh <balbir@...ux.vnet.ibm.com>,
Herbert Poetzl <herbert@...hfloor.at>,
Chris Friesen <cfriesen@...tel.com>,
Avi Kivity <avi@...hat.com>, Nikhil Rao <ncrao@...gle.com>,
Ingo Molnar <mingo@...e.hu>,
Kamalesh Babulal <kamalesh@...ux.vnet.ibm.com>,
Mike Waychison <mikew@...gle.com>,
Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>,
Pavel Emelyanov <xemul@...nvz.org>
Subject: Re: [RFC PATCH v1 0/4] CFS Bandwidth Control
On Fri, Feb 12, 2010 at 06:54:52PM -0800, Paul Turner wrote:
>
> The skeleton of our approach is as follows:
> - As above, we maintain a global, per-tg pool of unassigned quota. On it
> we track the bandwidth period, the quota per period, and the runtime
> remaining in the current period. As bandwidth is used within a period it
> is decremented from runtime. Runtime is currently synchronized using a
> spinlock; there is no reason this couldn't be done with atomic ops instead,
> but the spinlock allows a little more flexibility for experimenting with
> other schemes.
> - When a cfs_rq participating in a bandwidth-constrained task_group executes,
> it acquires time from the global pool in chunks of
> sysctl_sched_cfs_bandwidth_slice (currently 10ms by default). This
> synchronizes under rq->lock and is part of the update_curr path.
> - Throttled entities are dequeued immediately (as opposed to delaying this
> operation to the put path); this avoids some potentially poor load-balancer
> interactions and preserves the 'verbiage' of the put_task semantic.
> Throttled entities are gated from participating in the tree at the
> {enqueue, dequeue}_entity level. They are also skipped for load
> balance in the same manner as Bharata's patch series employs.
In my series I deferred the dequeue until the next put because walking the se
hierarchy multiple times (from update_curr -> dequeue_entity -> update_curr)
appeared too complex when I started out.
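
Just to check my understanding of the pool and slice handling described above,
here is a rough user-space model of how I picture it. The struct and function
names (and the pthread mutex standing in for the pool spinlock and rq->lock)
are mine for illustration only, not taken from your patches:

/*
 * Toy model of the per-tg bandwidth pool: quota is refilled every period
 * and handed out to cfs_rqs in slice-sized chunks.
 */
#include <stdio.h>
#include <pthread.h>

#define NSEC_PER_USEC 1000ULL

struct cfs_bandwidth_pool {
	pthread_mutex_t lock;		/* stands in for the pool spinlock */
	unsigned long long period;	/* bandwidth period, ns */
	unsigned long long quota;	/* quota refilled each period, ns */
	unsigned long long runtime;	/* runtime left in current period, ns */
};

/* Refill the pool at the start of each bandwidth period. */
static void pool_refresh(struct cfs_bandwidth_pool *pool)
{
	pthread_mutex_lock(&pool->lock);
	pool->runtime = pool->quota;
	pthread_mutex_unlock(&pool->lock);
}

/*
 * A cfs_rq asks for up to one slice (e.g. 10ms) of runtime; in the real
 * patches this would happen under rq->lock from the update_curr path.
 * Returns the amount actually granted; 0 means "throttle until refresh".
 */
static unsigned long long pool_acquire_slice(struct cfs_bandwidth_pool *pool,
					     unsigned long long slice)
{
	unsigned long long granted;

	pthread_mutex_lock(&pool->lock);
	granted = pool->runtime < slice ? pool->runtime : slice;
	pool->runtime -= granted;
	pthread_mutex_unlock(&pool->lock);

	return granted;
}

int main(void)
{
	struct cfs_bandwidth_pool pool = {
		.lock   = PTHREAD_MUTEX_INITIALIZER,
		.period = 1000000 * NSEC_PER_USEC,	/* 1s */
		.quota  =  500000 * NSEC_PER_USEC,	/* 0.5s per period */
	};
	unsigned long long slice = 10000 * NSEC_PER_USEC;	/* 10ms */
	unsigned long long got;

	pool_refresh(&pool);
	got = pool_acquire_slice(&pool, slice);
	printf("granted %llu ns, %llu ns left in pool\n", got, pool.runtime);
	return 0;
}
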
>
> Interface:
> ----------
> Two new cgroupfs files are added to the cpu subsystem:
> - cpu.cfs_period_us : period over which bandwidth is to be regulated
> - cpu.cfs_quota_us : bandwidth available for consumption per period
>
> One important interface change that this introduces (versus the rate limits
> proposal) is that the defined bandwidth becomes an absolute quantity.
>
> e.g. a bandwidth of 5 seconds (cpu.cfs_quota_us=5000000) on a period of 1 second
> (cpu.cfs_period_us=1000000) would result in 5 wall seconds of cpu time being
> consumable every 1 wall second.
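
To make the semantics concrete for myself, this is how I would expect a
management tool to program the two files for the example above. The /cgroup
mount point and the group name "grp" are only assumptions for the
illustration:

/* Illustrative only: assumes the cpu controller is mounted at /cgroup and
 * that the group "grp" already exists. */
#include <stdio.h>

static int write_val(const char *path, long long val)
{
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		return -1;
	}
	fprintf(f, "%lld\n", val);
	return fclose(f);
}

int main(void)
{
	/* 1s period, 5s quota: the group may consume 5 cpu-seconds of
	 * bandwidth per wall second, i.e. up to 5 cpus' worth. */
	write_val("/cgroup/grp/cpu.cfs_period_us", 1000000);
	write_val("/cgroup/grp/cpu.cfs_quota_us", 5000000);
	return 0;
}
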
As I have said earlier, I would like to hear what others think about this
interface, especially the Linux-vserver project, since it is already using
the CFS hard limit patches in its test release. Herbert?
Thanks for your work. More later when I review the individual patches
in detail.
Regards,
Bharata.