Message-Id: <20110216031831.571628191@google.com>
Date: Tue, 15 Feb 2011 19:18:31 -0800
From: Paul Turner <pjt@...gle.com>
To: linux-kernel@...r.kernel.org
Cc: Bharata B Rao <bharata@...ux.vnet.ibm.com>,
Dhaval Giani <dhaval@...ux.vnet.ibm.com>,
Balbir Singh <balbir@...ux.vnet.ibm.com>,
Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>,
Gautham R Shenoy <ego@...ibm.com>,
Srivatsa Vaddagiri <vatsa@...ibm.com>,
Kamalesh Babulal <kamalesh@...ux.vnet.ibm.com>,
Ingo Molnar <mingo@...e.hu>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Pavel Emelyanov <xemul@...nvz.org>,
Herbert Poetzl <herbert@...hfloor.at>,
Avi Kivity <avi@...hat.com>,
Chris Friesen <cfriesen@...tel.com>
Subject: [CFS Bandwidth Control v4 0/7] Introduction
Hi all,
Please find attached v4 of CFS bandwidth control; while this rebase against
some of the latest SCHED_NORMAL code is new, the features and methodology are
fairly mature at this point and have proved both effective and stable for
several workloads.
As always, all comments/feedback welcome.
Changes since v3:
- Rebased to current tip, update to work with new group scheduling accounting
- (Bug fix) Fixed race with unthrottling (due to changing global limit)
- (Bug fix) Fixed buddy interactions -- in particular, prevent buddy
nominations from re-picking throttled entities
The skeleton of our approach is as follows:
- We maintain a global (per-tg) pool of unassigned quota. Within it
we track the bandwidth period, quota per period, and runtime remaining in
the current period. As bandwidth is used within a period it is decremented
from runtime. Runtime is currently synchronized using a spinlock. In the
current implementation there's no reason this couldn't be done using
atomic ops instead; however, the spinlock allows for a little more flexibility
in experimentation with other schemes.
- When a cfs_rq participating in a bandwidth constrained task_group executes,
it acquires time from the global pool in chunks of
sysctl_sched_cfs_bandwidth_slice (default currently 10ms). This is
synchronized under rq->lock and is part of the update_curr path.
- Throttled entities are dequeued. We protect against their re-introduction to
the scheduling hierarchy by checking a per-cfs_rq throttled bit.
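
To make the three points above concrete, here is a small userspace sketch of
the accounting. It is not code from the patches: every name in it (tg_pool,
local_rq, SLICE_US, pool_refresh, pool_take_slice, account_runtime,
may_enqueue) is hypothetical, times are in microseconds, and locking is
reduced to a comment marking where the pool's spinlock would be held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All names here are hypothetical; this only illustrates the accounting. */
#define SLICE_US 10000ULL        /* mirrors the 10ms default slice */

struct tg_pool {                 /* global (per-tg) pool */
    uint64_t period_us;          /* bandwidth period */
    uint64_t quota_us;           /* quota per period */
    uint64_t runtime_us;         /* runtime remaining in the current period */
};

struct local_rq {                /* stand-in for one cfs_rq */
    uint64_t runtime_us;         /* acquired from the pool, not yet consumed */
    bool throttled;              /* per-cfs_rq throttled bit */
};

/* Period boundary: the pool is refilled to the configured quota. */
static void pool_refresh(struct tg_pool *p)
{
    p->runtime_us = p->quota_us;
}

/* Hand out at most one slice. The pool's lock would be held here. */
static uint64_t pool_take_slice(struct tg_pool *p)
{
    uint64_t grant = p->runtime_us < SLICE_US ? p->runtime_us : SLICE_US;

    p->runtime_us -= grant;
    return grant;
}

/* Charge delta_us of execution to rq, pulling slices as they run out. */
static void account_runtime(struct tg_pool *p, struct local_rq *rq,
                            uint64_t delta_us)
{
    assert(p && rq);
    while (delta_us > 0) {
        if (rq->runtime_us == 0) {
            rq->runtime_us = pool_take_slice(p);
            if (rq->runtime_us == 0) {
                rq->throttled = true;   /* pool is empty: throttle */
                return;
            }
        }
        uint64_t used = delta_us < rq->runtime_us ? delta_us : rq->runtime_us;

        rq->runtime_us -= used;
        delta_us -= used;
    }
}

/* Enqueue-side guard: throttled entities must not be re-introduced. */
static bool may_enqueue(const struct local_rq *rq)
{
    return !rq->throttled;
}
```

With a 25ms quota, a first charge of 4ms pulls one 10ms slice (leaving 15ms in
the pool and 6ms held locally); a further 30ms charge drains both and sets the
throttled bit. Unthrottling at the next period boundary is left out of the
sketch.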
Interface:
----------
Three new cgroupfs files are exported by the cpu subsystem:
  cpu.cfs_period_us : period over which bandwidth is to be regulated
  cpu.cfs_quota_us  : bandwidth available for consumption per period
  cpu.stat          : statistics (such as number of throttled periods and
                      total throttled time)
One important interface change that this introduces (versus the rate limits
proposal) is that the defined bandwidth becomes an absolute quantifier.
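
As a rough illustration of what an absolute limit means in practice, the
sketch below works out the ceiling implied by a quota/period pair. The helper
names (cpu_percent, max_runtime_us) are made up for this example and are not
part of the interface; it assumes both values are in microseconds, as the
_us suffix on the file names suggests, and that consumption is capped at one
quota per whole period.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; both arguments are in microseconds. */

/* Ceiling expressed as a percentage of one CPU. */
static uint64_t cpu_percent(uint64_t quota_us, uint64_t period_us)
{
    assert(period_us > 0);
    return quota_us * 100 / period_us;
}

/* Upper bound on CPU time usable across the whole periods in window_us. */
static uint64_t max_runtime_us(uint64_t quota_us, uint64_t period_us,
                               uint64_t window_us)
{
    assert(period_us > 0);
    return (window_us / period_us) * quota_us;
}
```

For instance, a quota of 50000 with a period of 100000 gives
cpu_percent() == 50, and over a one second window max_runtime_us() is 500000,
regardless of what other groups are doing.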
Previous postings:
------------------
v3:
https://lkml.org/lkml/2010/10/12/44
v2:
http://lkml.org/lkml/2010/4/28/88
Original posting:
http://lkml.org/lkml/2010/2/12/393
Prior approaches:
http://lkml.org/lkml/2010/1/5/44 ("CFS Hard limits v5")
Thanks,
- Paul