Message-ID: <CAPM31RLtNnJqX00vTQ6PbUcGRhNufbvb1OdGkXs4i2+=4=A5eA@mail.gmail.com>
Date: Mon, 27 Jun 2011 18:42:11 -0700
From: Paul Turner <pjt@...gle.com>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: linux-kernel@...r.kernel.org,
Bharata B Rao <bharata@...ux.vnet.ibm.com>,
Dhaval Giani <dhaval.giani@...il.com>,
Balbir Singh <balbir@...ux.vnet.ibm.com>,
Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>,
Srivatsa Vaddagiri <vatsa@...ibm.com>,
Kamalesh Babulal <kamalesh@...ux.vnet.ibm.com>,
Hidetoshi Seto <seto.hidetoshi@...fujitsu.com>,
Ingo Molnar <mingo@...e.hu>, Pavel Emelyanov <xemul@...nvz.org>
Subject: Re: [patch 15/16] sched: return unused runtime on voluntary sleep
On Thu, Jun 23, 2011 at 8:26 AM, Peter Zijlstra <a.p.zijlstra@...llo.nl> wrote:
> On Tue, 2011-06-21 at 00:17 -0700, Paul Turner wrote:
>> plain text document attachment (sched-bwc-simple_return_quota.patch)
>> When a local cfs_rq blocks we return the majority of its remaining quota to the
>> global bandwidth pool for use by other runqueues.
>
> OK, I saw return_cfs_rq_runtime() do that.
>
>> We do this only when the quota is current and there is more than
>> min_cfs_rq_quota [1ms by default] of runtime remaining on the rq.
>
> sure..
>
>> In the case where there are throttled runqueues and we have sufficient
>> bandwidth to meter out a slice, a second timer is kicked off to handle this
>> delivery, unthrottling where appropriate.
>
> I'm having trouble there: what's the purpose of the timer? You could
> redistribute immediately. None of this is well explained.
>
Current reasons:
- There was concern about thrashing the unthrottle path when a task
rapidly oscillates between runnable states; using a timer inherently
limits this operation both in frequency and to a single cpu. I think
the move to a throttled list (as opposed to having to poll all cpus),
together with the fact that we only return quota in excess of
min_cfs_rq_quota (see the sketch after this list), probably mitigates
this to the point where we could do away with the timer and do it
directly in the put path.
- The aesthetics of releasing rq->lock in the put path. Quick
inspection suggests it should actually be safe to do at that point,
and we do something similar in idle_balance().
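
For concreteness, the return path under discussion is shaped roughly
like the sketch below. This is illustrative only (stub types, and the
function name is mine); the actual patch differs in detail:

#include <linux/spinlock.h>
#include <linux/time.h>		/* NSEC_PER_MSEC */
#include <linux/types.h>	/* u64, s64 */

/* stub types standing in for the real cfs bandwidth structures */
struct cfs_bandwidth_stub {
	raw_spinlock_t lock;
	u64 runtime;		/* unassigned global runtime */
	u64 runtime_expires;	/* expiration of the current period */
};

struct cfs_rq_stub {
	u64 runtime_remaining;	/* locally held runtime */
	u64 runtime_expires;	/* period the local runtime belongs to */
};

/* keep at least this much locally; 1ms by default per the patch */
static const u64 min_cfs_rq_quota = 1 * NSEC_PER_MSEC;

/* called when the cfs_rq's last task blocks (the "put" path) */
static void sketch_return_cfs_rq_runtime(struct cfs_bandwidth_stub *cfs_b,
					 struct cfs_rq_stub *cfs_rq)
{
	s64 slack = cfs_rq->runtime_remaining - min_cfs_rq_quota;

	if (slack <= 0)
		return;		/* nothing above the minimum to give back */

	raw_spin_lock(&cfs_b->lock);
	/* only return runtime that is still current for this period */
	if (cfs_b->runtime_expires == cfs_rq->runtime_expires)
		cfs_b->runtime += slack;
	raw_spin_unlock(&cfs_b->lock);

	/* retain min_cfs_rq_quota to damp rapid sleep/wake cycles */
	cfs_rq->runtime_remaining = min_cfs_rq_quota;
}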
On consideration, since the above two factors are not hard
requirements, this could be moved out of the timer and into the put
path directly (with the fact that we drop rq->lock strongly
commented). I have no strong preference between the two choices.
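
If it were moved, I'd expect the put path to look something like this
sketch (again illustrative; distribute_to_throttled_list() is a
made-up name for walking the throttled list, and rq_stub stands in for
the scheduler's struct rq):

struct rq_stub {
	raw_spinlock_t lock;
};

/* hypothetical helper: walk the throttled list, unthrottling while
 * runtime lasts; returns whatever runtime is left over */
static u64 distribute_to_throttled_list(struct cfs_bandwidth_stub *cfs_b,
					u64 runtime);

static void sketch_put_path_distribute(struct rq_stub *this_rq,
				       struct cfs_bandwidth_stub *cfs_b)
{
	u64 runtime;

	raw_spin_lock(&cfs_b->lock);
	runtime = cfs_b->runtime;	/* take what's available */
	cfs_b->runtime = 0;
	raw_spin_unlock(&cfs_b->lock);

	/*
	 * We drop this_rq->lock before touching other rqs, in the same
	 * spirit as idle_balance(): the rq can change underneath us and
	 * callers must tolerate that.  This is the part that would need
	 * a strong comment.
	 */
	raw_spin_unlock(&this_rq->lock);
	runtime = distribute_to_throttled_list(cfs_b, runtime);
	raw_spin_lock(&this_rq->lock);

	raw_spin_lock(&cfs_b->lock);
	cfs_b->runtime += runtime;	/* return the unused remainder */
	raw_spin_unlock(&cfs_b->lock);
}

The point being that the lock drop/retake brackets only the cross-cpu
distribution itself.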
Uninteresting additional historical reason:
The /original/ requirement for a timer here was that previous versions
placed some of the bandwidth distribution under cfs_b->lock. This
meant that we couldn't take rq->lock under cfs_b->lock (as the nesting
is the other way around). This is no longer a requirement
(advancement of expiration now provides what cfs_b->lock used to
provide here).
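
Spelling out the old nesting constraint with the stub types above
(illustrative, not code from any version of the patch):

/* fine: rq->lock nests outside cfs_b->lock */
static void nesting_ok(struct rq_stub *rq, struct cfs_bandwidth_stub *cfs_b)
{
	raw_spin_lock(&rq->lock);
	raw_spin_lock(&cfs_b->lock);
	/* ... charge or return runtime ... */
	raw_spin_unlock(&cfs_b->lock);
	raw_spin_unlock(&rq->lock);
}

/* the old distribution path would have needed the inverse, an
 * AB-BA deadlock against the above */
static void nesting_inverted(struct rq_stub *rq, struct cfs_bandwidth_stub *cfs_b)
{
	raw_spin_lock(&cfs_b->lock);
	raw_spin_lock(&rq->lock);	/* inverts the established order */
	raw_spin_unlock(&rq->lock);
	raw_spin_unlock(&cfs_b->lock);
}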
In short: a timer is used so that we don't have to release rq->lock
within the put path.