lists.openwall.net - Open Source and information security mailing list archives
Date: Mon, 27 Jun 2011 18:42:11 -0700
From: Paul Turner <pjt@...gle.com>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: linux-kernel@...r.kernel.org, Bharata B Rao <bharata@...ux.vnet.ibm.com>,
	Dhaval Giani <dhaval.giani@...il.com>, Balbir Singh <balbir@...ux.vnet.ibm.com>,
	Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>,
	Srivatsa Vaddagiri <vatsa@...ibm.com>, Kamalesh Babulal <kamalesh@...ux.vnet.ibm.com>,
	Hidetoshi Seto <seto.hidetoshi@...fujitsu.com>, Ingo Molnar <mingo@...e.hu>,
	Pavel Emelyanov <xemul@...nvz.org>
Subject: Re: [patch 15/16] sched: return unused runtime on voluntary sleep

On Thu, Jun 23, 2011 at 8:26 AM, Peter Zijlstra <a.p.zijlstra@...llo.nl> wrote:
> On Tue, 2011-06-21 at 00:17 -0700, Paul Turner wrote:
>> plain text document attachment (sched-bwc-simple_return_quota.patch)
>> When a local cfs_rq blocks we return the majority of its remaining quota to the
>> global bandwidth pool for use by other runqueues.
>
> OK, I saw return_cfs_rq_runtime() do that.
>
>> We do this only when the quota is current and there is more than
>> min_cfs_rq_quota [1ms by default] of runtime remaining on the rq.
>
> sure..
>
>> In the case where there are throttled runqueues and we have sufficient
>> bandwidth to meter out a slice, a second timer is kicked off to handle this
>> delivery, unthrottling where appropriate.
>
> I'm having trouble there, what's the purpose of the timer, you could
> redistribute immediately. None of this is well explained.
>

Current reasons:

- There was concern regarding thrashing the unthrottle path on a task that
  is rapidly oscillating between runnable states; by using a timer, this
  operation is inherently limited both in frequency and to a single cpu.

  I think the move to using a throttled list (as opposed to having to poll
  all cpus), as well as the fact that we only return quota in excess of
  min_cfs_rq_quota, probably mitigates this to the point where we could
  just do away with the timer and do it directly in the put path.
- The aesthetics of releasing rq->lock in the put path. Quick inspection
  suggests it should actually be safe to do at that point, and we do
  something similar for idle_balance().

Given that the above two factors are not hard requirements, this could be
moved out of a timer and into the put path directly (with the fact that we
drop rq->lock strongly commented). I have no strong preference between
either choice.

Uninteresting additional historical reason: the /original/ requirement for
a timer here was that previous versions placed some of the bandwidth
distribution under cfs_b->lock. This meant that we couldn't take rq->lock
under cfs_b->lock (as the nesting is the other way around), so a timer was
used to avoid having to release rq->lock within the put path. This is no
longer a requirement (advancement of expiration now provides what
cfs_b->lock used to here).
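To make the mechanism above concrete, here is a minimal userspace sketch of the slack-return decision: return everything above min_cfs_rq_quota to the global pool when the local quota is still current, and arm the slack timer rather than unthrottling inline. The struct layouts and the helper name (`return_cfs_rq_runtime_sketch`) are simplified stand-ins for illustration, not the actual kernel implementation.

```c
#include <assert.h>

#define NSEC_PER_MSEC	1000000ULL
#define MIN_CFS_RQ_QUOTA (1 * NSEC_PER_MSEC)	/* 1ms default floor */

struct cfs_bandwidth {
	unsigned long long runtime;		/* global pool, ns */
	unsigned long long runtime_expires;	/* current period's expiry stamp */
	int nr_throttled;			/* throttled runqueues waiting */
	int slack_timer_armed;			/* stand-in for the slack timer */
};

struct cfs_rq_sketch {
	unsigned long long runtime_remaining;	/* locally assigned quota, ns */
	unsigned long long runtime_expires;	/* expiry stamp of that quota */
	struct cfs_bandwidth *cfs_b;
};

/*
 * Called when a local cfs_rq blocks: give back everything above
 * min_cfs_rq_quota, but only if the local quota is current (its expiry
 * matches the bandwidth period's).  Returns the amount handed back.
 */
static unsigned long long
return_cfs_rq_runtime_sketch(struct cfs_rq_sketch *cfs_rq)
{
	struct cfs_bandwidth *cfs_b = cfs_rq->cfs_b;
	unsigned long long slack;

	/* stale quota from a previous period: nothing worth returning */
	if (cfs_rq->runtime_expires != cfs_b->runtime_expires)
		return 0;

	if (cfs_rq->runtime_remaining <= MIN_CFS_RQ_QUOTA)
		return 0;

	slack = cfs_rq->runtime_remaining - MIN_CFS_RQ_QUOTA;
	cfs_rq->runtime_remaining = MIN_CFS_RQ_QUOTA;
	cfs_b->runtime += slack;

	/*
	 * If runqueues are throttled and the pool now holds enough for a
	 * slice, kick the slack timer instead of unthrottling right here:
	 * the timer rate-limits the work and keeps it off the put path.
	 */
	if (cfs_b->nr_throttled && cfs_b->runtime >= MIN_CFS_RQ_QUOTA)
		cfs_b->slack_timer_armed = 1;

	return slack;
}
```

The min_cfs_rq_quota floor is what limits thrashing for a task oscillating between runnable states: a cfs_rq that just gave back its slack has nothing further to return until it is assigned fresh quota.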