Date:	Thu, 14 Oct 2010 02:27:02 -0700
From:	Paul Turner <pjt@...gle.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	bharata@...ux.vnet.ibm.com, linux-kernel@...r.kernel.org,
	Dhaval Giani <dhaval.giani@...il.com>,
	Balbir Singh <balbir@...ux.vnet.ibm.com>,
	Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>,
	Srivatsa Vaddagiri <vatsa@...ibm.com>,
	Kamalesh Babulal <kamalesh@...ux.vnet.ibm.com>,
	Ingo Molnar <mingo@...e.hu>,
	Pavel Emelyanov <xemul@...nvz.org>,
	Herbert Poetzl <herbert@...hfloor.at>,
	Avi Kivity <avi@...hat.com>,
	Chris Friesen <cfriesen@...tel.com>,
	Paul Menage <menage@...gle.com>,
	Mike Waychison <mikew@...gle.com>,
	Nikhil Rao <ncrao@...gle.com>
Subject: Re: [PATCH v3 2/7] sched: accumulate per-cfs_rq cpu usage

On Thu, Oct 14, 2010 at 2:19 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Tue, 2010-10-12 at 13:21 +0530, Bharata B Rao wrote:
>> +#ifdef CONFIG_CFS_BANDWIDTH
>> +       {
>> +               .procname       = "sched_cfs_bandwidth_slice_us",
>> +               .data           = &sysctl_sched_cfs_bandwidth_slice,
>> +               .maxlen         = sizeof(unsigned int),
>> +               .mode           = 0644,
>> +               .proc_handler   = proc_dointvec_minmax,
>> +               .extra1         = &one,
>> +       },
>> +#endif
>
> So this is basically your scalability knob.. the larger this value, the
> less frequently we have to access global state, but the less parallelism
> is possible, since fewer CPUs can deplete the total quota, leaving nothing
> for the others.
>

Exactly

> I guess one could go try and play load-balancer games to try and
> mitigate this by pulling this group's tasks to the CPU(s) that have more
> bandwidth for that group, but balancing that against the regular
> load-balancer goal of balancing load well will undoubtedly be
> 'interesting'...
>

I considered this approach as an alternative previously, but I don't
think it can be made to work effectively:

Since quota will likely expire in a staggered fashion, you're going to
get a funnel-herd effect as everything is crowded onto the CPUs with
remaining quota.

It's much more easily avoided by keeping the slice small enough
(relative to the bandwidth period) that we're not potentially
stranding a significant percentage of our quota.  The potential for
abuse could be eliminated/reduced here by making the slice size a
constant ratio relative to the period length.  This would also make
possible parallelism more deterministic.

I also think versioning the quota so that it can be potentially
returned and redistributed on sleep is more effective/efficient in
avoiding stranded quota.

>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
