linux-kernel - Re: [ISSUE] sched/cgroup: Does cpu-cgroup still works fine nowadays?

Message-ID: <20140515090638.GI30445@twins.programming.kicks-ass.net>
Date:	Thu, 15 May 2014 11:06:38 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Michael wang <wangyun@...ux.vnet.ibm.com>
Cc:	Rik van Riel <riel@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...nel.org>, Mike Galbraith <efault@....de>,
	Alex Shi <alex.shi@...aro.org>, Paul Turner <pjt@...gle.com>,
	Mel Gorman <mgorman@...e.de>,
	Daniel Lezcano <daniel.lezcano@...aro.org>
Subject: Re: [ISSUE] sched/cgroup: Does cpu-cgroup still works fine nowadays?

On Thu, May 15, 2014 at 04:46:28PM +0800, Michael wang wrote:
> On 05/15/2014 04:35 PM, Peter Zijlstra wrote:
> > On Thu, May 15, 2014 at 11:46:06AM +0800, Michael wang wrote:
> >> But for the dbench/stress combination, that's not spin-wasted; dbench
> >> throughput did drop, so how could we explain that one?
> > 
> > I've no clue what dbench does.. At this point you'll have to
> > expose/trace the per-task runtime accounting for these tasks and ideally
> > also the things the cgroup code does with them to see if it still makes
> > sense.
> 
> I see :)
> 
> BTW, one interesting thing we found during the dbench/stress testing
> is that, by doing:
> 
> 	echo 240000000 > /proc/sys/kernel/sched_latency_ns
>         echo NO_GENTLE_FAIR_SLEEPERS > /sys/kernel/debug/sched_features
> 
> that is, with sched_latency_ns increased around 10 times and
> GENTLE_FAIR_SLEEPERS disabled, dbench got its CPU back.
> 
> However, when the group level is too deep, that doesn't work any more...
> 
> I'm not sure, but it seems like 'deep group level' and 'vruntime bonus for
> sleeper' are the key points here. I will try to list the root cause after
> more investigation. Thanks for the hints and suggestions, really helpful ;-)

How deep is deep? You run into numerical problems quite quickly, esp.
when you've got lots of CPUs. We've only got 64 bits to play with; that
said, there were some patches...

What happens if you do the below? Google has been running with that, and
nobody was ever able to reproduce the report that got it disabled.



diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index b2cbe81308af..e40819d39c69 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -40,7 +40,7 @@ extern void update_cpu_load_active(struct rq *this_rq);
  * when BITS_PER_LONG <= 32 are pretty high and the returns do not justify the
  * increased costs.
  */
-#if 0 /* BITS_PER_LONG > 32 -- currently broken: it increases power usage under light load  */
+#if 1 /* BITS_PER_LONG > 32 -- currently broken: it increases power usage under light load  */
 # define SCHED_LOAD_RESOLUTION	10
 # define scale_load(w)		((w) << SCHED_LOAD_RESOLUTION)
 # define scale_load_down(w)	((w) >> SCHED_LOAD_RESOLUTION)
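
To make the resolution point concrete, here is a minimal sketch in C; it is
not kernel code. It assumes a simplified model in which a group's shares are
split evenly across CPUs by integer division. per_cpu_weight() is a
hypothetical helper, and the MIN_SHARES clamp value of 2 is an assumption
about this kernel version. The two helpers scale_load() and scale_load_down()
have the same shape as the macros in the diff above.

```c
#include <assert.h>
#include <stdint.h>

#define SCHED_LOAD_RESOLUTION 10
#define MIN_SHARES 2ULL /* assumed lower clamp for a group's per-CPU weight */

/* Same shape as the macros enabled by the diff. */
static uint64_t scale_load(uint64_t w)
{
	return w << SCHED_LOAD_RESOLUTION;
}

static uint64_t scale_load_down(uint64_t w)
{
	return w >> SCHED_LOAD_RESOLUTION;
}

/*
 * Hypothetical, simplified model: split a group's shares evenly over
 * ncpus per-CPU entities using integer division, then clamp from below.
 * The real share calculation weighs each CPU by its actual load.
 */
static uint64_t per_cpu_weight(uint64_t tg_shares, unsigned int ncpus)
{
	uint64_t w;

	assert(ncpus > 0);
	w = tg_shares / ncpus;
	return w < MIN_SHARES ? MIN_SHARES : w;
}
```

Under this model, on 64 CPUs, groups with shares 1024 and 1050 both end up
with a per-CPU weight of 16 at the low resolution, so the difference between
them is lost. After scale_load() they get 16384 and 16800 and stay
distinguishable, at the cost of 10 bits of headroom in the 64-bit arithmetic.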
