linux-kernel - Re: [ISSUE] sched/cgroup: Does cpu-cgroup still works fine nowadays?


Message-ID: <53748A5D.6070605@linux.vnet.ibm.com>
Date:	Thu, 15 May 2014 17:35:25 +0800
From:	Michael wang <wangyun@...ux.vnet.ibm.com>
To:	Peter Zijlstra <peterz@...radead.org>
CC:	Rik van Riel <riel@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...nel.org>, Mike Galbraith <efault@....de>,
	Alex Shi <alex.shi@...aro.org>, Paul Turner <pjt@...gle.com>,
	Mel Gorman <mgorman@...e.de>,
	Daniel Lezcano <daniel.lezcano@...aro.org>
Subject: Re: [ISSUE] sched/cgroup: Does cpu-cgroup still works fine nowadays?

On 05/15/2014 05:06 PM, Peter Zijlstra wrote:
[snip]
>> However, when the group level is too deep, that doesn't work any more...
>>
>> I'm not sure, but it seems like 'deep group level' and 'vruntime bonus for
>> sleeper' are the key points here. I will try to list the root cause after
>> more investigation. Thanks for the hints and suggestions, really helpful ;-)
> 
> How deep is deep? You run into numerical problems quite quickly, esp.
> when you've got lots of CPUs. We've only got 64bit to play with, that
> said there were some patches...

It's like:

	/cgroup/cpu/l1/l2/l3/l4/l5/l6/A

At around level 7, the issue can no longer be solved.

> 
> What happens if you do the below, Google has been running with that, and
> nobody was ever able to reproduce the report that got it disabled.
> 
> 
> 
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index b2cbe81308af..e40819d39c69 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -40,7 +40,7 @@ extern void update_cpu_load_active(struct rq *this_rq);
>   * when BITS_PER_LONG <= 32 are pretty high and the returns do not justify the
>   * increased costs.
>   */
> -#if 0 /* BITS_PER_LONG > 32 -- currently broken: it increases power usage under light load  */
> +#if 1 /* BITS_PER_LONG > 32 -- currently broken: it increases power usage under light load  */

That is trying to solve the load overflow issue, correct?

I'm not sure which value would become huge as the group hierarchy gets
deeper; the load accumulation is discounted as it is passed up, isn't it?

Anyway, I will give it a try and see what happens :)

Regards,
Michael Wang

>  # define SCHED_LOAD_RESOLUTION	10
>  # define scale_load(w)		((w) << SCHED_LOAD_RESOLUTION)
>  # define scale_load_down(w)	((w) >> SCHED_LOAD_RESOLUTION)
> 
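For illustration only, here is a toy sketch of why the extra
SCHED_LOAD_RESOLUTION bits might matter once groups are nested deeply. It is
not how the kernel computes group shares: the fixed 1/4 fraction per level and
the helper weight_at_depth() are assumptions made up for this example, and the
macros take the resolution as a parameter so both settings can be compared.
Only the shift-based shape of the macros and the nice-0 weight of 1024 come
from the kernel.

```c
/*
 * Same shape as scale_load()/scale_load_down() in the quoted hunk, but with
 * the resolution passed in as a parameter (an assumption for this sketch).
 */
#define SCALE_LOAD(w, res)      ((unsigned long long)(w) << (res))
#define SCALE_LOAD_DOWN(w, res) ((unsigned long long)(w) >> (res))

/*
 * Hypothetical toy model: every nesting level keeps 1/divisor of its
 * parent's weight, using truncating integer division as fixed-point
 * arithmetic would. The real share calculation is more involved.
 */
static unsigned long long weight_at_depth(unsigned long long weight,
					  unsigned int divisor, int depth)
{
	int i;

	for (i = 0; i < depth; i++)
		weight /= divisor;	/* low-order bits are lost here */

	return weight;
}
```

With a nice-0 weight of 1024 and a divisor of 4, the unscaled weight
(resolution 0) drops to 1 at depth 5 and truncates to 0 at depth 6, while the
weight scaled by 10 bits is still 64 at depth 7. Under these assumptions, the
higher resolution keeps a usable fraction for a hierarchy as deep as
/cgroup/cpu/l1/l2/l3/l4/l5/l6/A.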
