linux-kernel - Re: [PATCH v2] sched: let __sched_period() use rq's nr

Message-ID: <20150714092619.GC3956@byungchulpark-X58A-UD3R>
Date:	Tue, 14 Jul 2015 18:26:19 +0900
From:	Byungchul Park <byungchul.park@....com>
To:	mingo@...nel.org, peterz@...radead.org
Cc:	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] sched: let __sched_period() use rq's nr_running

On Fri, Jul 10, 2015 at 05:11:30PM +0900, byungchul.park@....com wrote:
> From: Byungchul Park <byungchul.park@....com>
> 
> __sched_period() returns a period which a rq can have. the period has to be
> stretched by the number of task *the rq has*, when nr_running > nr_latency.
> otherwise, task slice can be very smaller than sysctl_sched_min_granularity
> depending on the position of tg hierarchy when CONFIG_FAIR_GROUP_SCHED.

hello all,

sysctl_sched_min_granularity must be clearly defined first. once that is
settled, we can decide how the code should behave. the definition can be
either case 1 or case 2 below.

case 1. every task must get a slice of at least sysctl_sched_min_granularity,
which is currently 0.75ms. in this case, increasing the number of tasks in a
rq stretches the whole latency period, which most of you don't like because
the period can be stretched too much. but it looks normal to me, since it
already happens in the !CONFIG_FAIR_GROUP_SCHED world with a large number of
tasks. i wonder why the CONFIG_FAIR_GROUP_SCHED world must be different from
the !CONFIG_FAIR_GROUP_SCHED world? anyway...
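
to make case 1 concrete, here is a minimal userspace sketch of the stretching
rule, not the kernel code itself. model_sched_period() is a hypothetical name,
and the 6ms latency and the threshold of 8 tasks are assumed base values (the
real ones are sysctl tunables and may be scaled at boot); only the 0.75ms
granularity comes from the text above.

```c
#include <stdint.h>

/* assumed base values in nanoseconds; the real ones are tunables */
#define SCHED_LATENCY_NS    6000000ULL /* assumption: 6ms target latency */
#define MIN_GRANULARITY_NS   750000ULL /* 0.75ms, as stated above */
#define NR_LATENCY                8UL  /* assumption: latency / granularity */

/*
 * simplified model of __sched_period(): once more than NR_LATENCY tasks
 * are runnable, the period grows so that each equally weighted task
 * still gets MIN_GRANULARITY_NS.
 */
static uint64_t model_sched_period(unsigned long nr_running)
{
	if (nr_running > NR_LATENCY)
		return MIN_GRANULARITY_NS * nr_running;
	return SCHED_LATENCY_NS;
}
```

under these assumptions, 16 runnable tasks give a 12ms period, so each equally
weighted task still gets 0.75ms.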

case 2. a task can have a slice much smaller than sysctl_sched_min_granularity,
depending on its position in the hierarchy. if a rq has 8 equally weighted
sched entities, each of those entities has 8 equally weighted sched entities,
and this is repeated one more level down, then a task can have a very small
slice, e.g. 0.75ms / 64 ~ 0.01ms. if you add more levels to the cgroup
hierarchy, it gets worse. in this situation, context switching overhead
becomes very large. what does sysctl_sched_min_granularity mean here?
anyway...
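
the arithmetic of case 2 can be sketched as below. this is a simplified
userspace model, assuming equal weights at every level, plain integer division
instead of the kernel's fixed-point weight math, and assumed base values of 6ms
latency and a threshold of 8 entities. model_nested_slice() is a hypothetical
name, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define SCHED_LATENCY_NS    6000000ULL /* assumption: 6ms target latency */
#define MIN_GRANULARITY_NS   750000ULL /* 0.75ms, as stated above */
#define NR_LATENCY                8U   /* assumption: latency / granularity */

/*
 * model of the unpatched sched_slice() walk: the period is taken from the
 * leaf cfs_rq's own nr_running, then divided at each level of the hierarchy
 * by the number of equally weighted entities on that level.
 */
static uint64_t model_nested_slice(unsigned int levels, unsigned int nr_per_level)
{
	uint64_t slice;

	assert(nr_per_level > 0); /* avoid dividing by zero below */

	/* the period is stretched only by the leaf queue's own count */
	slice = nr_per_level > NR_LATENCY
		? MIN_GRANULARITY_NS * nr_per_level
		: SCHED_LATENCY_NS;

	for (unsigned int i = 0; i < levels; i++)
		slice /= nr_per_level;

	return slice;
}
```

with these assumptions, model_nested_slice(1, 8) is 750000ns (0.75ms), while
model_nested_slice(3, 8) is 11718ns, which is the ~0.01ms figure above.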

i am not sure which of case 1 and case 2 is the right definition of
sysctl_sched_min_granularity. what do you think?

thank you,
byungchul

> 
> Signed-off-by: Byungchul Park <byungchul.park@....com>
> ---
>  kernel/sched/fair.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 09456fc..8ae7aeb 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -635,7 +635,7 @@ static u64 __sched_period(unsigned long nr_running)
>   */
>  static u64 sched_slice(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  {
> -	u64 slice = __sched_period(cfs_rq->nr_running + !se->on_rq);
> +	u64 slice = __sched_period(rq_of(cfs_rq)->nr_running + !se->on_rq);
>  
>  	for_each_sched_entity(se) {
>  		struct load_weight *load;
> -- 
> 1.7.9.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/