linux-kernel - Re: [ANNOUNCE/RFC] Really Simple Really Fair Scheduler

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Pine.LNX.4.64.0709021106230.30970@tongli.jf.intel.com>
Date:	Sun, 2 Sep 2007 12:12:30 -0700 (PDT)
From:	Tong Li <tong.n.li@...el.com>
To:	Ingo Molnar <mingo@...e.hu>
cc:	Roman Zippel <zippel@...ux-m68k.org>, linux-kernel@...r.kernel.org,
	peterz@...radead.org, Mike Galbraith <efault@....de>
Subject: Re: [ANNOUNCE/RFC] Really Simple Really Fair Scheduler

I like this patch since it's really simple. CFS does provide a nice 
infrastructure to enable new algorithmic changes/extensions. My only 
concern was the O(log N) complexity under heavy load, but I'm willing to 
agree that it's OK in the common case. Some comments on the code:

> * Ingo Molnar <mingo@...e.hu> wrote:
>
> static void update_stats_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se)
> {
> -	s64 key;
> +	u64 key;
>
> 	/*
> 	 * Are we enqueueing a waiting task? (for current tasks
> @@ -442,26 +448,7 @@ static void update_stats_enqueue(struct
> 	/*
> 	 * Update the key:
> 	 */
> -	key = cfs_rq->fair_clock;
> -
> -	/*
> -	 * Optimize the common nice 0 case:
> -	 */
> -	if (likely(se->load.weight == NICE_0_LOAD)) {
> -		key -= se->wait_runtime;
> -	} else {
> -		u64 tmp;
> -
> -		if (se->wait_runtime < 0) {
> -			tmp = -se->wait_runtime;
> -			key += (tmp * se->load.inv_weight) >>
> -					(WMULT_SHIFT - NICE_0_SHIFT);
> -		} else {
> -			tmp = se->wait_runtime;
> -			key -= (tmp * se->load.inv_weight) >>
> -					(WMULT_SHIFT - NICE_0_SHIFT);
> -		}
> -	}
> +	key = se->exec_runtime;
>
> 	se->fair_key = key;
> }

Should we use the weighted fair clock exec_runtime as the key? This way 
tasks with larger weights will have their keys incremented more slowly and 
thus be given more CPU time. This is what other virtual-clock based fair 
scheduling algorithms commonly do.

> @@ -583,6 +570,20 @@ static void __enqueue_sleeper(struct cfs
> 	cfs_rq->sleeper_bonus += delta_fair;
> }
>
> +/*
> + * Newly woken tasks are put into the "middle" of all runnable
> + * task's current runtime:
> + */
> +static u64 avg_exec_runtime(struct cfs_rq *cfs_rq)
> +{
> +	u64 avg_exec_runtime = cfs_rq->exec_runtime;
> +
> +	if (cfs_rq->nr_running)
> +		do_div(avg_exec_runtime, cfs_rq->nr_running);
> +
> +	return avg_exec_runtime;
> +}
> +
> static void enqueue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se)
> {
> 	struct task_struct *tsk = task_of(se);
> @@ -640,8 +641,10 @@ enqueue_entity(struct cfs_rq *cfs_rq, st
> 	 */
> 	update_curr(cfs_rq);
>
> -	if (wakeup)
> +	if (wakeup) {
> +		se->exec_runtime = avg_exec_runtime(cfs_rq);
> 		enqueue_sleeper(cfs_rq, se);
> +	}
>
> 	update_stats_enqueue(cfs_rq, se);
> 	__enqueue_entity(cfs_rq, se);
> @@ -1126,6 +1129,7 @@ static void task_new_fair(struct rq *rq,
> 		schedstat_add(cfs_rq, wait_runtime, se->wait_runtime);
> 	}
>
> +	se->exec_runtime = avg_exec_runtime(cfs_rq);
> 	__enqueue_entity(cfs_rq, se);
> }

What's the intuition behind avg_exec_runtime? I thought the original CFS 
approach, i.e., setting a newly arriving task's key to be the current fair 
clock, adjusted by wait_runtime, was good. It matches other fair queuing 
algorithms and thus has provably good properties.

   tong
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/