linux-kernel - Re: [PATCH] sched: fix unfairness when upgrade weight

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <1235550053.4645.3035.camel@laptop>
Date:	Wed, 25 Feb 2009 09:20:53 +0100
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	miaox@...fujitsu.com
Cc:	Ingo Molnar <mingo@...e.hu>,
	Linux-Kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched: fix unfairness when upgrade weight

On Wed, 2009-02-25 at 15:32 +0800, Miao Xie wrote:

> This patch fixes this bug by tuning the vruntime of weight-upgraded
> sched entities, just like waking up a task. the new vruntime will be
>     cfs_rq->min_vruntime + sched_vslice();

I really don't like that.

Better would be to scale with min_vruntime, which would at least
approximate the lag somewhat.

Best is to compute the actual lag, but that might just not be worth the
extra overhead.

http://programming.kicks-ass.net/kernel-patches/sched-avg_vruntime/


> ---
>  kernel/sched.c      |   16 +++++++++-------
>  kernel/sched_fair.c |    9 +++++++++
>  2 files changed, 18 insertions(+), 7 deletions(-)
> 
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 410eec4..26e6d33 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -5096,12 +5096,8 @@ void set_user_nice(struct task_struct *p, long nice)
>  
>  	if (on_rq) {
>  		enqueue_task(rq, p, 0);
> -		/*
> -		 * If the task increased its priority or is running and
> -		 * lowered its priority, then reschedule its CPU:
> -		 */
> -		if (delta < 0 || (delta > 0 && task_running(rq, p)))
> -			resched_task(rq->curr);
> +		p->sched_class->prio_changed(rq, p, old_prio,
> +						task_running(rq, p));
>  	}
>  out_unlock:
>  	task_rq_unlock(rq, &flags);
> @@ -8929,16 +8925,22 @@ static void __set_se_shares(struct sched_entity *se, unsigned long shares)
>  {
>  	struct cfs_rq *cfs_rq = se->cfs_rq;
>  	int on_rq;
> +	unsigned long old_weight;
>  
>  	on_rq = se->on_rq;
>  	if (on_rq)
>  		dequeue_entity(cfs_rq, se, 0);
>  
> +	old_weight = se->load.weight;
>  	se->load.weight = shares;
>  	se->load.inv_weight = 0;
>  
> -	if (on_rq)
> +	if (on_rq) {
> +		if (se->load.weight > old_weight)
> +			se->vruntime = cfs_rq->min_vruntime
> +						+ sched_vslice(cfs_rq, se);
>  		enqueue_entity(cfs_rq, se, 0);
> +	}
>  }
>  
>  static void set_se_shares(struct sched_entity *se, unsigned long shares)
> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> index 0566f2a..34d4d11 100644
> --- a/kernel/sched_fair.c
> +++ b/kernel/sched_fair.c
> @@ -1690,6 +1690,15 @@ static void task_new_fair(struct rq *rq, struct task_struct *p)
>  static void prio_changed_fair(struct rq *rq, struct task_struct *p,
>  			      int oldprio, int running)
>  {
> +	struct cfs_rq *cfs_rq = task_cfs_rq(p);
> +	struct sched_entity *se = &p->se;
> +	int on_rq = se->on_rq;
> +
> +	if (p->prio < oldprio && on_rq) {
> +		dequeue_entity(cfs_rq, se, 0);
> +		se->vruntime = cfs_rq->min_vruntime + sched_vslice(cfs_rq, se);
> +		enqueue_entity(cfs_rq, se, 0);
> +	}

we very likely just enqueued the thing, and now we dequeue/enqueue
again.. not very nice.

>  	/*
>  	 * Reschedule if we are currently running on this runqueue and
>  	 * our priority decreased, or if we are not currently running on

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/