Message-ID: <CAPM31RJWnbTVmi89DO_P3k-b695kxVrNTUPSbqEFh5yaMFzu1A@mail.gmail.com>
Date:	Mon, 24 Jun 2013 03:54:26 -0700
From:	Paul Turner <pjt@...gle.com>
To:	Alex Shi <alex.shi@...el.com>
Cc:	Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Arjan van de Ven <arjan@...ux.intel.com>,
	Borislav Petkov <bp@...en8.de>,
	Namhyung Kim <namhyung@...nel.org>,
	Mike Galbraith <efault@....de>,
	Morten Rasmussen <morten.rasmussen@....com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	gregkh@...uxfoundation.org,
	Preeti U Murthy <preeti@...ux.vnet.ibm.com>,
	Viresh Kumar <viresh.kumar@...aro.org>,
	LKML <linux-kernel@...r.kernel.org>, len.brown@...el.com,
	rafael.j.wysocki@...el.com, jkosina@...e.cz,
	Clark Williams <clark.williams@...il.com>, tony.luck@...el.com,
	keescook@...omium.org, Mel Gorman <mgorman@...e.de>,
	Rik van Riel <riel@...hat.com>
Subject: Re: [Resend patch v8 06/13] sched: compute runnable load avg in
 cpu_load and cpu_avg_load_per_task

On Mon, Jun 24, 2013 at 2:06 AM, Alex Shi <alex.shi@...el.com> wrote:
> On 06/20/2013 10:18 AM, Alex Shi wrote:
>> They are the base values in load balance, update them with rq runnable
>> load average, then the load balance will consider runnable load avg
>> naturally.
>>
>> We also tried to include blocked_load_avg as cpu load in balancing,
>> but that causes a 6% kbuild performance drop on every Intel machine, and
>> aim7/oltp drops on some 4-socket machines.
>> Adding blocked_load_avg only into get_rq_runnable_load still makes
>> hackbench drop a little on NHM EX.
>>
>> Signed-off-by: Alex Shi <alex.shi@...el.com>
>> Reviewed-by: Gu Zheng <guz.fnst@...fujitsu.com>
>
>
> I am sorry for still wavering on how to account for cfs and rt task load.
> So here is an extra RFC patch that considers RT load in balancing.
> With or without this patch, my test results do not change, since there are
> not many RT tasks in my testing.
>
> I am not familiar with the RT scheduler and just rely on PeterZ, who is an expert on this. :)
>
> ---
>
> From b9ed5363b0a579a87256b589278c8c66500c7db3 Mon Sep 17 00:00:00 2001
> From: Alex Shi <alex.shi@...el.com>
> Date: Mon, 24 Jun 2013 16:12:29 +0800
> Subject: [PATCH 08/16] sched: recover to whole rq load include rt tasks'
>
> The patch 'sched: compute runnable load avg in cpu_load and
> cpu_avg_load_per_task' weights the rq's load on cfs.runnable_load_avg instead
> of rq->load.weight. That is fine when the system has little RT load.
>
> But if there is a lot of RT load on the rq, that code will only
> weigh the cfs tasks in load balance without considering RT, which
> may cause load imbalance if heavy RT load isn't balanced among cpus.
> Using rq->avg.load_avg_contrib can resolve this problem and keep the
> advantages of runnable load balance.

I think this patch confuses what "load_avg_contrib" is.

It's the rate-limited (runnable_load_avg + blocked_load_avg[*]) value
that we've currently accumulated into the task_group for the
observation of an individual cpu's runnable+blocked load.
[*] Supposing you're appending this to the end of your current series
you in fact have it as just: cfs_rq->runnable_load_avg
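To make "rate-limited" concrete, here is a minimal user-space sketch, not the kernel code. The names toy_cfs_rq, toy_publish_contrib and toy_staleness are hypothetical, and the one-eighth threshold is an assumption chosen for illustration. It only shows why a value that is published on significant change can lag the continuously tracked average.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, not a kernel symbol. */
struct toy_cfs_rq {
	unsigned long runnable_load_avg;  /* tracked continuously */
	unsigned long published_contrib;  /* value last pushed to the group */
};

/* Absolute gap between the live average and the published value. */
static unsigned long toy_staleness(const struct toy_cfs_rq *cfs)
{
	assert(cfs != NULL);
	return cfs->runnable_load_avg > cfs->published_contrib ?
	       cfs->runnable_load_avg - cfs->published_contrib :
	       cfs->published_contrib - cfs->runnable_load_avg;
}

/*
 * Publish the live average only if it drifted by more than 1/8 of the
 * last published value (assumed threshold), or if forced.
 * Returns 1 when the published value was refreshed, 0 otherwise.
 */
static int toy_publish_contrib(struct toy_cfs_rq *cfs, int force)
{
	if (force || toy_staleness(cfs) > cfs->published_contrib / 8) {
		cfs->published_contrib = cfs->runnable_load_avg;
		return 1;
	}
	return 0;
}
```

In this model a small change in runnable_load_avg leaves published_contrib untouched, so a reader of the published value sees older data than a reader of the live average.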

This patch will do nothing for RT load.  It's mostly a no-op which is why
you measured no change.
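As a rough illustration of why RT load stays invisible, consider the toy model below. It is a sketch under stated assumptions, not kernel code: toy_rq, toy_enqueue and toy_cfs_derived_load are hypothetical names, and the only property modelled is that the load figure is accumulated from CFS tasks alone.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not kernel structures. */
enum toy_class { TOY_CFS, TOY_RT };

struct toy_rq {
	unsigned long nr_running;        /* tasks of every class */
	unsigned long cfs_runnable_load; /* accumulated from CFS tasks only */
};

/* Enqueue a task; only CFS tasks feed the CFS load figure. */
static void toy_enqueue(struct toy_rq *rq, enum toy_class cls,
			unsigned long weight)
{
	assert(rq != NULL);
	rq->nr_running++;
	if (cls == TOY_CFS)
		rq->cfs_runnable_load += weight;
}

/* Any metric derived purely from CFS accounting, whatever its name. */
static unsigned long toy_cfs_derived_load(const struct toy_rq *rq)
{
	return rq->cfs_runnable_load;
}
```

Enqueueing an RT task in this model raises nr_running but leaves toy_cfs_derived_load() unchanged, so swapping one CFS-derived value for another cannot make the balancer see RT load.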

> BTW, this patch may increase the number of failed balance attempts if move_tasks
> cannot balance loads between CPUs, e.g. when there is only RT load on the CPUs.
>
> Signed-off-by: Alex Shi <alex.shi@...el.com>
> ---
>  kernel/sched/fair.c | 4 ++--
>  kernel/sched/proc.c | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 37a5720..6979906 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2968,7 +2968,7 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
>  /* Used instead of source_load when we know the type == 0 */
>  static unsigned long weighted_cpuload(const int cpu)
>  {
> -       return cpu_rq(cpu)->cfs.runnable_load_avg;
> +       return cpu_rq(cpu)->avg.load_avg_contrib;

This is a bad idea.  Neither value is really what's intended by
"type==0", but load_avg_contrib is even more stale.

>  }
>
>  /*
> @@ -3013,7 +3013,7 @@ static unsigned long cpu_avg_load_per_task(int cpu)
>  {
>         struct rq *rq = cpu_rq(cpu);
>         unsigned long nr_running = ACCESS_ONCE(rq->nr_running);
> -       unsigned long load_avg = rq->cfs.runnable_load_avg;
> +       unsigned long load_avg = rq->avg.load_avg_contrib;
>
>         if (nr_running)
>                 return load_avg / nr_running;
> diff --git a/kernel/sched/proc.c b/kernel/sched/proc.c
> index ce5cd48..4f2490c 100644
> --- a/kernel/sched/proc.c
> +++ b/kernel/sched/proc.c
> @@ -504,7 +504,7 @@ static void __update_cpu_load(struct rq *this_rq, unsigned long this_load,
>  #ifdef CONFIG_SMP
>  unsigned long get_rq_runnable_load(struct rq *rq)
>  {
> -       return rq->cfs.runnable_load_avg;
> +       return rq->avg.load_avg_contrib;
>  }
>  #else
>  unsigned long get_rq_runnable_load(struct rq *rq)
> --
> 1.7.12
>
>