Message-ID: <CAPM31RJ_0xgymfN+5FzQCSrv-qgpT+42EOmWke1rOGy7GfHcYg@mail.gmail.com>
Date: Mon, 17 Jun 2013 05:20:46 -0700
From: Paul Turner <pjt@...gle.com>
To: Alex Shi <alex.shi@...el.com>
Cc: Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Andrew Morton <akpm@...ux-foundation.org>,
Borislav Petkov <bp@...en8.de>,
Namhyung Kim <namhyung@...nel.org>,
Mike Galbraith <efault@....de>,
Morten Rasmussen <morten.rasmussen@....com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Preeti U Murthy <preeti@...ux.vnet.ibm.com>,
Viresh Kumar <viresh.kumar@...aro.org>,
LKML <linux-kernel@...r.kernel.org>,
Mel Gorman <mgorman@...e.de>, Rik van Riel <riel@...hat.com>,
Michael Wang <wangyun@...ux.vnet.ibm.com>,
Jason Low <jason.low2@...com>,
Changlong Xie <changlongx.xie@...el.com>, sgruszka@...hat.com,
Frédéric Weisbecker <fweisbec@...il.com>
Subject: Re: [patch v8 9/9] sched/tg: remove blocked_load_avg in balance
On Fri, Jun 7, 2013 at 12:20 AM, Alex Shi <alex.shi@...el.com> wrote:
> blocked_load_avg is sometimes very heavy and far bigger than the
> runnable load avg, which makes the balancer take wrong decisions. So
> remove it.
Ok so this is going to have terrible effects on the correctness of
shares distribution; I'm fairly opposed to it in its present form.
So let's see what could be happening...
In "sched: compute runnable load avg in cpu_load and
cpu_avg_load_per_task" you already update the load average weights
solely based on current runnable load. While this is generally poor
for stability (and I suspect the benefit is coming largely from
weighted_cpuload() where you do want to use runnable_load_avg and not
get_rq_runnable_load() where I suspect including blocked_load_avg() is
correct in the longer term).
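For reference, this is roughly what both helpers reduce to after that
patch (sketching from memory of the series, so details may be off):

  static unsigned long weighted_cpuload(const int cpu)
  {
          /* runnable average only; blocked load no longer included */
          return cpu_rq(cpu)->cfs.runnable_load_avg;
  }

  static unsigned long get_rq_runnable_load(struct rq *rq)
  {
          /* same source now; this is the one where keeping
           * blocked_load_avg folded in is arguably the right
           * long-term answer */
          return rq->cfs.runnable_load_avg;
  }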
Ah, so... I have an inkling:

Inside weighted_cpuload(), where you're trying to use only
runnable_load_avg, you are in fact still including blocked_load_avg
for a cgroup, since in the cgroup case a group entity's contribution
is a function of both runnable and blocked load.
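To make that concrete, the group entity's contribution is computed
(roughly; this is a sketch of the __update_group_entity_contrib()
math, not the literal code) as the tg's shares scaled by this
cfs_rq's portion of the total group load, and that load includes the
blocked part:

  /* sketch: cfs_rq->tg_load_contrib = runnable + blocked */
  contrib = cfs_rq->tg_load_contrib * tg->shares;
  se->avg.load_avg_contrib = div_u64(contrib,
                                     atomic64_read(&tg->load_avg) + 1);

So when a parent cfs_rq sums its runnable entities, a child group's
blocked load still leaks into runnable_load_avg one level up.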
Having weighted_cpuload() pull rq->load (possibly moderated by
rq->avg) would reasonably avoid this, since issued shares are
calculated using instantaneous weights, without breaking the actual
model for how much load overall we believe the group has.
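Something like the below is what I have in mind (just a sketch of the
idea; whether and how to moderate by rq->avg is an open question):

  static unsigned long weighted_cpuload(const int cpu)
  {
          /* instantaneous weight, matching how shares are issued;
           * could additionally be scaled by the rq's runnable
           * fraction from rq->avg if the raw value proves too
           * twitchy */
          return cpu_rq(cpu)->load.weight;
  }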
>
> Changlong tested this patch and found that the LTP cgroup stress test
> gets better performance: https://lkml.org/lkml/2013/5/23/65
> ---
> 3.10-rc1 patch1-7 patch1-8
> duration=764 duration=754 duration=750
> duration=764 duration=754 duration=751
> duration=763 duration=755 duration=751
>
> duration is the elapsed time of a test run, in seconds.
> ---
>
> And Jason also tested this patchset on his 8 sockets machine:
> https://lkml.org/lkml/2013/5/29/673
> ---
> When using a 3.10-rc2 tip kernel with patches 1-8, there was about a 40%
> improvement in performance of the workload compared to when using the
> vanilla 3.10-rc2 tip kernel with no patches. When using a 3.10-rc2 tip
> kernel with just patches 1-7, the performance improvement of the
> workload over the vanilla 3.10-rc2 tip kernel was about 25%.
> ---
>
> Signed-off-by: Alex Shi <alex.shi@...el.com>
> Tested-by: Changlong Xie <changlongx.xie@...el.com>
> Tested-by: Jason Low <jason.low2@...com>
> ---
> kernel/sched/fair.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 3aa1dc0..985d47e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -1358,7 +1358,7 @@ static inline void __update_cfs_rq_tg_load_contrib(struct cfs_rq *cfs_rq,
> struct task_group *tg = cfs_rq->tg;
> s64 tg_contrib;
>
> - tg_contrib = cfs_rq->runnable_load_avg + cfs_rq->blocked_load_avg;
> + tg_contrib = cfs_rq->runnable_load_avg;
> tg_contrib -= cfs_rq->tg_load_contrib;
>
> if (force_update || abs64(tg_contrib) > cfs_rq->tg_load_contrib / 8) {
> --
> 1.7.12
>