Message-ID: <20180910084237.GC48257@gmail.com>
Date: Mon, 10 Sep 2018 10:42:37 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>,
Mel Gorman <mgorman@...hsingularity.net>,
Rik van Riel <riel@...riel.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH 1/6] sched/numa: Stop multiple tasks from moving to the
cpu at the same time
* Srikar Dronamraju <srikar@...ux.vnet.ibm.com> wrote:
> Task migration under numa balancing can happen in parallel. More than
> one task might choose to migrate to the same cpu at the same time. This
> can result in
> - During task swap, choosing a task that was not part of the evaluation.
> - During task swap, task which just got moved into its preferred node,
> moving to a completely different node.
> - During task swap, task failing to move to the preferred node, will have
> to wait an extra interval for the next migrate opportunity.
> - During task movement, multiple task movements can cause load imbalance.
Please capitalize both 'CPU' and 'NUMA' in changelogs and code comments.
> This problem is more likely if there are more cores per node or more
> nodes in the system.
>
> Use a per run-queue variable to check if numa-balance is active on the
> run-queue.
>
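For readers following along, here is a minimal userspace sketch of the pattern being
described: a per-run-queue flag claimed with an atomic exchange, so that only one
NUMA-balance migration can target a given CPU at a time. The field and helper names
below (numa_migrate_on, task_numa_claim_cpu, ...) are illustrative only and need not
match what the patch itself adds:

	/*
	 * Illustrative sketch of a per-run-queue guard for NUMA-balance
	 * migrations. All names here are made up for the example.
	 */
	#include <stdatomic.h>
	#include <stdbool.h>
	#include <stdio.h>

	struct rq {
		atomic_int numa_migrate_on;	/* 0: free, 1: migration in flight */
	};

	static struct rq runqueues[4];		/* pretend we have 4 CPUs */

	/* Try to claim @cpu as a NUMA-balance destination; at most one caller wins. */
	static bool task_numa_claim_cpu(int cpu)
	{
		return atomic_exchange(&runqueues[cpu].numa_migrate_on, 1) == 0;
	}

	/* Release the claim once the task swap/move has been queued. */
	static void task_numa_release_cpu(int cpu)
	{
		atomic_store(&runqueues[cpu].numa_migrate_on, 0);
	}

	int main(void)
	{
		/* Two "tasks" racing for the same destination CPU. */
		printf("task A claims CPU1: %s\n", task_numa_claim_cpu(1) ? "ok" : "busy");
		printf("task B claims CPU1: %s\n", task_numa_claim_cpu(1) ? "ok" : "busy");

		task_numa_release_cpu(1);
		printf("task B retries CPU1: %s\n", task_numa_claim_cpu(1) ? "ok" : "busy");
		return 0;
	}

A loser of the exchange simply skips that CPU for this balancing pass instead of
piling a second migration onto it.
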
> specjbb2005 / bops/JVM / higher bops are better
> on 2 Socket/2 Node Intel
> JVMS Prev Current %Change
> 4 199709 206350 3.32534
> 1 330830 319963 -3.28477
>
>
> on 2 Socket/4 Node Power8 (PowerNV)
> JVMS Prev Current %Change
> 8 89011.9 89627.8 0.69193
> 1 218946 211338 -3.47483
>
>
> on 2 Socket/2 Node Power9 (PowerNV)
> JVMS Prev Current %Change
> 4 180473 186539 3.36117
> 1 212805 220344 3.54268
>
>
> on 4 Socket/4 Node Power7
> JVMS Prev Current %Change
> 8 56941.8 56836 -0.185804
> 1 111686 112970 1.14965
>
>
> dbench / transactions / higher numbers are better
> on 2 Socket/2 Node Intel
> count Min Max Avg Variance %Change
> 5 12029.8 12124.6 12060.9 34.0076
> 5 13136.1 13170.2 13150.2 14.7482 9.03166
>
>
> on 2 Socket/4 Node Power8 (PowerNV)
> count Min Max Avg Variance %Change
> 5 4968.51 5006.62 4981.31 13.4151
> 5 4319.79 4998.19 4836.53 261.109 -2.90646
>
>
> on 2 Socket/2 Node Power9 (PowerNV)
> count Min Max Avg Variance %Change
> 5 9342.92 9381.44 9363.92 12.8587
> 5 9325.56 9402.7 9362.49 25.9638 -0.0152714
>
>
> on 4 Socket/4 Node Power7
> count Min Max Avg Variance %Change
> 5 143.4 188.892 170.225 16.9929
> 5 132.581 191.072 170.554 21.6444 0.193274
I have applied this patch, but a benchmark dump with zero commentary is annoying, as the numbers
do not show an unconditional advantage - there are some performance increases and some regressions.
In particular this:
> dbench / transactions / higher numbers are better
> on 2 Socket/4 Node Power8 (PowerNV)
> count Min Max Avg Variance %Change
> 5 4968.51 5006.62 4981.31 13.4151
> 5 4319.79 4998.19 4836.53 261.109 -2.90646
is concerning: not only did we lose some performance, the variance also went up by a *lot*. Is this
just a measurement fluke? We cannot know, and you didn't comment.
Thanks,
Ingo