Message-ID: <20200320152251.GC3818@techsingularity.net>
Date: Fri, 20 Mar 2020 15:22:51 +0000
From: Mel Gorman <mgorman@...hsingularity.net>
To: Jirka Hladky <jhladky@...hat.com>
Cc: Phil Auld <pauld@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Juri Lelli <juri.lelli@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>,
Valentin Schneider <valentin.schneider@....com>,
Hillf Danton <hdanton@...a.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 00/13] Reconcile NUMA balancing decisions with the load
balancer v6
On Fri, Mar 20, 2020 at 03:37:44PM +0100, Jirka Hladky wrote:
> Hi Mel,
>
> just a quick update. I have increased the testing coverage, and other
> tests from the NAS suite show a big performance drop for low thread
> counts as well:
>
> sp_C_x - still shows the biggest drop, up to 50%
> bt_C_x - performance drop of up to 40%
> ua_C_x - performance drop of up to 30%
>
MPI or OMP, and what is a low thread count? For MPI at least, I saw a
0.4% gain on a 4-node machine for bt_C and a 3.88% regression on 8
nodes. I think it must be OMP you are using, because I found I had to
disable UA for MPI at some point in the past for reasons I no longer
remember.
> My point is that the performance drop for low thread counts is more
> common than we initially thought.
>
> Let me know if you need more data.
>
I just need clarification on the thread count and confirmation that
it's OMP. For MPI, I did note that some of the other NAS kernels showed
a slight dip, but it was nowhere near as severe as SP, and the problem
was the same as before -- two or more tasks stayed on the same node
without spreading out because there was no pressure to do so. There was
enough CPU and memory capacity, and no obvious pattern that could be
used to spread the load wide early. One possibility would be to always
spread wide at clone time and assume wake_affine will pull related
tasks together, but that is fragile because it breaks if the cloned
task execs and then allocates memory from a remote node, only to be
migrated to a local node immediately afterwards.
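To make the trade-off concrete, here is a minimal user-space sketch of
the two placement policies. This is not real scheduler code -- the node
count, the per-node load values and the helper names are all invented
for illustration:

/*
 * Illustrative only: compare "stay on the parent's node" against
 * "spread to the idlest node" at clone time. Loads are made up.
 */
#include <stdio.h>

#define NR_NODES 4

static int nr_running[NR_NODES] = { 3, 1, 0, 2 };  /* fake per-node load */

/* "Spread wide" policy: pick the idlest node for a freshly cloned task. */
static int select_node_spread(void)
{
	int node, idlest = 0;

	for (node = 1; node < NR_NODES; node++)
		if (nr_running[node] < nr_running[idlest])
			idlest = node;
	return idlest;
}

/* Behaviour seen in the NAS runs above: the child stays local. */
static int select_node_local(int parent_node)
{
	return parent_node;
}

int main(void)
{
	int parent_node = 0;
	int spread_node = select_node_spread();

	printf("local placement:  node %d (load %d)\n",
	       select_node_local(parent_node), nr_running[parent_node]);
	printf("spread placement: node %d (load %d)\n",
	       spread_node, nr_running[spread_node]);
	/*
	 * The failure mode described above: if the child execs and faults
	 * in its memory on the remote "spread" node, but wake_affine later
	 * pulls it back to the parent's node, it starts life with
	 * all-remote memory.
	 */
	return 0;
}

The sketch shows why spreading unconditionally is not an obvious win:
the placement decision is made before the task's memory footprint
exists, so the scheduler is guessing.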
--
Mel Gorman
SUSE Labs