Message-ID: <0d67b3b7-cf2f-61f3-c67a-76e85e05a3ee@amd.com>
Date: Fri, 3 Sep 2021 10:13:59 +0530
From: Bharata B Rao <bharata@....com>
To: linux-mm@...ck.org, linux-kernel@...r.kernel.org
Cc: akpm@...ux-foundation.org, kamezawa.hiroyu@...fujitsu.com,
mgorman@...e.de, Krupa.Ramakrishnan@....com,
Sadagopan.Srinivasan@....com
Subject: Re: [FIX PATCH 2/2] mm/page_alloc: Use accumulated load when building
node fallback list
On 8/30/2021 5:46 PM, Bharata B Rao wrote:
> From: Krupa Ramakrishnan <krupa.ramakrishnan@....com>
>
> In build_zonelists(), when the fallback list is built for the nodes,
> the node load gets reinitialized during each iteration. This results
> in nodes with same distances occupying the same slot in different
> node fallback lists rather than appearing in the intended round-
> robin manner. This results in one node getting picked for allocation
> more compared to other nodes with the same distance.
>
> As an example, consider a 4 node system with the following distance
> matrix.
>
> Node 0 1 2 3
> ----------------
> 0 10 12 32 32
> 1 12 10 32 32
> 2 32 32 10 12
> 3 32 32 12 10
>
> For this case, the node fallback list gets built like this:
>
> Node Fallback list
> ---------------------
> 0 0 1 2 3
> 1 1 0 3 2
> 2 2 3 0 1
> 3 3 2 0 1 <-- Unexpected fallback order

FWIW, for a dual-socket 8-node system with the following distance matrix,

node   0   1   2   3   4   5   6   7
  0:  10  12  12  12  32  32  32  32
  1:  12  10  12  12  32  32  32  32
  2:  12  12  10  12  32  32  32  32
  3:  12  12  12  10  32  32  32  32
  4:  32  32  32  32  10  12  12  12
  5:  32  32  32  32  12  10  12  12
  6:  32  32  32  32  12  12  10  12
  7:  32  32  32  32  12  12  12  10

the fallback list looks like this:

Before
=======
Fallback order for Node 0: 0 1 2 3 4 5 6 7
Fallback order for Node 1: 1 2 3 0 5 6 7 4
Fallback order for Node 2: 2 3 0 1 6 7 4 5
Fallback order for Node 3: 3 0 1 2 7 4 5 6
Fallback order for Node 4: 4 5 6 7 0 1 2 3
Fallback order for Node 5: 5 6 7 4 0 1 2 3
Fallback order for Node 6: 6 7 4 5 0 1 2 3
Fallback order for Node 7: 7 4 5 6 0 1 2 3
After the fix
==============
Fallback order for Node 0: 0 1 2 3 4 5 6 7
Fallback order for Node 1: 1 2 3 0 5 6 7 4
Fallback order for Node 2: 2 3 0 1 6 7 4 5
Fallback order for Node 3: 3 0 1 2 7 4 5 6
Fallback order for Node 4: 4 5 6 7 0 1 2 3
Fallback order for Node 5: 5 6 7 4 1 2 3 0
Fallback order for Node 6: 6 7 4 5 2 3 0 1
Fallback order for Node 7: 7 4 5 6 3 0 1 2
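
So the problem becomes more pronounced on bigger NUMA systems: before the
fix, nodes 4-7 all try node 0 first among the remote nodes, whereas after
the fix they start at nodes 0, 1, 2 and 3 respectively.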
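
The two listings above can be approximated with a small user-space model.
This is only a sketch, not the kernel code: build_fallback_lists() and its
"accumulate" flag are hypothetical names, every node is assumed to have both
CPUs and memory (so any per-node CPU penalty cancels out), and "scale" is an
assumed stand-in for the factor that keeps distance dominant over load.

```python
# Hypothetical model of the node ordering discussed in this thread.
# accumulate=False: the load of a node is overwritten on each build.
# accumulate=True:  the load is added to what earlier builds left behind.

def build_fallback_lists(distance, accumulate):
    n_nodes = len(distance)
    scale = n_nodes * 1024        # assumption: large enough that load only breaks ties
    node_load = [0] * n_nodes     # shared by the builds for all nodes
    lists = []

    for local in range(n_nodes):
        used = {local}
        order = [local]           # the local node always comes first
        prev = local
        load = n_nodes - 1        # remaining load once the local node is placed

        while len(order) < n_nodes:
            best, best_val = None, None
            for n in range(n_nodes):
                if n in used:
                    continue
                # Distance, plus a small penalty for lower-numbered nodes
                # ("prefer the next node"), then load as the tie-breaker.
                val = distance[local][n] + (1 if n < local else 0)
                val = val * scale + node_load[n]
                if best_val is None or val < best_val:
                    best, best_val = n, val

            # Penalize only the first node of each new distance group.
            if distance[local][best] != distance[local][prev]:
                if accumulate:
                    node_load[best] += load
                else:
                    node_load[best] = load

            used.add(best)
            order.append(best)
            prev = best
            load -= 1

        lists.append(order)
    return lists


# The 8-node distance matrix from this mail.
distance_8node = [
    [10, 12, 12, 12, 32, 32, 32, 32],
    [12, 10, 12, 12, 32, 32, 32, 32],
    [12, 12, 10, 12, 32, 32, 32, 32],
    [12, 12, 12, 10, 32, 32, 32, 32],
    [32, 32, 32, 32, 10, 12, 12, 12],
    [32, 32, 32, 32, 12, 10, 12, 12],
    [32, 32, 32, 32, 12, 12, 10, 12],
    [32, 32, 32, 32, 12, 12, 12, 10],
]

for label, accumulate in (("Before", False), ("After the fix", True)):
    print(label)
    for node, order in enumerate(build_fallback_lists(distance_8node, accumulate)):
        print(f"Fallback order for Node {node}:", *order)
```

Under these assumptions the model gives the same Before and After orders as
listed above for the 8-node matrix. The only difference between the two runs
is whether the load of the first node in a distance group is overwritten or
accumulated across builds.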
Regards,
Bharata.