Message-ID: <faef20f5-80b4-fcb0-6460-ddae9856f35e@suse.cz>
Date: Thu, 8 Jun 2017 10:22:32 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Michal Hocko <mhocko@...nel.org>, linux-mm@...ck.org
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
Xishi Qiu <qiuxishi@...wei.com>,
zhong jiang <zhongjiang@...wei.com>,
Joonsoo Kim <js1304@...il.com>,
LKML <linux-kernel@...r.kernel.org>,
Michal Hocko <mhocko@...e.com>
Subject: Re: [PATCH 2/4] hugetlb, memory_hotplug: prefer to use reserved pages for migration
On 06/08/2017 09:45 AM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@...e.com>
>
> new_node_page will try to use the origin's next NUMA node as the
> migration destination for hugetlb pages. If such a node doesn't have any
> preallocated pool it falls back to __alloc_buddy_huge_page_no_mpol to
> allocate a surplus page instead. This is quite suboptimal for any
> configuration where hugetlb pages are not distributed evenly across all
> NUMA nodes. Say we have a hotpluggable node 4 and spare hugetlb pages are
> on node 0
> /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages:10000
> /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages:0
> /sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages:0
> /sys/devices/system/node/node3/hugepages/hugepages-2048kB/nr_hugepages:0
> /sys/devices/system/node/node4/hugepages/hugepages-2048kB/nr_hugepages:10000
> /sys/devices/system/node/node5/hugepages/hugepages-2048kB/nr_hugepages:0
> /sys/devices/system/node/node6/hugepages/hugepages-2048kB/nr_hugepages:0
> /sys/devices/system/node/node7/hugepages/hugepages-2048kB/nr_hugepages:0
>
> Now we consume the whole pool on node 4 and try to offline this
> node. All the allocated pages should be moved to node 0 which has enough
> preallocated pages to hold them. With the current implementation
> offlining very likely fails because hugetlb allocations during runtime
> are much less reliable.
>
> Fix this by reusing the nodemask which excludes the migration source:
> first try to find a node which has a page in the preallocated pool, and
> fall back to __alloc_buddy_huge_page_no_mpol only when the whole pool is
> consumed.
>
> Signed-off-by: Michal Hocko <mhocko@...e.com>
Acked-by: Vlastimil Babka <vbabka@...e.cz>
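
For illustration, the selection order described in the changelog can be
sketched in plain userspace C. This is only a rough model under stated
assumptions, not the kernel code: free_huge_pages_node,
dequeue_from_allowed_nodes, alloc_surplus_page and pick_migration_target
are hypothetical names, the nodemask is reduced to a plain bitmask over
8 nodes, and the surplus (buddy) allocation is stubbed to always fail so
that the fallback path stays visible.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8

/* Hypothetical per-node count of free preallocated huge pages. */
static unsigned long free_huge_pages_node[MAX_NODES];

/*
 * Walk the allowed nodes in order and take a page from the first node
 * whose preallocated pool is not empty. Returns the node id, or -1 when
 * every allowed pool is exhausted.
 */
static int dequeue_from_allowed_nodes(unsigned int allowed_mask)
{
	for (int nid = 0; nid < MAX_NODES; nid++) {
		if (!(allowed_mask & (1u << nid)))
			continue;
		if (free_huge_pages_node[nid] > 0) {
			free_huge_pages_node[nid]--;
			return nid;
		}
	}
	return -1;
}

/*
 * Stand-in for a runtime surplus allocation. It always fails here
 * (assumption) to model that such allocations are unreliable.
 */
static int alloc_surplus_page(unsigned int allowed_mask)
{
	(void)allowed_mask;
	return -1;
}

/*
 * Pick a destination node for a huge page migrated away from src_nid:
 * preallocated pools first, surplus allocation only as a fallback.
 * *from_pool reports which path supplied the page.
 */
static int pick_migration_target(int src_nid, bool *from_pool)
{
	assert(src_nid >= 0 && src_nid < MAX_NODES);

	/* All nodes except the migration source. */
	unsigned int allowed = ((1u << MAX_NODES) - 1) & ~(1u << src_nid);
	int nid = dequeue_from_allowed_nodes(allowed);

	*from_pool = nid >= 0;
	if (nid >= 0)
		return nid;
	return alloc_surplus_page(allowed);
}
```

With free pages only on node 0 and node 4, calling
pick_migration_target(4, &from_pool) keeps returning node 0 until that
pool is empty; only then does the sketch reach the surplus path, and the
source node's own pool is never touched.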