[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b7fe32e3-ed06-3701-9ec4-f8f8b838bc9a@redhat.com>
Date: Mon, 6 Dec 2021 09:29:18 +0100
From: David Hildenbrand <david@...hat.com>
To: Nico Pache <npache@...hat.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, akpm@...ux-foundation.org,
Michal Hocko <mhocko@...nel.org>
Cc: shakeelb@...gle.com, ktkhai@...tuozzo.com, shy828301@...il.com,
guro@...com, vbabka@...e.cz, vdavydov.dev@...il.com,
raquini@...hat.com, Rafael Aquini <aquini@...hat.com>
Subject: Re: [RFC PATCH 1/2] include/linux/gfp.h: Do not allocate pages on a
offlined node
On 06.12.21 04:33, Nico Pache wrote:
> We shouldn't allocate pages on a unavailable node. Add a check for this
> in __alloc_pages_node and return NULL to avoid issues further down the
> callchain.
>
> Also update the VM_WARN_ON in __alloc_pages_node which could skip this
> warn if the gfp_mask is not GFP_THISNODE.
>
> Co-developed-by: Rafael Aquini <aquini@...hat.com>
> Signed-off-by: Nico Pache <npache@...hat.com>
> ---
> include/linux/gfp.h | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index b976c4177299..e7e18f6d0d9d 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -565,7 +565,10 @@ static inline struct page *
> __alloc_pages_node(int nid, gfp_t gfp_mask, unsigned int order)
> {
> VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
> - VM_WARN_ON((gfp_mask & __GFP_THISNODE) && !node_online(nid));
> + VM_WARN_ON(!node_online(nid));
> +
> + if (!node_online(nid))
> + return NULL;
>
> return __alloc_pages(gfp_mask, order, nid, NULL);
> }
>
We had a related discussion just a month ago or so, and callers should
make sure to not allocate pages on an offline node. No need for runtime
checks for each end every allocation. The question is usually, where do
callers get the random offline node from. See:
https://lkml.kernel.org/r/20211108202325.20304-1-amakhalov@vmware.com
There was also a discussion about turning this at least into
VM_WARN_ON(!node_online(nid)), however, there are valid points that this
won't really help much. We'd need a BUG_ON, but we'll already crash
properly later when de-referencing the NULL pgdat pointer.
--
Thanks,
David / dhildenb
Powered by blists - more mailing lists