lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Tue, 2 Nov 2021 10:24:43 +0100 From: David Hildenbrand <david@...hat.com> To: Michal Hocko <mhocko@...e.com>, Alexey Makhalov <amakhalov@...are.com> Cc: "linux-mm@...ck.org" <linux-mm@...ck.org>, Andrew Morton <akpm@...ux-foundation.org>, "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "stable@...r.kernel.org" <stable@...r.kernel.org>, Oscar Salvador <OSalvador@...e.com> Subject: Re: [PATCH] mm: fix panic in __alloc_pages >>> In add_memory_resource() we hotplug the new node if required and set it >>> online. Memory might get onlined later, via online_pages(). >> >> You are correct. In case of memory hot add, it is true. But in case of adding >> CPU with memoryless node, try_node_online() will be called only during CPU >> onlining, see cpu_up(). >> >> Is there any reason why try_online_node() resides in cpu_up() and not in add_cpu()? >> I think it would be correct to online node during the CPU hot add to align with >> memory hot add. > > I am not familiar with cpu hotplug, but this doesn't seem to be anything > new so how come this became problem only now? So IIUC, the issue is that we have a node a) That has no memory b) That is offline This node will get onlined when onlining the CPU as Alexey says. Yet we have some code that stumbles over the node and goes ahead trying to use the pgdat -- that code is broken. If we take a look at build_zonelists() we indeed skip any !node_online(node). Any other code should do the same. If the node is not online, it shall be ignored because we might not even have a pgdat yet -- see hotadd_new_pgdat(). Without node_online(), the pgdat might be stale or non-existant. The node onlining logic when onlining a CPU sounds bogus as well: Let's take a look at try_offline_node(). It checks that: 1) That no memory is *present* 2) That no CPU is *present* We should online the node when adding the CPU ("present"), not when onlining the CPU. -- Thanks, David / dhildenb
Powered by blists - more mailing lists