lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b2e4a611-45a6-732a-a6d3-6042afd2af6e@redhat.com>
Date:   Tue, 2 Nov 2021 10:24:43 +0100
From:   David Hildenbrand <david@...hat.com>
To:     Michal Hocko <mhocko@...e.com>,
        Alexey Makhalov <amakhalov@...are.com>
Cc:     "linux-mm@...ck.org" <linux-mm@...ck.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "stable@...r.kernel.org" <stable@...r.kernel.org>,
        Oscar Salvador <OSalvador@...e.com>
Subject: Re: [PATCH] mm: fix panic in __alloc_pages

>>> In add_memory_resource() we hotplug the new node if required and set it
>>> online. Memory might get onlined later, via online_pages().
>>
>> You are correct. In case of memory hot add, it is true. But in case of adding
>> CPU with memoryless node, try_node_online() will be called only during CPU
>> onlining, see cpu_up().
>>
>> Is there any reason why try_online_node() resides in cpu_up() and not in add_cpu()?
>> I think it would be correct to online node during the CPU hot add to align with
>> memory hot add.
> 
> I am not familiar with cpu hotplug, but this doesn't seem to be anything
> new so how come this became problem only now?

So IIUC, the issue is that we have a node

a) That has no memory
b) That is offline

This node will get onlined when onlining the CPU as Alexey says. Yet we
have some code that stumbles over the node and goes ahead trying to use
the pgdat -- that code is broken.


If we take a look at build_zonelists() we indeed skip any
!node_online(node). Any other code should do the same. If the node is
not online, it shall be ignored because we might not even have a pgdat
yet -- see hotadd_new_pgdat(). Without node_online(), the pgdat might be
stale or non-existant.


The node onlining logic when onlining a CPU sounds bogus as well: Let's
take a look at try_offline_node(). It checks that:
1) That no memory is *present*
2) That no CPU is *present*

We should online the node when adding the CPU ("present"), not when
onlining the CPU.

-- 
Thanks,

David / dhildenb

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ