Date:   Tue, 16 Nov 2021 20:22:49 +0000
From:   Alexey Makhalov <amakhalov@...are.com>
To:     Michal Hocko <mhocko@...e.com>
CC:     Dennis Zhou <dennis@...nel.org>,
        Eric Dumazet <eric.dumazet@...il.com>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        David Hildenbrand <david@...hat.com>,
        Oscar Salvador <osalvador@...e.de>, Tejun Heo <tj@...nel.org>,
        Christoph Lameter <cl@...ux.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [PATCH v3] mm: fix panic in __alloc_pages



> On Nov 16, 2021, at 1:17 AM, Michal Hocko <mhocko@...e.com> wrote:
> 
> On Tue 16-11-21 01:31:44, Alexey Makhalov wrote:
> [...]
>> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
>> index 6737b1cbf..bbc1a70d5 100644
>> --- a/drivers/acpi/acpi_processor.c
>> +++ b/drivers/acpi/acpi_processor.c
>> @@ -200,6 +200,10 @@ static int acpi_processor_hotadd_init(struct acpi_processor *pr)
>>        * gets online for the first time.
>>        */
>>       pr_info("CPU%d has been hot-added\n", pr->id);
>> +       {
>> +               int nid = cpu_to_node(pr->id);
>> +               printk("%s:%d cpu %d, node %d, online %d, ndata %p\n", __FUNCTION__, __LINE__, pr->id, nid, node_online(nid), NODE_DATA(nid));
>> +       }
>>       pr->flags.need_hotplug_init = 1;
> 
> OK, IIUC you are adding a processor which is outside of
> possible_cpu_mask, which means that no node has been allocated for such
> a future-to-be-hotplugged CPU and its memory node. init_cpu_to_node
> would otherwise have done that initialization.
That is not correct.

possible_cpus is 128 for this VM; see the SRAT and percpu output as proof:
[    0.085524] SRAT: PXM 127 -> APIC 0xfe -> Node 127
[    0.118928] setup_percpu: NR_CPUS:128 nr_cpumask_bits:128 nr_cpu_ids:128 nr_node_ids:128

It is impossible to add a processor outside of possible_cpu_mask; possible_cpus is the absolute
maximum the system can support. See Documentation/core-api/cpu_hotplug.rst

The number of present and online CPUs (and nodes) is 4. The other 124 CPUs (and nodes) are not
present, but can potentially be hot-added.
The number of initialized nodes is 4, because init_cpu_to_node() skips nodes that are not yet
present; see arch/x86/mm/numa.c:798 (numa_cpu_node() for CPU #4 returns NUMA_NO_NODE):
788 void __init init_cpu_to_node(void)
789 {
790         int cpu;
791         u16 *cpu_to_apicid = early_per_cpu_ptr(x86_cpu_to_apicid);
792
793         BUG_ON(cpu_to_apicid == NULL);
794
795         for_each_possible_cpu(cpu) {
796                 int node = numa_cpu_node(cpu);
797
798                 if (node == NUMA_NO_NODE)
799                         continue;
800

After CPU (and node) hot-plug:
- CPU 4 is marked as present, but is not yet online.
- The new node got ID 4, so numa_cpu_node(CPU #4) now returns 4.
- node_online(4) == 0 and NODE_DATA(4) == NULL, yet NODE_DATA(4) is dereferenced inside the
for_each_possible_cpu loop in the percpu allocation path (see the sketch below).
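
A minimal sketch of that failing path (simplified from pcpu_alloc_pages() in mm/percpu-vm.c;
the helper name below is made up for illustration, and the exact code varies by kernel version):

/* The percpu allocator walks every *possible* CPU, so after the
 * hot-add it reaches CPU 4, whose node is possible but not yet
 * online and has no pg_data_t allocated. */
static struct page *pcpu_alloc_one_page(unsigned int cpu, gfp_t gfp)
{
	int nid = cpu_to_node(cpu);	/* returns 4 for the new CPU */

	/* node_online(nid) == 0 and NODE_DATA(nid) == NULL here, but
	 * alloc_pages_node() -> __alloc_pages() dereferences
	 * NODE_DATA(nid) to pick a zonelist, hence the NULL-pointer
	 * panic in __alloc_pages(). */
	return alloc_pages_node(nid, gfp, 0);
}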

Digging further: even if the x86/CPU hot-add maintainers decide to clean up the memoryless-node
hot-add code so that the node is initialized at the time it is attached (to align with how mm
initializes a node during memory hot-add), this percpu fix is still needed, because percpu
allocation is used during node onlining; see the chicken-and-egg problem I described above.
Or, as a second option, numa_cpu_node(4) should return NUMA_NO_NODE until node 4 gets fully
initialized.
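
A purely hypothetical sketch of that second option (not a tested or proposed patch; simplified
from numa_cpu_node() in arch/x86/mm/numa.c, with the guard being the assumed addition):

int numa_cpu_node(int cpu)
{
	int apicid = early_per_cpu(x86_cpu_to_apicid, cpu);
	int nid = (apicid != BAD_APICID) ? __apicid_to_node[apicid]
					 : NUMA_NO_NODE;

	/* Added guard (assumption): a node without an initialized pgdat
	 * is not usable yet; returning NUMA_NO_NODE lets callers such as
	 * alloc_pages_node() fall back to the local node. */
	if (nid != NUMA_NO_NODE && !node_online(nid))
		return NUMA_NO_NODE;
	return nid;
}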

Regards,
—Alexey



