Date:   Fri, 15 Mar 2019 14:05:00 +0100
From:   Laurent Vivier <lvivier@...hat.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     linux-kernel@...r.kernel.org,
        Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
        Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
        Borislav Petkov <bp@...e.de>,
        David Gibson <david@...son.dropbear.id.au>,
        Michael Ellerman <mpe@...erman.id.au>,
        Nathan Fontenot <nfont@...ux.vnet.ibm.com>,
        Michael Bringmann <mwb@...ux.vnet.ibm.com>,
        linuxppc-dev@...ts.ozlabs.org, Ingo Molnar <mingo@...hat.com>
Subject: Re: [RFC v3] sched/topology: fix kernel crash when a CPU is
 hotplugged in a memoryless node

On 15/03/2019 13:25, Peter Zijlstra wrote:
> On Fri, Mar 15, 2019 at 12:12:45PM +0100, Laurent Vivier wrote:
> 
>> Another way to avoid the nodes overlapping for the offline nodes at
>> startup is to ensure the default values don't define a distance that
>> merges all offline nodes into node 0.
>>
>> A powerpc specific patch can workaround the kernel crash by doing this:
>>
>> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
>> index 87f0dd0..3ba29bb 100644
>> --- a/arch/powerpc/mm/numa.c
>> +++ b/arch/powerpc/mm/numa.c
>> @@ -623,6 +623,7 @@ static int __init parse_numa_properties(void)
>>         struct device_node *memory;
>>         int default_nid = 0;
>>         unsigned long i;
>> +       int nid, dist;
>>
>>         if (numa_enabled == 0) {
>>                 printk(KERN_WARNING "NUMA disabled by user\n");
>> @@ -636,6 +637,10 @@ static int __init parse_numa_properties(void)
>>
>>         dbg("NUMA associativity depth for CPU/Memory: %d\n", min_common_depth);
>>
>> +       for (nid = 0; nid < MAX_NUMNODES; nid++)
>> +               for (dist = 0; dist < MAX_DISTANCE_REF_POINTS; dist++)
>> +                       distance_lookup_table[nid][dist] = nid;
>> +
>>         /*
>>          * Even though we connect cpus to numa domains later in SMP
>>          * init, we need to know the node ids now. This is because
> 
> What does that actually do? That is, what does it make the distance
> table look like before and after you bring up the CPUs?

By default the table is full of 0. When a CPU is brought up, the value is
read from the device-tree and the table is updated. What I've seen is
that this value is common to two nodes at a given level if they share
that level. So, as the table is initialized with 0, all offline nodes
(no memory, no CPU) are merged with node 0.
My fix initializes the table with unique values for each node, so by
default no nodes are merged.

> 
>> Any comment?
> 
> Well, I had a few questions here:
> 
>   20190305115952.GH32477@...ez.programming.kicks-ass.net
> 
> that I've not yet seen answers to.

I didn't answer because:

- I thought this was not the right way to fix the problem, as you said it
"seems very fragile and unfortunate",

- I don't have the answers; I'd really like someone from IBM who knows
the NUMA part of powerpc well to answer these questions... and perhaps
find a better solution.

Thanks,
Laurent


