Date:	Wed, 02 Mar 2011 13:36:22 -0800
From:	Yinghai Lu <yinghai@...nel.org>
To:	David Rientjes <rientjes@...gle.com>
CC:	Tejun Heo <tj@...nel.org>, Ingo Molnar <mingo@...e.hu>,
	tglx@...utronix.de, "H. Peter Anvin" <hpa@...or.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH x86/mm UPDATED] x86-64, NUMA: Fix distance table handling

On 03/02/2011 01:12 PM, Yinghai Lu wrote:
> On 03/02/2011 07:42 AM, Tejun Heo wrote:
>> Hey,
>>
>> On Wed, Mar 02, 2011 at 06:30:59AM -0800, David Rientjes wrote:
>>> Acked-by: David Rientjes <rientjes@...gle.com>
>>>
>>> There's also this in numa_emulation() that isn't a safe assumption:
>>>
>>>         /* make sure all emulated nodes are mapped to a physical node */
>>>         for (i = 0; i < ARRAY_SIZE(emu_nid_to_phys); i++)
>>>                 if (emu_nid_to_phys[i] == NUMA_NO_NODE)
>>>                         emu_nid_to_phys[i] = 0;
>>>
>>> Node id 0 is not always online depending on how you setup your SRAT.  I'm 
>>> not sure why emu_nid_to_phys[] would ever map a fake node id that doesn't 
>>> exist to a physical node id rather than NUMA_NO_NODE, so I think it can 
>>> just be removed.  Otherwise, it should be mapped to a physical node id 
>>> that is known to be online.
>>
>> Unless I screwed up, that behavior isn't new.  It's just put in a
>> different form.  Looking through the code... Okay, I think node 0
>> always exists.  SRAT PXM isn't used as node number directly.  It goes
>> through acpi_map_pxm_to_node() which allocates nids from 0 up.
>> amdtopology also guarantees the existence of node 0, so I think we're
>> safe and that probably is the reason why we had the above
>> behavior in the first place.
>>
>> IIRC, there are other places which assume the existence of node 0.
>> Whether it's a good idea or not, I'm not sure but requiring node 0 to
>> be always allocated doesn't sound too wrong to me.  Maybe we can add
>> BUG_ON() if node 0 is offline somewhere.
> 
> 
> When the first socket does not have memory, node 0 will not be online,
> and cpu_to_node() will round those cpus to a nearby node such as node 1 or node 7.
> 
> BTW: this configuration has been broken and fixed several times.

David,

It looks like NUMA emulation already does not support that configuration.

old code:
void __cpuinit numa_add_cpu(int cpu)
{
        unsigned long addr;
        u16 apicid;
        int physnid;
        int nid = NUMA_NO_NODE;

        apicid = early_per_cpu(x86_cpu_to_apicid, cpu);
        if (apicid != BAD_APICID)
                nid = apicid_to_node[apicid];
        if (nid == NUMA_NO_NODE)
                nid = early_cpu_to_node(cpu);
        BUG_ON(nid == NUMA_NO_NODE || !node_online(nid));


current code:
void __cpuinit numa_add_cpu(int cpu)
{
        int physnid, nid;

        nid = numa_cpu_node(cpu);
        if (nid == NUMA_NO_NODE)
                nid = early_cpu_to_node(cpu);
        BUG_ON(nid == NUMA_NO_NODE || !node_online(nid));

        physnid = emu_nid_to_phys[nid];

        /*
         * Map the cpu to each emulated node that is allocated on the physical
         * node of the cpu's apic id.
         */
        for_each_online_node(nid)
                if (emu_nid_to_phys[nid] == physnid)
                        cpumask_set_cpu(cpu, node_to_cpumask_map[nid]);
}


Please note that numa_cpu_node() (or the old code) will return node 0 as the nid even when node 0 has no memory and is not online, so the BUG_ON() would trigger.

Maybe we can just change this to nid = cpu_to_node() to get a node id that is online.
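
To illustrate the idea, here is a minimal userspace sketch, not kernel code. It assumes the raw mapping can name an offline node (node 0 on a memoryless first socket) and that the fallback picks the closest online node. Everything in it (struct numa_model, nearest_online_node(), resolve_cpu_node(), MAX_NODES) is a hypothetical name for this example, and the distance |i - nid| is only a stand-in for the real distance table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_NODES    8
#define NUMA_NO_NODE (-1)

/* Hypothetical model of node state; not a kernel data structure. */
struct numa_model {
        bool online[MAX_NODES]; /* true if the node has memory and is online */
};

/*
 * Return the online node closest to nid, or NUMA_NO_NODE if none is online.
 * Assumption: distance is approximated by the difference of node ids.
 * Ties go to the lower node id.
 */
static int nearest_online_node(const struct numa_model *m, int nid)
{
        int best = NUMA_NO_NODE;
        int best_dist = MAX_NODES + 1;
        int i;

        assert(nid >= 0 && nid < MAX_NODES);

        for (i = 0; i < MAX_NODES; i++) {
                int d = abs(i - nid);

                if (m->online[i] && d < best_dist) {
                        best = i;
                        best_dist = d;
                }
        }
        return best;
}

/*
 * Resolve a raw node id (as the apicid-based lookup might return it) to a
 * node that is actually online, so a later node_online() check would pass.
 */
static int resolve_cpu_node(const struct numa_model *m, int raw_nid)
{
        if (raw_nid == NUMA_NO_NODE)
                return NUMA_NO_NODE;
        assert(raw_nid >= 0 && raw_nid < MAX_NODES);
        if (m->online[raw_nid])
                return raw_nid;
        return nearest_online_node(m, raw_nid);
}
```

With only nodes 1 and 7 online, a cpu whose raw nid is 0 resolves to node 1 in this model, while a cpu already on node 7 stays on node 7.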

Thanks

Yinghai