[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <035be759-3adc-d224-43f8-1888822f6492@loongson.cn>
Date: Mon, 14 Oct 2024 17:12:26 +0800
From: maobibo <maobibo@...ngson.cn>
To: Huacai Chen <chenhuacai@...nel.org>
Cc: loongarch@...ts.linux.dev, linux-kernel@...r.kernel.org,
lixianglai@...ngson.cn, WANG Xuerui <kernel@...0n.name>
Subject: Re: [PATCH] LoongArch: Fix cpu hotplug issue
On 2024/10/14 下午4:23, Huacai Chen wrote:
> On Mon, Oct 14, 2024 at 4:01 PM maobibo <maobibo@...ngson.cn> wrote:
>>
>> Huacai,
>>
>> On 2024/10/14 下午3:39, Huacai Chen wrote:
>>> On Mon, Oct 14, 2024 at 3:21 PM maobibo <maobibo@...ngson.cn> wrote:
>>>>
>>>> Huacai,
>>>>
>>>> On 2024/10/14 下午3:05, Huacai Chen wrote:
>>>>> Hi, Bibo,
>>>>>
>>>>> I'm a little confused, so please correct me if I'm wrong.
>>>>>
>>>>> On Mon, Oct 14, 2024 at 2:33 PM Bibo Mao <maobibo@...ngson.cn> wrote:
>>>>>>
>>>>>> On LoongArch system, there are two places to set cpu numa node. One
>>>>>> is in arch specified function smp_prepare_boot_cpu(), the other is
>>>>>> in generic function early_numa_node_init(). The latter will overwrite
>>>>>> the numa node information.
>>>>>>
>>>>>> However for hot-added cpu, cpu_logical_map() fails to its physical
>>>>>> cpuid at beginning since it is not enabled in ACPI MADT table. So
>>>>>> function early_cpu_to_node() also fails to get its numa node for
>>>>>> hot-added cpu, and generic function early_numa_node_init() will
>>>>>> overwrite incorrect numa node.
>>>>> For hot-added cpus, we will call acpi_map_cpu() -->
>>>>> acpi_map_cpu2node() --> set_cpuid_to_node(), and set_cpuid_to_node()
>>>>> operates on __cpuid_to_node[]. So I think early_cpu_to_node() should
>>>>> be correct?
>>>>
>>>> __cpuid_to_node[] is correct which is physical cpuid to numa node,
>>>> however cpu_logical_map(cpu) is not set. It fails to get physical cpuid
>>>> from logic cpu.
>>>>
>>>> int early_cpu_to_node(int cpu)
>>>> {
>>>> int physid = cpu_logical_map(cpu);
>>>>
>>>> <<<<<<<<<<< Here physid is -1.
>>> early_cpu_to_node() is not supposed to be called after boot, and if it
>> Which calls early_cpu_to_node() after boot?
>>
>>> is really needed, I think a better solution is:
>>>
>>> diff --git a/arch/loongarch/kernel/acpi.c b/arch/loongarch/kernel/acpi.c
>>> index f1a74b80f22c..998cf45fd3b7 100644
>>> --- a/arch/loongarch/kernel/acpi.c
>>> +++ b/arch/loongarch/kernel/acpi.c
>>> @@ -311,6 +311,8 @@ static int __ref acpi_map_cpu2node(acpi_handle
>>> handle, int cpu, int physid)
>>>
>>> nid = acpi_get_node(handle);
>>> if (nid != NUMA_NO_NODE) {
>>> + __cpu_number_map[physid] = cpu;
>>> + __cpu_logical_map[cpu] = physid;
>> This does not solve the problem. The above has been done in function
>> cpu = set_processor_mask(physid, ACPI_MADT_ENABLED);
>>
>> static int set_processor_mask(u32 id, u32 flags)
>> {
>> ...
>> if (flags & ACPI_MADT_ENABLED) {
>> num_processors++;
>> set_cpu_present(cpu, true);
>> __cpu_number_map[cpuid] = cpu;
>> __cpu_logical_map[cpu] = cpuid;
>> }
>>
>> The problem is that
>> smp_prepare_boot_cpu(); /* arch-specific boot-cpu hooks */
>> <<<<<<<<<<<<<<<<
>> set_cpu_numa_node() is called in function smp_prepare_boot_cpu()
>>
>> early_numa_node_init();
>>
>> static void __init early_numa_node_init(void)
>> {
>> #ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
>> #ifndef cpu_to_node
>> int cpu;
>>
>> /* The early_cpu_to_node() should be ready here. */
>> for_each_possible_cpu(cpu)
>> set_cpu_numa_node(cpu, early_cpu_to_node(cpu));
>> <<<<<<<<<<<<<<<<
>> * however here early_cpu_to_node is -1, so that cpu_to_node(cpu) will
>> always return -1 in late. *, which causes cpu hotadd problem.
> Still confused. For ACPI_MADT_ENABLED cpus, everything is right after
> early_numa_node_init(). For !ACPI_MADT_ENABLED cpus, cpu_to_node()
> returns -1 after early_numa_node_init() and before hot-add, but if
> acpi_map_cpu() do things right, cpu_to_node() should still work well
> after hot-add.
yes, if "_PXM" information for hot-add cpu handle exist, it works well.
However if "_PXM" information does not exist, it falls back to legacy
method from smp_prepare_boot_cpu(). However cpu_numa_node information is
overwritten with -1 by later function early_numa_node_init().
Regards
Bibo Mao
>
> Huacai
>>
>> Regards
>> Bibo Mao
>>
>>
>>> set_cpuid_to_node(physid, nid);
>>> node_set(nid, numa_nodes_parsed);
>>> set_cpu_numa_node(cpu, nid);
>>>
>>> Huacai
>>>
>>>>
>>>> if (physid < 0)
>>>> return NUMA_NO_NODE;
>>>>
>>>> return __cpuid_to_node[physid];
>>>> }
>>>>
>>>> Regards
>>>> Bibo Mao
>>>>>
>>>>> Huacai
>>>>>
>>>>>>
>>>>>> Here static array __cpu_to_node and api set_early_cpu_to_node()
>>>>>> is added, so that early_cpu_to_node is consistent with function
>>>>>> cpu_to_node() for hot-added cpu.
>>>>>>
>>>>>> Signed-off-by: Bibo Mao <maobibo@...ngson.cn>
>>>>>> ---
>>>>>> arch/loongarch/include/asm/numa.h | 2 ++
>>>>>> arch/loongarch/kernel/numa.c | 10 +++++++++-
>>>>>> arch/loongarch/kernel/smp.c | 1 +
>>>>>> 3 files changed, 12 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/arch/loongarch/include/asm/numa.h b/arch/loongarch/include/asm/numa.h
>>>>>> index b5f9de9f102e..e8e6fcfb006a 100644
>>>>>> --- a/arch/loongarch/include/asm/numa.h
>>>>>> +++ b/arch/loongarch/include/asm/numa.h
>>>>>> @@ -50,6 +50,7 @@ static inline void set_cpuid_to_node(int cpuid, s16 node)
>>>>>> }
>>>>>>
>>>>>> extern int early_cpu_to_node(int cpu);
>>>>>> +extern void set_early_cpu_to_node(int cpu, s16 node);
>>>>>>
>>>>>> #else
>>>>>>
>>>>>> @@ -57,6 +58,7 @@ static inline void early_numa_add_cpu(int cpuid, s16 node) { }
>>>>>> static inline void numa_add_cpu(unsigned int cpu) { }
>>>>>> static inline void numa_remove_cpu(unsigned int cpu) { }
>>>>>> static inline void set_cpuid_to_node(int cpuid, s16 node) { }
>>>>>> +static inline void set_early_cpu_to_node(int cpu, s16 node) { }
>>>>>>
>>>>>> static inline int early_cpu_to_node(int cpu)
>>>>>> {
>>>>>> diff --git a/arch/loongarch/kernel/numa.c b/arch/loongarch/kernel/numa.c
>>>>>> index 84fe7f854820..62508aace644 100644
>>>>>> --- a/arch/loongarch/kernel/numa.c
>>>>>> +++ b/arch/loongarch/kernel/numa.c
>>>>>> @@ -34,6 +34,9 @@ static struct numa_meminfo numa_meminfo;
>>>>>> cpumask_t cpus_on_node[MAX_NUMNODES];
>>>>>> cpumask_t phys_cpus_on_node[MAX_NUMNODES];
>>>>>> EXPORT_SYMBOL(cpus_on_node);
>>>>>> +static s16 __cpu_to_node[NR_CPUS] = {
>>>>>> + [0 ... CONFIG_NR_CPUS - 1] = NUMA_NO_NODE
>>>>>> +};
>>>>>>
>>>>>> /*
>>>>>> * apicid, cpu, node mappings
>>>>>> @@ -117,11 +120,16 @@ int early_cpu_to_node(int cpu)
>>>>>> int physid = cpu_logical_map(cpu);
>>>>>>
>>>>>> if (physid < 0)
>>>>>> - return NUMA_NO_NODE;
>>>>>> + return __cpu_to_node[cpu];
>>>>>>
>>>>>> return __cpuid_to_node[physid];
>>>>>> }
>>>>>>
>>>>>> +void set_early_cpu_to_node(int cpu, s16 node)
>>>>>> +{
>>>>>> + __cpu_to_node[cpu] = node;
>>>>>> +}
>>>>>> +
>>>>>> void __init early_numa_add_cpu(int cpuid, s16 node)
>>>>>> {
>>>>>> int cpu = __cpu_number_map[cpuid];
>>>>>> diff --git a/arch/loongarch/kernel/smp.c b/arch/loongarch/kernel/smp.c
>>>>>> index 9afc2d8b3414..998668be858c 100644
>>>>>> --- a/arch/loongarch/kernel/smp.c
>>>>>> +++ b/arch/loongarch/kernel/smp.c
>>>>>> @@ -512,6 +512,7 @@ void __init smp_prepare_boot_cpu(void)
>>>>>> set_cpu_numa_node(cpu, node);
>>>>>> else {
>>>>>> set_cpu_numa_node(cpu, rr_node);
>>>>>> + set_early_cpu_to_node(cpu, rr_node);
>>>>>> rr_node = next_node_in(rr_node, node_online_map);
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> base-commit: 6485cf5ea253d40d507cd71253c9568c5470cd27
>>>>>> --
>>>>>> 2.39.3
>>>>>>
>>>>>>
>>>>
>>>>
>>
>>
Powered by blists - more mailing lists