[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <740cae36-f1a9-4d20-e4ea-3100076de533@huawei.com>
Date: Fri, 30 Aug 2019 16:08:14 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Michal Hocko <mhocko@...nel.org>
CC: <will@...nel.org>, <akpm@...ux-foundation.org>,
<rppt@...ux.ibm.com>, <anshuman.khandual@....com>,
<adobriyan@...il.com>, <cai@....pw>, <robin.murphy@....com>,
<tglx@...utronix.de>, <linux-arm-kernel@...ts.infradead.org>,
<linux-kernel@...r.kernel.org>, <linuxarm@...wei.com>
Subject: Re: [PATCH] arm64: numa: check the node id before accessing
node_to_cpumask_map
On 2019/8/30 14:44, Michal Hocko wrote:
> On Fri 30-08-19 14:35:26, Yunsheng Lin wrote:
>> On 2019/8/30 13:55, Michal Hocko wrote:
>>> On Fri 30-08-19 10:26:31, Yunsheng Lin wrote:
>>>> Some buggy bios may not set the device' numa id, and dev_to_node
>>>> will return -1, which may cause global-out-of-bounds error
>>>> detected by KASAN.
>>>
>>> Why should we workaround a buggy bios like that? Is it so widespread and
>>> no BIOS update available? Also, why is this arm64 specific?
>>
>> For our case, there is BIOS update available. I just thought it might
>> be better to protect from this case when BIOS has not implemented the
>> device' numa id setting feature or the feature from BIOS has some bug.
>>
>> It is not arm64 specific, right now I only have arm64 board. If it is
>> ok to protect this from the buggy BIOS, maybe all other arch can be
>> changed too.
>
> If we are to really care then this should be consistent among
> architectures IMHO. But I am not really sure this is really worth it.
> The code is quite old and I do not really remember any reports.
It is only detected by enabling KASAN, the system seems to run fine without
any visible error if KASAN is disabled. Maybe there is why no report has
been seen?
Also according to Section 6.2.14 from ACPI spec 6.3 [1], the setting of proximity
domain is optional, as below:
This optional object is used to describe proximity domain
associations within a machine. _PXM evaluates to an integer
that identifies a device as belonging to a Proximity Domain
defined in the System Resource Affinity Table (SRAT).
Do you think it is ok to resend the fix with above clarification and below
log:
[ 42.970381] ==================================================================
[ 42.977595] BUG: KASAN: global-out-of-bounds in __bitmap_weight+0x48/0xb0
[ 42.984370] Read of size 8 at addr ffff20008cdf8790 by task kworker/0:1/13
[ 42.991230]
[ 42.992712] CPU: 0 PID: 13 Comm: kworker/0:1 Tainted: G O 5.2.0-rc4-g8bde06a-dirty #3
[ 43.001830] Hardware name: Huawei TaiShan 2280 V2/BC82AMDA, BIOS TA BIOS 2280-A CS V2.B050.01 08/08/2019
[ 43.011298] Workqueue: events work_for_cpu_fn
[ 43.015643] Call trace:
[ 43.018078] dump_backtrace+0x0/0x1e8
[ 43.021727] show_stack+0x14/0x20
[ 43.025031] dump_stack+0xc4/0xfc
[ 43.028335] print_address_description+0x178/0x270
[ 43.033113] __kasan_report+0x164/0x1b8
[ 43.036936] kasan_report+0xc/0x18
[ 43.040325] __asan_load8+0x84/0xa8
[ 43.043801] __bitmap_weight+0x48/0xb0
[ 43.047552] hclge_init_ae_dev+0x988/0x1e78 [hclge]
[ 43.052418] hnae3_register_ae_dev+0xcc/0x278 [hnae3]
[ 43.057467] hns3_probe+0xe0/0x120 [hns3]
[ 43.061464] local_pci_probe+0x74/0xf0
[ 43.065200] work_for_cpu_fn+0x2c/0x48
[ 43.068937] process_one_work+0x3c0/0x878
[ 43.072934] worker_thread+0x400/0x670
[ 43.076670] kthread+0x1b0/0x1b8
[ 43.079885] ret_from_fork+0x10/0x18
[ 43.083446]
[ 43.084925] The buggy address belongs to the variable:
[ 43.090052] numa_distance+0x30/0x40
[ 43.093613]
[ 43.095091] Memory state around the buggy address:
[ 43.099870] ffff20008cdf8680: fa fa fa fa 04 fa fa fa fa fa fa fa 00 00 fa fa
[ 43.107078] ffff20008cdf8700: fa fa fa fa 04 fa fa fa fa fa fa fa 00 fa fa fa
[ 43.114286] >ffff20008cdf8780: fa fa fa fa 00 00 00 00 00 00 00 00 fa fa fa fa
[ 43.121494] ^
[ 43.125230] ffff20008cdf8800: 01 fa fa fa fa fa fa fa 04 fa fa fa fa fa fa fa
[ 43.132439] ffff20008cdf8880: fa fa fa fa fa fa fa fa 00 00 fa fa fa fa fa fa
[ 43.139646] ==================================================================
>
>>>> This patch changes cpumask_of_node to return cpu_none_mask if the
>>>> node is not valid, and sync the cpumask_of_node between the
>>>> cpumask_of_node function in numa.h and numa.c.
>>>
>>> Why?
>>
>> When CONFIG_DEBUG_PER_CPU_MAPS is defined, the cpumask_of_node() in
>> numa.c is used, if not, the cpumask_of_node() in numa.h is used.
>>
>> I am not sure why there is difference between them, and it is there
>> when since the below commit:
>> 1a2db300348b ("arm64, numa: Add NUMA support for arm64 platforms.")
>>
>> I synced them to keep them consistent whether CONFIG_DEBUG_PER_CPU_MAPS
>> is defined.
>
> Such a change should be made in a separate patch with a full
> clarification/justification. From the above it is still not clear why
> this is needed though.
Ok.
How about:
Currently there are different implementations of cpumask_of_node() depend
on the arch, for example:
ia64:
#define cpumask_of_node(node) ((node) == -1 ? \
cpu_all_mask : \
&node_to_cpu_mask[node])
alpha:
static const struct cpumask *cpumask_of_node(int node)
{
int cpu;
if (node == NUMA_NO_NODE)
return cpu_all_mask;
cpumask_clear(&node_to_cpumask_map[node]);
for_each_online_cpu(cpu) {
if (cpu_to_node(cpu) == node)
cpumask_set_cpu(cpu, node_to_cpumask_map[node]);
}
return &node_to_cpumask_map[node];
}
Even for the same arch, there are two implementations depend on the
CONFIG_DEBUG_PER_CPU_MAPS configuration.
arm64/x86 without CONFIG_DEBUG_PER_CPU_MAPS:
static inline const struct cpumask *cpumask_of_node(int node)
{
return node_to_cpumask_map[node];
}
arm64/x86 with CONFIG_DEBUG_PER_CPU_MAPS:
const struct cpumask *cpumask_of_node(int node)
{
if (WARN_ON(node >= nr_node_ids))
return cpu_none_mask;
if (WARN_ON(node_to_cpumask_map[node] == NULL))
return cpu_online_mask;
return node_to_cpumask_map[node];
}
It seems the cpumask_of_node with CONFIG_DEBUG_PER_CPU_MAPS is used
to catch the erorr case and give a warning to user when node id is not
valid.
So in order to be consistent between different arch and with or
without CONFIG_DEBUG_PER_CPU_MAPS case:
node id has to be checked with the below case before returning
node_to_cpumask_map[node]:
1. if nr_node_ids < 0, return cpu_all_mask
2. if nr_node_ids >= nr_node_ids, return cpu_none_mask
3. if node_to_cpumask_map[node] is NULL, return cpu_online_mask
[1] https://uefi.org/sites/default/files/resources/ACPI_6_3_final_Jan30.pdf
>
>>>> Signed-off-by: Yunsheng Lin <linyunsheng@...wei.com>
>>>> ---
>>>> arch/arm64/include/asm/numa.h | 6 ++++++
>>>> arch/arm64/mm/numa.c | 2 +-
>>>> 2 files changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/arm64/include/asm/numa.h b/arch/arm64/include/asm/numa.h
>>>> index 626ad01..da891ed 100644
>>>> --- a/arch/arm64/include/asm/numa.h
>>>> +++ b/arch/arm64/include/asm/numa.h
>>>> @@ -25,6 +25,12 @@ const struct cpumask *cpumask_of_node(int node);
>>>> /* Returns a pointer to the cpumask of CPUs on Node 'node'. */
>>>> static inline const struct cpumask *cpumask_of_node(int node)
>>>> {
>>>> + if (node >= nr_node_ids || node < 0)
>>>> + return cpu_none_mask;
>>>> +
>>>> + if (!node_to_cpumask_map[node])
>>>> + return cpu_online_mask;
>>>> +
>>>> return node_to_cpumask_map[node];
>>>> }
>>>> #endif
>>>> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
>>>> index 4f241cc..3846313 100644
>>>> --- a/arch/arm64/mm/numa.c
>>>> +++ b/arch/arm64/mm/numa.c
>>>> @@ -46,7 +46,7 @@ EXPORT_SYMBOL(node_to_cpumask_map);
>>>> */
>>>> const struct cpumask *cpumask_of_node(int node)
>>>> {
>>>> - if (WARN_ON(node >= nr_node_ids))
>>>> + if (WARN_ON(node >= nr_node_ids || node < 0))
>>>> return cpu_none_mask;
>>>>
>>>> if (WARN_ON(node_to_cpumask_map[node] == NULL))
>>>> --
>>>> 2.8.1
>>>
>
Powered by blists - more mailing lists