Message-ID: <00ec47f1-194b-4d85-8c8b-3200b918e1d3@nvidia.com>
Date: Fri, 5 Dec 2025 10:53:00 -0800
From: Fenghua Yu <fenghuay@...dia.com>
To: Xiaochen Shen <shenxiaochen@...n-hieco.net>, tony.luck@...el.com,
reinette.chatre@...el.com, bp@...en8.de, shuah@...nel.org,
skhan@...uxfoundation.org
Cc: babu.moger@....com, james.morse@....com, Dave.Martin@....com,
x86@...nel.org, linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org
Subject: Re: [PATCH v2 2/3] selftests/resctrl: Fix a division by zero error on Hygon
Hi, Xiaochen,
On 12/5/25 01:25, Xiaochen Shen wrote:
> Commit
>
> a1cd99e700ec ("selftests/resctrl: Adjust effective L3 cache size with SNC enabled")
>
> introduced the snc_nodes_per_l3_cache() function to detect the Intel
> Sub-NUMA Clustering (SNC) feature by comparing #CPUs in node0 with #CPUs
> sharing LLC with CPU0. The function was designed to return:
> (1) >1: SNC mode is enabled.
> (2) 1: SNC mode is not enabled or not supported.
>
> However, on certain Hygon CPUs, #CPUs sharing LLC with CPU0 is actually
> less than #CPUs in node0. This results in snc_nodes_per_l3_cache()
> returning 0 (calculated as cache_cpus / node_cpus).
>
> This leads to a division by zero error in get_cache_size():
> *cache_size /= snc_nodes_per_l3_cache();
>
> Causing the resctrl selftest to fail with:
> "Floating point exception (core dumped)"
>
> Fix the issue by ensuring snc_nodes_per_l3_cache() returns 1 when SNC
> mode is not supported on the platform.
>
> Fixes: a1cd99e700ec ("selftests/resctrl: Adjust effective L3 cache size with SNC enabled")
> Signed-off-by: Xiaochen Shen <shenxiaochen@...n-hieco.net>
> Reviewed-by: Reinette Chatre <reinette.chatre@...el.com>
> ---
> tools/testing/selftests/resctrl/resctrlfs.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
> index 195f04c4d158..2b075e7334bf 100644
> --- a/tools/testing/selftests/resctrl/resctrlfs.c
> +++ b/tools/testing/selftests/resctrl/resctrlfs.c
> @@ -243,6 +243,16 @@ int snc_nodes_per_l3_cache(void)
> }
> snc_mode = cache_cpus / node_cpus;
>
> + /*
> + * On certain Hygon platforms:
Nit: this situation could also happen on platforms other than Hygon. Maybe
it's better to have a more generic comment here?
* On some platforms (e.g. Hygon),
Reviewed-by: Fenghua Yu <fenghuay@...dia.com>
> + * cache_cpus < node_cpus, the calculated snc_mode is 0.
> + *
> + * Set snc_mode = 1 to indicate that SNC mode is not
> + * supported on the platform.
> + */
> + if (!snc_mode)
> + snc_mode = 1;
> +
> if (snc_mode > 1)
> ksft_print_msg("SNC-%d mode discovered.\n", snc_mode);
> }
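For reference, the arithmetic behind the failure can be sketched in isolation.
This is only an illustration, not the selftest code: the helper names
(snc_ratio_raw, snc_ratio_clamped, effective_cache_size) and the CPU counts
used with them are hypothetical. Only the integer division and the
"if (!snc_mode)" clamp mirror the patch.

```c
#include <assert.h>

/*
 * Hypothetical helper: the unclamped ratio, as computed before the fix.
 * Integer division truncates, so cache_cpus < node_cpus yields 0.
 * Assumes node_cpus > 0.
 */
static int snc_ratio_raw(int cache_cpus, int node_cpus)
{
	return cache_cpus / node_cpus;
}

/*
 * Hypothetical helper: the same ratio with the clamp from the patch.
 * A result of 0 is treated as "SNC not supported" and reported as 1.
 */
static int snc_ratio_clamped(int cache_cpus, int node_cpus)
{
	int snc_mode = snc_ratio_raw(cache_cpus, node_cpus);

	if (!snc_mode)
		snc_mode = 1;

	return snc_mode;
}

/*
 * Hypothetical stand-in for the division done in get_cache_size().
 * With snc_mode == 0 this division is what raises SIGFPE, so the
 * sketch asserts on it instead of dividing blindly.
 */
static unsigned long effective_cache_size(unsigned long cache_size, int snc_mode)
{
	assert(snc_mode > 0);
	return cache_size / (unsigned long)snc_mode;
}
```

With made-up counts of 8 CPUs sharing the LLC and 16 CPUs in node0, the raw
ratio is 0 and the clamped ratio is 1, so the cache size is left unchanged.
With 64 and 32 (an SNC-2 style layout), both helpers return 2.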
Thanks.
-Fenghua