[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <lzt657mqtw5ga2eaz6ou3bb4q3wxquc52ist34k5dn5rxo54av@3hr55qnyouyz>
Date: Tue, 27 Aug 2024 08:45:44 +0200
From: Maciej Wieczor-Retman <maciej.wieczor-retman@...el.com>
To: Reinette Chatre <reinette.chatre@...el.com>
CC: <fenghua.yu@...el.com>, <shuah@...nel.org>,
<linux-kselftest@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<ilpo.jarvinen@...ux.intel.com>, <tony.luck@...el.com>
Subject: Re: [PATCH v4 1/2] selftests/resctrl: Adjust effective L3 cache size
with SNC enabled
Hi, thanks for the review,
On 2024-08-12 at 16:40:00 -0700, Reinette Chatre wrote:
>Hi Maciej,
>
>On 7/12/24 2:04 AM, Maciej Wieczor-Retman wrote:
>> Sub-NUMA Cluster divides CPUs sharing an L3 cache into separate NUMA
>> nodes. Systems may support splitting into either two, three or four
>> nodes.
>>
>> When SNC mode is enabled the effective amount of L3 cache available
>> for allocation is divided by the number of nodes per L3.
>>
>> Detect which SNC mode is active by comparing the number of CPUs
>> that share a cache with CPU0, with the number of CPUs on node0.
>>
>> Signed-off-by: Tony Luck <tony.luck@...el.com>
>> Co-developed-by: Maciej Wieczor-Retman <maciej.wieczor-retman@...el.com>
>> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@...el.com>
>
>Since you are "From:" author there is no need for a "Co-developed-by"
>for you. Tony does need one. Please check: "When to use Acked-by:, Cc:,
>and Co-developed-by:" in Documentation/process/submitting-patches.rst
>(checkpatch.pl also warns about this).
Thanks, I changed patch's author at some point and I think I forgot to change
the tags.
>
>> ---
>> Changelog v4:
>> - Make returned value a static local variable so the function only runs
>> the logic once. (Reinette)
>>
>> Changelog v3:
>> - Add comparison between present and online cpus to test if the
>> calculated SNC mode is credible. (Reinette)
>> - Added comment to cache size modification to better explain why it is
>> needed there. (Reinette)
>> - Fix facts in patch message. (Reinette)
>> - Change snc_ways() to snc_nodes_per_l3_cache(). (Reinette)
>>
>> tools/testing/selftests/resctrl/resctrl.h | 4 ++
>> tools/testing/selftests/resctrl/resctrlfs.c | 73 +++++++++++++++++++++
>> 2 files changed, 77 insertions(+)
>>
>> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
>> index 2dda56084588..851b37c9c38a 100644
>> --- a/tools/testing/selftests/resctrl/resctrl.h
>> +++ b/tools/testing/selftests/resctrl/resctrl.h
>> @@ -11,6 +11,7 @@
>> #include <signal.h>
>> #include <dirent.h>
>> #include <stdbool.h>
>> +#include <ctype.h>
>> #include <sys/stat.h>
>> #include <sys/ioctl.h>
>> #include <sys/mount.h>
>> @@ -43,6 +44,8 @@
>> #define DEFAULT_SPAN (250 * MB)
>> +#define MAX_SNC 4
>> +
>> /*
>> * user_params: User supplied parameters
>> * @cpu: CPU number to which the benchmark will be bound to
>> @@ -120,6 +123,7 @@ extern volatile int *value_sink;
>> extern char llc_occup_path[1024];
>> +int snc_nodes_per_l3_cache(void);
>> int get_vendor(void);
>> bool check_resctrlfs_support(void);
>> int filter_dmesg(void);
>> diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
>> index 250c320349a7..803dd415984c 100644
>> --- a/tools/testing/selftests/resctrl/resctrlfs.c
>> +++ b/tools/testing/selftests/resctrl/resctrlfs.c
>> @@ -156,6 +156,68 @@ int get_domain_id(const char *resource, int cpu_no, int *domain_id)
>> return 0;
>> }
>> +/*
>> + * Count number of CPUs in a /sys bit map
>
>"bit map" -> "bitmap"
Will do.
>
>> + */
>> +static unsigned int count_sys_bitmap_bits(char *name)
>> +{
>> + FILE *fp = fopen(name, "r");
>> + int count = 0, c;
>> +
>> + if (!fp)
>> + return 0;
>> +
>> + while ((c = fgetc(fp)) != EOF) {
>> + if (!isxdigit(c))
>> + continue;
>> + switch (c) {
>> + case 'f':
>> + count++;
>> + case '7': case 'b': case 'd': case 'e':
>> + count++;
>> + case '3': case '5': case '6': case '9': case 'a': case 'c':
>> + count++;
>> + case '1': case '2': case '4': case '8':
>> + count++;
>> + }
>> + }
>> + fclose(fp);
>> +
>> + return count;
>> +}
>> +
>> +/*
>> + * Detect SNC by comparing #CPUs in node0 with #CPUs sharing LLC with CPU0.
>> + * If some CPUs are offline the numbers may not be exact multiples of each
>> + * other. Any offline CPUs on node0 will be also gone from shared_cpu_map of
>> + * CPU0 but offline CPUs from other nodes will only make the cache_cpus value
>> + * lower. Still try to get the ratio right by preventing the second possibility.
>
>This all seems unnecessary since the next patch sets snc_mode to 1 if there
>are any offline CPUs.
>Seems more appropriate to move the offline CPU handling to this patch.
Okay, I'll move it here.
>
>> + */
>> +int snc_nodes_per_l3_cache(void)
>> +{
>> + int node_cpus, cache_cpus, i;
>> + static int snc_mode;
>> +
>> + if (!snc_mode) {
>> + node_cpus = count_sys_bitmap_bits("/sys/devices/system/node/node0/cpumap");
>> + cache_cpus = count_sys_bitmap_bits("/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_map");
>> +
>> + if (!node_cpus || !cache_cpus) {
>> + ksft_print_msg("Could not determine Sub-NUMA Cluster mode.\n");
>> + return 1;
>> + }
>> +
>> + for (i = 1; i <= MAX_SNC ; i++) {
>
>(nit: unnecessary space)
Will fix.
>
>> + if (i * node_cpus >= cache_cpus) {
>> + snc_mode = i;
>> + break;
>> + }
>
>This is irrelevant after the subsequent patch but note that there are scenarios
>where above loop cannot set snc_mode and the function will thus return 0 since
>snc_mode is not initialized. This is a problem in division done by following hunk.
Oh right, I'll set initial value to 1.
>
>> + }
>> + }
>> +
>> + return snc_mode;
>> +}
>> +
>> /*
>> * get_cache_size - Get cache size for a specified CPU
>> * @cpu_no: CPU number
>> @@ -211,6 +273,17 @@ int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size
>> break;
>> }
>> + /*
>> + * The amount of cache represented by each bit in the masks
>> + * in the schemata file is reduced by a factor equal to SNC
>> + * nodes per L3 cache.
>> + * E.g. on a SNC-2 system with a 100MB L3 cache a test that
>> + * allocates memory from its local SNC node (default behavior
>> + * without using libnuma) will only see 50 MB llc_occupancy
>> + * with a fully populated L3 mask in the schemata file.
>> + */
>> + if (cache_num == 3)
>> + *cache_size /= snc_nodes_per_l3_cache();
>> return 0;
>> }
>
>Reinette
--
Kind regards
Maciej Wieczór-Retman
Powered by blists - more mailing lists