Message-ID: <ZOjrYB1UgtRchbXK@agluck-desk3>
Date: Fri, 25 Aug 2023 10:56:48 -0700
From: Tony Luck <tony.luck@...el.com>
To: Reinette Chatre <reinette.chatre@...el.com>
Cc: Fenghua Yu <fenghua.yu@...el.com>,
Peter Newman <peternewman@...gle.com>,
Jonathan Corbet <corbet@....net>,
Shuah Khan <skhan@...uxfoundation.org>, x86@...nel.org,
Shaopeng Tan <tan.shaopeng@...itsu.com>,
James Morse <james.morse@....com>,
Jamie Iles <quic_jiles@...cinc.com>,
Babu Moger <babu.moger@....com>,
Randy Dunlap <rdunlap@...radead.org>,
linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
patches@...ts.linux.dev
Subject: Re: [PATCH v4 7/7] selftests/resctrl: Adjust effective L3 cache size when SNC enabled

On Fri, Aug 11, 2023 at 10:33:43AM -0700, Reinette Chatre wrote:
> Hi Tony,
>
> On 7/22/2023 12:07 PM, Tony Luck wrote:
> > Sub-NUMA Cluster divides CPUs sharing an L3 cache into separate NUMA
> > nodes. Systems may support splitting into either two or four nodes.
> >
> > When SNC mode is enabled the effective amount of L3 cache available
> > for allocation is divided by the number of nodes per L3.
> >
> > Detect which SNC mode is active by comparing the number of CPUs
> > that share a cache with CPU0, with the number of CPUs on node0.
> >
> > Reported-by: "Shaopeng Tan (Fujitsu)" <tan.shaopeng@...itsu.com>
> > Closes: https://lore.kernel.org/r/TYAPR01MB6330B9B17686EF426D2C3F308B25A@TYAPR01MB6330.jpnprd01.prod.outlook.com
>
> This does not seem to be the case when looking at
> https://lore.kernel.org/all/TYAPR01MB6330A4EB3633B791939EA45E8B39A@TYAPR01MB6330.jpnprd01.prod.outlook.com/

Correct. I'll drop the "Closes:" tag. I'm not sure what
the status is. Shaopeng didn't respond to my suggestion
to try "taskset(1)" when running the tests to check if
NUMA effects are causing the test to fail.
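
For reference, the same kind of pinning can be done from inside a
test program rather than with taskset(1). The following is only a
minimal sketch, assuming Linux with glibc; the helper names
first_allowed_cpu() and pin_to_cpu() are hypothetical and are not
part of the resctrl selftests.

```c
#define _GNU_SOURCE		/* needed for cpu_set_t and the CPU_* macros */
#include <assert.h>
#include <sched.h>

/* Return the lowest-numbered CPU this process may run on, or -1 on error. */
int first_allowed_cpu(void)
{
	cpu_set_t set;
	int cpu;

	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return -1;
	for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &set))
			return cpu;
	return -1;
}

/* Restrict the calling thread to a single CPU; 0 on success, -1 on error. */
int pin_to_cpu(int cpu)
{
	cpu_set_t set;

	if (cpu < 0 || cpu >= CPU_SETSIZE)
		return -1;
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(0, sizeof(set), &set);
}
```

This only controls where the code runs. It does not control which
node the memory is allocated from.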
>
> > Signed-off-by: Tony Luck <tony.luck@...el.com>
> > ---
> > tools/testing/selftests/resctrl/resctrl.h | 1 +
> > tools/testing/selftests/resctrl/resctrlfs.c | 57 +++++++++++++++++++++
> > 2 files changed, 58 insertions(+)
> >
> > diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> > index 87e39456dee0..a8b43210b573 100644
> > --- a/tools/testing/selftests/resctrl/resctrl.h
> > +++ b/tools/testing/selftests/resctrl/resctrl.h
> > @@ -13,6 +13,7 @@
> > #include <signal.h>
> > #include <dirent.h>
> > #include <stdbool.h>
> > +#include <ctype.h>
> > #include <sys/stat.h>
> > #include <sys/ioctl.h>
> > #include <sys/mount.h>
> > diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
> > index fb00245dee92..79eecbf9f863 100644
> > --- a/tools/testing/selftests/resctrl/resctrlfs.c
> > +++ b/tools/testing/selftests/resctrl/resctrlfs.c
> > @@ -130,6 +130,61 @@ int get_resource_id(int cpu_no, int *resource_id)
> > return 0;
> > }
> >
> > +/*
> > + * Count number of CPUs in a /sys bit map
> > + */
> > +static int count_sys_bitmap_bits(char *name)
> > +{
> > + FILE *fp = fopen(name, "r");
> > + int count = 0, c;
> > +
> > + if (!fp)
> > + return 0;
> > +
> > + while ((c = fgetc(fp)) != EOF) {
> > + if (!isxdigit(c))
> > + continue;
> > + switch (c) {
> > + case 'f':
> > + count++;
> > + case '7': case 'b': case 'd': case 'e':
> > + count++;
> > + case '3': case '5': case '6': case '9': case 'a': case 'c':
> > + count++;
> > + case '1': case '2': case '4': case '8':
> > + count++;
> > + }
> > + }
> > + fclose(fp);
> > +
> > + return count;
> > +}
> > +
> > +/*
> > + * Detect SNC by comparing #CPUs in node0 with #CPUs sharing LLC with CPU0
> > + * Try to get this right, even if a few CPUs are offline so that the number
> > + * of CPUs in node0 is not exactly half or a quarter of the CPUs sharing the
> > + * LLC of CPU0.
> > + */
> > +static int snc_ways(void)
> > +{
> > + int node_cpus, cache_cpus;
> > +
> > + node_cpus = count_sys_bitmap_bits("/sys/devices/system/node/node0/cpumap");
> > + cache_cpus = count_sys_bitmap_bits("/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_map");
> > +
> > + if (!node_cpus || !cache_cpus) {
> > + fprintf(stderr, "Warning could not determine Sub-NUMA Cluster mode\n");
> > + return 1;
> > + }
> > +
> > + if (4 * node_cpus >= cache_cpus)
> > + return 4;
> > + else if (2 * node_cpus >= cache_cpus)
> > + return 2;
> > + return 1;
> > +}
> > +
> > /*
> > * get_cache_size - Get cache size for a specified CPU
> > * @cpu_no: CPU number
> > @@ -190,6 +245,8 @@ int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size)
> > break;
> > }
> >
> > + if (cache_num == 3)
> > + *cache_size /= snc_ways();
> > return 0;
> > }
> >
>
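
To make the detection arithmetic easier to follow, here is a
standalone sketch of the idea, not a proposed revision of the patch.
The function names and the 1.5x/3x cut-offs are assumptions made for
illustration. Note that with the checks as quoted above,
"4 * node_cpus >= cache_cpus" also holds when node0 contains every
CPU that shares the L3, so the sketch compares in the other
direction.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Count set bits in a sysfs-style hex CPU mask such as "ffff,0000ffff". */
int count_mask_bits(const char *mask)
{
	int count = 0;

	assert(mask != NULL);
	for (; *mask; mask++) {
		unsigned char c = (unsigned char)*mask;
		int v;

		if (!isxdigit(c))
			continue;	/* skip ',' separators and the newline */
		v = isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;
		for (; v; v >>= 1)
			count += v & 1;
	}
	return count;
}

/* Read a mask file and count its bits; returns 0 if it cannot be read. */
int count_mask_file_bits(const char *path)
{
	char buf[4096];
	FILE *fp = fopen(path, "r");
	size_t n;

	if (!fp)
		return 0;
	n = fread(buf, 1, sizeof(buf) - 1, fp);
	fclose(fp);
	buf[n] = '\0';
	return count_mask_bits(buf);
}

/*
 * Map the two CPU counts to 1, 2 or 4 nodes per L3. The 1.5x and 3x
 * cut-offs are assumed values, chosen to tolerate a few offline CPUs.
 */
int snc_ways_from_counts(int node_cpus, int cache_cpus)
{
	if (node_cpus <= 0 || cache_cpus <= 0)
		return 1;	/* unknown: behave as if SNC is off */
	if (cache_cpus >= 3 * node_cpus)
		return 4;
	if (2 * cache_cpus >= 3 * node_cpus)
		return 2;
	return 1;
}

/* L3 capacity available for allocation within one SNC node (ways >= 1). */
unsigned long effective_l3_size(unsigned long l3_size, int ways)
{
	return l3_size / (unsigned long)ways;
}
```

For example, 16 CPUs on node0 and 32 CPUs sharing the L3 with CPU0
gives 2, and a 32 MB L3 is then treated as 16 MB.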
> I am surprised that this small change is sufficient. The resctrl
> selftests are definitely not NUMA aware and the CAT and CMT tests
> are not taking that into account when picking CPUs to run on. From
> what I understand LLC occupancy counters need to be added in this
> scenario but I do not see that done either.

This is a first step (the tests are definitely going to fail if
they have incorrect information about the cache size).
For a fully reliable set of tests, some major surgery will be required
to bind the tests to specific CPUs and memory so that allocation and
access are controlled.
>
> Reinette

-Tony