[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <TYAPR01MB63307F1C04270DD2D09ACF238B0B9@TYAPR01MB6330.jpnprd01.prod.outlook.com>
Date: Thu, 10 Mar 2022 08:18:50 +0000
From: "tan.shaopeng@...itsu.com" <tan.shaopeng@...itsu.com>
To: 'Shuah Khan' <skhan@...uxfoundation.org>,
Fenghua Yu <fenghua.yu@...el.com>,
Reinette Chatre <reinette.chatre@...el.com>,
Shuah Khan <shuah@...nel.org>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>
Subject: RE: [PATCH v3] selftests/resctrl: Print a message if the result of
MBM&CMT tests is failed on Intel cpu
Hi Shuah,
> On 3/4/22 3:39 AM, Shaopeng Tan wrote:
> > According to "Intel Resource Director Technology (Intel RDT) on 2nd
> > Generation Intel Xeon Scalable Processors Reference Manual", When the
> > Intel Sub-NUMA Clustering(SNC) feature is enabled, Intel CMT and MBM
> > counters may not be accurate.
> >
> > However, there does not seem to be an architectural way to detect if
> > SNC is enabled.
> >
> > If the result of MBM&CMT test fails on Intel CPU, print a message to
> > let users know a possible cause of failure.
> >
> > Signed-off-by: Shaopeng Tan <tan.shaopeng@...fujitsu.com>
> > ---
> > Hello,
> >
> > In PATCH V2, I tried to detect whether SNC is enabled by NUMA info and
> > cpuinfo(socket_num), but it is not reliable and no future-proof.
> >
> > I just print a message to let users know a possible cause of "not ok",
> > When CMT or MBM test runs on Intel CPU, and the result is "not ok".
> >
> > This patch is based on v5.16.
>
> Also need to be rebased on mainline latest
I will rebased on mainline latest in next version.
> >
> > tools/testing/selftests/resctrl/resctrl_tests.c | 12 +++++++++---
> > 1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c
> > b/tools/testing/selftests/resctrl/resctrl_tests.c
> > index 973f09a66e1e..ec2bdce7b85f 100644
> > --- a/tools/testing/selftests/resctrl/resctrl_tests.c
> > +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
> > @@ -14,8 +14,9 @@
> > #define BENCHMARK_ARG_SIZE 64
> >
> > bool is_amd;
> > +bool is_intel;
> >
>
> Why is this a global? I am not seeing a reason. These detect_*()s could be
> moved to resctrl.h and get rid of globals.
>
> Instead of adding intel check to detect_amd() add detect_intel() or is_intel()
> and have ut return true of it detects intel.
"is_amd" and "is_intel" are called many times,
in this way, detect_vendor is called only once.
Do you mean in the following way?
"/proc/cpuinfo" will be open/closed many times,
since is_amd()/is_intel() is called in serval places.
So, I think current is better. Do you have any thoughts.
----
bool detect_vendor_str(const char *vendor_str )
{
FILE *inf = fopen("/proc/cpuinfo", "r");
char *res;
bool found = false;
if (!inf)
return found;
res = fgrep(inf, "vendor_id");
if (res) {
char *s = strchr(res, ':');
found = s && !strcmp(s,vendor_str);
free(res);
}
fclose(inf);
return found;
}
bool is_amd()
{
return detect_vendor_str(": AuthenticAMD\n");
}
bool is_intel()
{
return detect_vendor_str(": GenuineIntel\n");
}
----
> > -void detect_amd(void)
> > +void detect_vendor(void)
> > {
> > FILE *inf = fopen("/proc/cpuinfo", "r");
> > char *res;
> > @@ -29,6 +30,7 @@ void detect_amd(void)
> > char *s = strchr(res, ':');
> >
> > is_amd = s && !strcmp(s, ": AuthenticAMD\n");
> > + is_intel = s && !strcmp(s, ": GenuineIntel\n");
> > free(res);
> > }
> > fclose(inf);
> > @@ -70,6 +72,8 @@ static void run_mbm_test(bool has_ben, char
> **benchmark_cmd, int span,
> > sprintf(benchmark_cmd[5], "%s", MBA_STR);
> > res = mbm_bw_change(span, cpu_no, bw_report, benchmark_cmd);
> > ksft_test_result(!res, "MBM: bw change\n");
> > + if (is_intel && res)
> > + ksft_print_msg("Intel CMT and MBM counters may be
> inaccurate when
> > +Sub-NUMA Clustering (SNC) is enabled. Ensure SNC is disabled in the
> > +BIOS if this system supports SNC.\n");
>
> This message is rather long. Please make it concise.
I will use the following message in next version.
"Intel MBM may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration."
> > mbm_test_cleanup();
> > }
> >
> > @@ -106,6 +110,8 @@ static void run_cmt_test(bool has_ben, char
> **benchmark_cmd, int cpu_no)
> > sprintf(benchmark_cmd[5], "%s", CMT_STR);
> > res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd);
> > ksft_test_result(!res, "CMT: test\n");
> > + if (is_intel && res)
> > + ksft_print_msg("Intel CMT and MBM counters may be
> inaccurate when
> > +Sub-NUMA Clustering (SNC) is enabled. Ensure SNC is disabled in the
> > +BIOS if this system supports SNC.\n");
>
> This message is rather long. Please make it concise.
"Intel CMT may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration."
> > cmt_test_cleanup();
> > }
> >
> > @@ -207,8 +213,8 @@ int main(int argc, char **argv)
> > if (geteuid() != 0)
> > return ksft_exit_fail_msg("Not running as root, abort
> > testing.\n");
> >
> > - /* Detect AMD vendor */
> > - detect_amd();
> > + /* Detect AMD/INTEL vendor */
> > + detect_vendor();
> >
> > if (has_ben) {
> > /* Extract benchmark command from command line. */
> >
Best regards,
Tan Shaopeng
Powered by blists - more mailing lists