Message-ID: <edc31538-3aa1-a35a-ea67-b13f626a84ec@amd.com>
Date: Fri, 7 Jun 2024 13:16:22 -0500
From: "Moger, Babu" <bmoger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, babu.moger@....com,
fenghua.yu@...el.com, shuah@...nel.org
Cc: linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org,
ilpo.jarvinen@...ux.intel.com, maciej.wieczor-retman@...el.com,
peternewman@...gle.com, eranian@...gle.com
Subject: Re: [PATCH] selftests/resctrl: Fix noncont_cat_run_test for AMD
Hi Reinette,
On 6/6/2024 6:58 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 6/6/24 4:09 PM, Moger, Babu wrote:
>> Hi Reinette,
>>
>>
>> On 6/6/2024 3:33 PM, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 6/5/24 2:36 PM, Babu Moger wrote:
>>>> The selftest noncont_cat_run_test fails on AMD with the warnings.
>>>> Reason
>>>> is, AMD supports non contiguous CBM masks but does not report it via
>>>> CPUID.
>>>>
>>>> Update noncont_cat_run_test to check for the vendor when verifying
>>>> CPUID.
>>>>
>>>> Fixes: ae638551ab64 ("selftests/resctrl: Add non-contiguous CBMs CAT
>>>> test")
>>>> Signed-off-by: Babu Moger <babu.moger@....com>
>>>> ---
>>>> This was part of the series
>>>> https://lore.kernel.org/lkml/cover.1708637563.git.babu.moger@amd.com/
>>>> Sending this as a separate fix per review comments.
>>>> ---
>>>> tools/testing/selftests/resctrl/cat_test.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/tools/testing/selftests/resctrl/cat_test.c
>>>> b/tools/testing/selftests/resctrl/cat_test.c
>>>> index d4dffc934bc3..b2988888786e 100644
>>>> --- a/tools/testing/selftests/resctrl/cat_test.c
>>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>>>> @@ -308,7 +308,7 @@ static int noncont_cat_run_test(const struct
>>>> resctrl_test *test,
>>>> else
>>>> return -EINVAL;
>>>> - if (sparse_masks != ((ecx >> 3) & 1)) {
>>>> + if ((get_vendor() == ARCH_INTEL) && sparse_masks != ((ecx >> 3)
>>>> & 1)) {
>>>> ksft_print_msg("CPUID output doesn't match 'sparse_masks'
>>>> file content!\n");
>>>> return 1;
>>>> }
>>>
>>> Since AMD does not report this support via CPUID it does not seem
>>> appropriate to use CPUID at all on AMD when doing the hardware check.
>>> I think the above check makes it difficult to understand what is
>>> different
>>> on AMD.
>>>
>>> What if instead there is a new function, for example,
>>> "static bool arch_supports_noncont_cat(const struct resctrl_test *test)"
>>> that returns true if the hardware supports non-contiguous CBM?
>>
>> Sure.
>>
>>>
>>> The vendor check can be in there to make it obvious what is going on:
>>>
>>> /* AMD always supports non-contiguous CBM. */
>>> if (get_vendor() == AMD)
>>> return true;
>>>
>>> /* CPUID check for Intel here. */
>>>
>>> The "sparse_masks" from kernel can then be checked against
>>> hardware support with an appropriate (no mention of CPUID)
>>> error message if this fails.
>>>
>>
>> Something like this?
>>
>>
>> diff --git a/tools/testing/selftests/resctrl/cat_test.c
>> b/tools/testing/selftests/resctrl/cat_test.c
>> index d4dffc934bc3..b75d220f29f6 100644
>> --- a/tools/testing/selftests/resctrl/cat_test.c
>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>> @@ -288,11 +288,30 @@ static int cat_run_test(const struct
>> resctrl_test *test, const struct user_param
>> return ret;
>> }
>>
>> +static bool arch_supports_noncont_cat(const struct resctrl_test *test)
>> +{
>> + unsigned int eax, ebx, ecx, edx;
>> +
>> + /* AMD always supports non-contiguous CBM. */
>> + if (get_vendor() == ARCH_AMD) {
>> + return true;
>> + } else {
>
> The else can be dropped since it follows a return.
Sure
> The rest of the code can be prefixed with a matching
> comment like:
> /* Intel support for non-contiguous CBM needs to be discovered. */
Sure
>
> (please feel free to improve)
>
>> + if (!strcmp(test->resource, "L3"))
>> + __cpuid_count(0x10, 1, eax, ebx, ecx, edx);
>> + else if (!strcmp(test->resource, "L2"))
>> + __cpuid_count(0x10, 2, eax, ebx, ecx, edx);
>> + else
>> + return false;
>> +
>> + return ((ecx >> 3) & 1);
>> + }
>> +}
>> +
>> static int noncont_cat_run_test(const struct resctrl_test *test,
>> const struct user_params *uparams)
>> {
>> unsigned long full_cache_mask, cont_mask, noncont_mask;
>> - unsigned int eax, ebx, ecx, edx, sparse_masks;
>> + unsigned int sparse_masks;
>> int bit_center, ret;
>> char schemata[64];
>>
>> @@ -301,15 +320,8 @@ static int noncont_cat_run_test(const struct
>> resctrl_test *test,
>> if (ret)
>> return ret;
>>
>> - if (!strcmp(test->resource, "L3"))
>> - __cpuid_count(0x10, 1, eax, ebx, ecx, edx);
>> - else if (!strcmp(test->resource, "L2"))
>> - __cpuid_count(0x10, 2, eax, ebx, ecx, edx);
>> - else
>> - return -EINVAL;
>> -
>> - if (sparse_masks != ((ecx >> 3) & 1)) {
>> - ksft_print_msg("CPUID output doesn't match
>> 'sparse_masks' file content!\n");
>> + if (!(arch_supports_noncont_cat(test) && sparse_masks)) {
>> + ksft_print_msg("Hardware does not support
>> non-contiguous CBM!\n");
>
> Please fix the test as well as the message. It is not an error if
> hardware does not support non-contiguous CBM. It is an error if the
> hardware and kernel disagree on whether non-contiguous CBM is supported.
I am not sure I understand this comment.
Did you mean something like this?
	if (!arch_supports_noncont_cat(test)) {
		ksft_print_msg("Hardware does not support non-contiguous CBM!\n");
		return 0;
	} else if (arch_supports_noncont_cat(test) && !sparse_masks) {
		ksft_print_msg("Hardware and kernel support for non-contiguous CBM does not match!\n");
		return 1;
	}
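
Another way to read the request is a symmetric comparison that fails only when hardware and kernel disagree, in either direction. Below is a minimal standalone sketch of that reading, not the actual selftest code. The enum, both function names, and passing ECX in as a parameter are hypothetical stand-ins for get_vendor() and the __cpuid_count() call, so that the logic can be compiled outside the kselftest tree:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the vendor IDs returned by get_vendor(). */
enum vendor { VENDOR_INTEL, VENDOR_AMD };

/*
 * Hypothetical, simplified form of arch_supports_noncont_cat().
 * 'cpuid_ecx' stands in for the ECX value that the selftest reads
 * from CPUID leaf 0x10 (subleaf 1 for L3, subleaf 2 for L2).
 */
static bool hw_supports_noncont(enum vendor v, unsigned int cpuid_ecx)
{
	/* AMD always supports non-contiguous CBM. */
	if (v == VENDOR_AMD)
		return true;

	/* Intel support for non-contiguous CBM is discovered via bit 3 of ECX. */
	return (cpuid_ecx >> 3) & 1;
}

/*
 * Return 0 when hardware and kernel agree, 1 when they disagree.
 * 'sparse_masks' is the value read from the resctrl 'sparse_masks' file.
 */
static int check_noncont_consistency(enum vendor v, unsigned int cpuid_ecx,
				     unsigned int sparse_masks)
{
	bool hw = hw_supports_noncont(v, cpuid_ecx);
	bool kernel = sparse_masks != 0;

	if (hw != kernel) {
		printf("Hardware and kernel differ on non-contiguous CBM support!\n");
		return 1;
	}

	return 0;
}
```

With this shape, hardware without non-contiguous CBM support and sparse_masks == 0 is treated as agreement rather than as a failure, and the error message no longer mentions CPUID.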
--
- Babu Moger