linux-kernel - Re: [PATCH v3 5/5] selftests/resctrl: Add non-contiguous CBMs CAT test

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <jbgkiwfkotntcdzhwf27dceit4w3j37sumkayixmyuee6zyaqg@un22a2iljgmr>
Date: Wed, 31 Jan 2024 13:55:16 +0100
From: Maciej Wieczor-Retman <maciej.wieczor-retman@...el.com>
To: Reinette Chatre <reinette.chatre@...el.com>
CC: <fenghua.yu@...el.com>, <shuah@...nel.org>,
	<linux-kernel@...r.kernel.org>, <linux-kselftest@...r.kernel.org>,
	<ilpo.jarvinen@...ux.intel.com>
Subject: Re: [PATCH v3 5/5] selftests/resctrl: Add non-contiguous CBMs CAT
 test

Hi!

On 2024-01-26 at 13:10:18 -0800, Reinette Chatre wrote:
>Hi Maciej,
>
>On 1/25/2024 3:13 AM, Maciej Wieczor-Retman wrote:
>> Add tests for both L2 and L3 CAT to verify the return values
>> generated by writing non-contiguous CBMs don't contradict the
>> reported non-contiguous support information.
>> 
>> Use a logical XOR to confirm return value of write_schemata() and
>> non-contiguous CBMs support information match.
>> 
>> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@...el.com>
>> ---
>> Changelog v3:
>> - Roll back __cpuid_count part. (Reinette)
>> - Update function name to read sparse_masks file.
>> - Roll back get_cache_level() changes.
>> - Add ksft_print_msg() to contiguous schemata write error handling
>>   (Reinette).
>> 
>> Changelog v2:
>> - Redo the patch message. (Ilpo)
>> - Tidy up __cpuid_count calls. (Ilpo)
>> - Remove redundant AND in noncont_mask calculations (Ilpo)
>> - Fix bit_center offset.
>> - Add newline before function return. (Ilpo)
>> - Group non-contiguous tests with CAT tests. (Ilpo)
>> - Use a helper for reading sparse_masks file. (Ilpo)
>> - Make get_cache_level() available in other source files. (Ilpo)
>> 
>>  tools/testing/selftests/resctrl/cat_test.c    | 81 +++++++++++++++++++
>>  tools/testing/selftests/resctrl/resctrl.h     |  2 +
>>  .../testing/selftests/resctrl/resctrl_tests.c |  2 +
>>  3 files changed, 85 insertions(+)
>> 
>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>> index 39fc9303b8e8..9086bf359072 100644
>> --- a/tools/testing/selftests/resctrl/cat_test.c
>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>> @@ -294,6 +294,71 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
>>  	return ret;
>>  }
>>  
>> +static int noncont_cat_run_test(const struct resctrl_test *test,
>> +				const struct user_params *uparams)
>> +{
>> +	unsigned long full_cache_mask, cont_mask, noncont_mask;
>> +	unsigned int eax, ebx, ecx, edx, ret, sparse_masks;
>> +	char schemata[64];
>> +	int bit_center;
>> +
>> +	/* Check to compare sparse_masks content to CPUID output. */
>> +	ret = resource_info_unsigned_get(test->resource, "sparse_masks", &sparse_masks);
>> +	if (ret)
>> +		return ret;
>> +
>> +	if (!strcmp(test->resource, "L3"))
>> +		__cpuid_count(0x10, 1, eax, ebx, ecx, edx);
>> +	else if (!strcmp(test->resource, "L2"))
>> +		__cpuid_count(0x10, 2, eax, ebx, ecx, edx);
>> +	else
>> +		return -EINVAL;
>> +
>> +	if (sparse_masks != ((ecx >> 3) & 1)) {
>> +		ksft_print_msg("CPUID output doesn't match 'sparse_masks' file content!\n");
>> +		return -1;
>
>If I understand correctly this falls into the "test failure" [1] category
>and should return 1? ...
>
>> +	}
>> +
>> +	/* Write checks initialization. */
>> +	ret = get_full_cbm(test->resource, &full_cache_mask);
>> +	if (ret < 0)
>> +		return ret;
>> +	bit_center = count_bits(full_cache_mask) / 2;
>> +	cont_mask = full_cache_mask >> bit_center;
>> +
>> +	/* Contiguous mask write check. */
>> +	snprintf(schemata, sizeof(schemata), "%lx", cont_mask);
>> +	ret = write_schemata("", schemata, uparams->cpu, test->resource);
>> +	if (ret) {
>> +		ksft_print_msg("Write of contiguous CBM failed\n");
>> +		return ret;
>
>... although here I think the goal to distinguish between test error and test failure
>falls apart since it is not possible to tell within the test if the failure is
>because of error in the test or if test failed.

Is there even a distinction between test error and failure in resctrl selftest?
I've been looking at it for a while and can't find any instances where
ksft_test_result_error() would be used. Everywhere I look it's either pass or
fail. By grep-ing over all selftests I found only five tests that use
ksft_test_result_error().

Furthermore there is this one "TODO" in kselftests.h:

	/* TODO: how does "error" differ from "fail" or "skip"? */

If you meant the distintion less literally then I'd say the sparse_masks
comparison to CPUID would be a failure. What I had in mind is that it tries to
validate a resctrl interface relevant to non-contiguous CBMs. If it fails
there is probably something wrong with the code concerning non-contiguous CBMs.
On the other hand writing contiguous CBMs shouldn't fail as far as the
non-contiguous CBMs in CAT test is concerned. So if that fails there might be
something wrong on a higher level and I'd say that can be more of an error than
a failure.

But I'm just saying how I undestood it so far. If there is some clear
distinction between error and failure definitions I could try to separate it
more explicitly.

>
>Reinette
>
>[1] https://lore.kernel.org/all/33787043-5823-6de4-4e5c-a24a136ba541@linux.intel.com/
>

-- 
Kind regards
Maciej Wieczór-Retman