Message-ID: <f65237fe-a1b9-4d63-9a06-dd7a49765c9f@linuxfoundation.org>
Date: Thu, 5 Sep 2024 12:06:47 -0600
From: Shuah Khan <skhan@...uxfoundation.org>
To: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
Cc: Muhammad Usama Anjum <usama.anjum@...labora.com>,
Shuah Khan <shuah@...nel.org>, Reinette Chatre <reinette.chatre@...el.com>,
linux-kselftest@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
Shaopeng Tan <tan.shaopeng@...fujitsu.com>, Fenghua Yu
<fenghua.yu@...el.com>,
Maciej Wieczór-Retman <maciej.wieczor-retman@...el.com>,
Shuah Khan <skhan@...uxfoundation.org>
Subject: Re: [PATCH v4 0/4] selftests: Fix cpuid / vendor checking build
issues
On 9/4/24 06:54, Ilpo Järvinen wrote:
> On Wed, 4 Sep 2024, Shuah Khan wrote:
>
>> On 9/4/24 06:18, Ilpo Järvinen wrote:
>>> On Tue, 3 Sep 2024, Shuah Khan wrote:
>>>
>>>> On 9/3/24 08:45, Ilpo Järvinen wrote:
>>>>> This series first generalizes resctrl selftest non-contiguous CAT check
>>>>> to not assume non-AMD vendor implies Intel. Second, it improves
>>>>> selftests such that the use of __cpuid_count() does not lead to a
>>>>> build failure (happens at least on ARM).
>>>>>
>>>>> While ARM does not currently support resctrl features, there is
>>>>> ongoing work to enable resctrl support for it on the kernel side.
>>>>> In any case, a common header such as kselftest.h should have a proper
>>>>> fallback in place for what it provides, thus it seems justified to fix
>>>>> this common level problem on the common level rather than e.g.
>>>>> disabling build for resctrl selftest for archs lacking resctrl support.
>>>>>
>>>>> I've dropped the Reviewed-by and Tested-by tags from the last patch
>>>>> in v3 due to major changes to the makefile logic. So it would be
>>>>> helpful if Muhammad could retest with this version.
>>>>>
>>>>> Acquiring ARCH in lib.mk will likely allow some cleanup in some
>>>>> subdirectory makefiles, but that is left as future work because this
>>>>> series focuses on fixing cpuid/build.
>>>>
>>>>>
>>>>> v4:
>>>>> - New patch to reorder x86 selftest makefile to avoid clobbering CFLAGS
>>>>> (would cause __cpuid_count() related build fail otherwise)
>>>>>
>>>> I don't like the way this patch series is mushrooming. I am not
>>>> convinced that changes to lib.mk and x86 Makefile are necessary.
>>>
>>> I didn't like what I found in the various makefiles either. I think
>>> many things are done there that conflict with what lib.mk seems to try
>>> to do.
>>>
>>
>> Some of it is by design. lib.mk offers a framework for common things. There
>> are provisions to override like in the case of x86, powerpc. lib.mk
>> tries to be flexible as well.
>>
>>> I tried to ask in the first submission what test I should use in the
>>> header file as I'm not very familiar with how arch specific is done in
>>> userspace in the first place nor how it should be done within kselftest
>>> framework.
>>>
>>
>> Thoughts on cpuid:
>>
>> - It is x86 specific. Moving this to kselftest.h was done to avoid
>> duplication. However, now we are running into arm64/arm compile
>> errors due to this, which need addressing one way or the other.
>>
>> I have some ideas on how to solve this - but I need answers to
>> the following questions.
>>
>> This is a question for you and Usama.
>>
>> - Does resctrl run on arm64/arm and what's the output?
>> - Can all other tests in resctrl run, except
>> noncont_cat_run_test?
>> - If so send me the output.
>
> Hi Shuah,
>
> As mentioned in my cover letter above, resctrl does not currently support
> arm but there is ongoing work to add arm support. On the kernel side it
> requires major refactoring to move non-arch specific stuff out from
> arch/x86, so it has (predictably) taken a long time.
>
> The resctrl selftests are mostly written in an arch independent way (*) but
> there's also a way to limit a test only to CPUs from a particular vendor.
> And now this noncont_cat_run_test needs to use cpuid only on Intel CPUs
> (to read the supported flag); it's not needed on AMD CPUs, as they
> always support non-contiguous CAT bitmasks.
>
> So to summarize, it would be possible to disable resctrl test for non-x86
> but it does not address the underlying problem with cpuid which will just
> come back later I think.
>
> Alternatively, if there's a good way in C code to do ifdeffery around
> that cpuid call, I could make that too, but I need to know which symbol to
> use for that ifdef.
>
> (*) The cache topology may make some selftests unusable on new archs, but
> not the selftest code itself.
>
>
I agree that suppressing the resctrl build is not a solution. The real problem
is in defining __cpuid_count() in a common code path.
I fixed it and sent a patch in. As I was testing, I noticed the following on
an AMD platform:
- It ran the L3_NONCONT_CAT test, which is expected.
# # Starting L3_NONCONT_CAT test ...
# # Mounting resctrl to "/sys/fs/resctrl"
# ARCH_AMD - supports non-contiguous CBM
# # Write schema "L3:0=ff" to resctrl FS
# # Write schema "L3:0=fc3f" to resctrl FS
# ok 5 L3_NONCONT_CAT: test
- It went on to run L2_NONCONT_CAT - failed
# ok 6 # SKIP Hardware does not support L2_NONCONT_CAT or L2_NONCONT_CAT is disabled
Does it make sense to run both L3_NONCONT_CAT and L2_NONCONT_CAT
on AMD? Maybe it does? resctrl checks L3 or L2 support on Intel.
Anyway - the problem is fixed now. Please review and test.
thanks,
-- Shuah
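
One way the arch guard asked about in this thread could look in C is
sketched below. This is a minimal illustration, not the patch that was
sent: read_cpuid(), noncont_cat_expected(), enum cpu_vendor and HAVE_CPUID
are hypothetical names. The guard relies on the compiler-defined __i386__
and __x86_64__ macros, and the Intel branch assumes the non-contiguous CBM
flag is bit 3 of ECX from CPUID leaf 0x10 (subleaf 1 for L3, 2 for L2).

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>	/* GCC/Clang header that provides __cpuid_count() */
#define HAVE_CPUID 1
#else
#define HAVE_CPUID 0	/* e.g. arm/arm64: no CPUID instruction */
#endif

/* Hypothetical vendor tags for this sketch, not the selftest's own names. */
enum cpu_vendor { VENDOR_UNKNOWN, VENDOR_INTEL, VENDOR_AMD };

/*
 * Read one CPUID leaf/subleaf into regs[] (eax, ebx, ecx, edx).
 * Returns false, with regs[] zeroed, when the build target has no CPUID,
 * so callers compile everywhere and can skip instead of failing to build.
 */
static bool read_cpuid(uint32_t leaf, uint32_t subleaf, uint32_t regs[4])
{
	regs[0] = regs[1] = regs[2] = regs[3] = 0;
#if HAVE_CPUID
	__cpuid_count(leaf, subleaf, regs[0], regs[1], regs[2], regs[3]);
	return true;
#else
	(void)leaf;
	(void)subleaf;
	return false;
#endif
}

/*
 * Decide whether non-contiguous CAT masks are expected. Every vendor is
 * matched explicitly, so "not AMD" is never taken to mean "Intel".
 * ecx is the ECX value read from CPUID leaf 0x10 for the resource.
 */
static bool noncont_cat_expected(enum cpu_vendor vendor, uint32_t ecx)
{
	switch (vendor) {
	case VENDOR_AMD:
		return true;		/* per the thread: always supported on AMD */
	case VENDOR_INTEL:
		return (ecx >> 3) & 1;	/* assumed flag position, see lead-in */
	default:
		return false;		/* assumption: unknown vendor, treat as unsupported */
	}
}
```

A caller on Intel would first call read_cpuid(0x10, 1, regs) for L3 and
pass regs[2] to noncont_cat_expected(); when read_cpuid() returns false the
test can be skipped rather than break the build on non-x86 targets.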