Message-ID: <de0e9e64-9833-4c60-8234-30b709b135db@collabora.com>
Date: Tue, 23 Jan 2024 12:51:16 +0500
From: Muhammad Usama Anjum <usama.anjum@...labora.com>
To: Ryan Roberts <ryan.roberts@....com>
Cc: Muhammad Usama Anjum <usama.anjum@...labora.com>, kernel@...labora.com,
linux-mm@...ck.org, linux-kselftest@...r.kernel.org,
linux-kernel@...r.kernel.org, Andrew Morton <akpm@...ux-foundation.org>,
Shuah Khan <shuah@...nel.org>
Subject: Re: [PATCH] selftests/mm: run_vmtests.sh: add missing tests
On 1/22/24 2:59 PM, Ryan Roberts wrote:
>>>> +CATEGORY="hugetlb" run_test ./hugetlb-read-hwpoison
>>>
>>> The addition of this test causes 2 later tests to fail with ENOMEM. I suspect
>>> it's a side-effect of marking the hugetlbs as hwpoisoned? (just a guess based on
>>> the test name!). Once a page is marked poisoned, is there a way to un-poison it?
>>> If not, I suspect that's why it wasn't part of the standard test script in the
>>> first place.
>> hugetlb-read-hwpoison probably failed because the kernel fix for the test
>> hasn't been merged yet. The other tests (uffd-stress) aren't
>> failing on my end or on CI [1][2]
>
> To be clear, hugetlb-read-hwpoison isn't failing for me, it's just causing the
> subsequent uffd-stress tests to fail. Both of those subsequent tests are
> allocating hugetlbs so my guess is that since this test is marking some hugetlbs
> as poisoned, there are no longer enough for the subsequent tests.
>
>>
>> [1] https://lava.collabora.dev/scheduler/job/12577207#L3677
>> [2] https://lava.collabora.dev/scheduler/job/12577229#L4027
>>
>> Maybe it's a configuration issue that is exposed now. Not sure. Maybe
>> hugetlb-read-hwpoison is changing some configuration and not restoring it.
>
> Well yes - it's marking some hugetlb pages as HWPOISONED.
>
>> Maybe your system has fewer hugetlb pages.
>
> Yes, probably. What is hugetlb-read-hwpoison's requirement for size and number of
> hugetlb pages? The run_vmtests.sh script allocates the required number of
> default-sized hugetlb pages before running any tests (I guess this value should
> be increased for hugetlb-read-hwpoison's requirements?).
>
> Additionally, our CI preallocates non-default sizes from the kernel command line
> at boot. Happy to increase these if you can tell me what the new requirement is:
I'm not sure exactly how many hugetlb pages these tests require, but I
specify hugepages=1000 and the tests work for me.
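
One way to check the guess about the pool shrinking would be to compare the
hugepage counters before and after hugetlb-read-hwpoison runs. A rough
sketch, reading the HugePages_Total/HugePages_Free fields from /proc/meminfo;
the helper names (hugepage_counter, have_free_hugepages) are made up for
illustration and are not part of run_vmtests.sh:

```shell
#!/bin/sh
# Print the value of a HugePages_* counter from a meminfo-style file.
#   $1: field name, e.g. HugePages_Free
#   $2: file to read (defaults to /proc/meminfo)
hugepage_counter() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Hypothetical helper: succeed only if at least $1 default-sized
# hugepages are currently free. $2 is an optional meminfo-style file.
have_free_hugepages() {
    free=$(hugepage_counter HugePages_Free "${2:-/proc/meminfo}")
    [ "${free:-0}" -ge "$1" ]
}
```

Printing `hugepage_counter HugePages_Total` and `hugepage_counter
HugePages_Free` right before and right after the test should show whether
the default-sized pool is smaller afterwards. This only covers the default
hugepage size; the non-default sizes would need a separate check.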
I've sent v2 [1]. Would it be possible to run your CI on that and share
the results before we merge it?
[1] https://lore.kernel.org/all/20240123073615.920324-1-usama.anjum@collabora.com
>
> hugepagesz=1G hugepages=0:2,1:2 hugepagesz=32M hugepages=0:2,1:2
> default_hugepagesz=2M hugepages=0:64,1:64 hugepagesz=64K hugepages=0:2,1:2
>
> Thanks,
> Ryan
>
--
BR,
Muhammad Usama Anjum