Message-ID: <5c4927af-120e-4c6b-9473-95490f4fcc90@oracle.com>
Date: Wed, 4 Feb 2026 18:53:34 +0100
From: William Roche <william.roche@...cle.com>
To: Jiaqi Yan <jiaqiyan@...gle.com>, linmiaohe@...wei.com,
harry.yoo@...cle.com, jane.chu@...cle.com
Cc: nao.horiguchi@...il.com, tony.luck@...el.com, wangkefeng.wang@...wei.com,
willy@...radead.org, akpm@...ux-foundation.org, osalvador@...e.de,
rientjes@...gle.com, duenwen@...gle.com, jthoughton@...gle.com,
jgg@...dia.com, ankita@...dia.com, peterx@...hat.com,
sidhartha.kumar@...cle.com, ziy@...dia.com, david@...hat.com,
dave.hansen@...ux.intel.com, muchun.song@...ux.dev, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH v3 2/3] selftests/mm: test userspace MFR for HugeTLB hugepage
On 2/3/26 20:23, Jiaqi Yan wrote:
> Test the userspace memory failure recovery (MFR) policy for HugeTLB:
>
> 1. Create a memfd backed by HugeTLB with MFD_MF_KEEP_UE_MAPPED set.
>
> 2. Allocate and map 4 hugepages to the process.
>
> 3. Create sub-threads to MADV_HWPOISON inner addresses of the 1st hugepage.
>
> 4. Check if the process gets correct SIGBUS for each poisoned raw page.
>
> 5. Check if all memory is still accessible and its content is valid.
>
> 6. Check if the poisoned hugepage is dealt with after memfd released.
>
> Two configurables in the test:
>
> - hugepage_size: size of the hugepage, 1G or 2M.
>
> - nr_hwp_pages: number of pages within the 1st hugepage to MADV_HWPOISON.
In this version, you introduce a new test argument, "nr_hwp_pages", that
selects how many of the pre-defined offsets to poison inside the hugepage
(between 1 and 8).
But is there any advantage in leaving this choice to the user instead of
always testing all of them?
As a suggestion, should the test program itself set, or at least verify,
the minimal number of hugepages of the right size, instead of relying on
the user to configure them manually?
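A rough sketch of what such a verification could look like, reading the
per-size "free_hugepages" attribute under /sys/kernel/mm/hugepages. The
helper names (hugepage_sysfs_path, read_ulong_file, check_free_hugepages)
are made up for illustration and are not taken from the patch; the count
of 4 in the usage note only mirrors the "4 hugepages" in the description.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: build the sysfs path of a per-size hugepage attribute. */
static int hugepage_sysfs_path(char *buf, size_t len,
			       unsigned long hugepage_size, const char *attr)
{
	/* sysfs names the directory after the hugepage size in kB. */
	int n = snprintf(buf, len,
			 "/sys/kernel/mm/hugepages/hugepages-%lukB/%s",
			 hugepage_size >> 10, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: read one unsigned decimal value from a file. */
static int read_ulong_file(const char *path, unsigned long *val)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%lu", val) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Return 0 if at least 'needed' hugepages of the given size are free. */
static int check_free_hugepages(unsigned long hugepage_size,
				unsigned long needed)
{
	char path[128];
	unsigned long nr_free;

	if (hugepage_sysfs_path(path, sizeof(path), hugepage_size,
				"free_hugepages"))
		return -1;
	if (read_ulong_file(path, &nr_free))
		return -1;
	return nr_free >= needed ? 0 : -1;
}
```

The test could then call something like check_free_hugepages(2UL << 20, 4)
early and skip, rather than fail, when the pool is too small. Growing the
pool by writing to "nr_hugepages" would need root and may not succeed for
1G pages on a fragmented machine, so verifying is probably the safer default.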
And at the end, should we try to unpoison the impacted pages, so that
the lab machine where the tests run can keep using all of its memory?
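One possible shape for the unpoison step, as a sketch only: look up the PFN
of each poisoned address through /proc/self/pagemap and write it to the
hwpoison-inject debugfs file. This assumes root, CONFIG_HWPOISON_INJECT and
a mounted debugfs, and probably that the PFNs are recorded before poisoning.
The function names are hypothetical, and whether unpoison-pfn copes with a
HugeTLB page kept mapped by the new policy would need to be verified.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define PM_PRESENT	(1ULL << 63)		/* pagemap: page present in RAM */
#define PM_PFN_MASK	((1ULL << 55) - 1)	/* pagemap: bits 0-54 hold the PFN */

/* Extract the PFN from one /proc/<pid>/pagemap entry. */
static int pagemap_entry_to_pfn(uint64_t entry, uint64_t *pfn)
{
	if (!(entry & PM_PRESENT))
		return -1;
	*pfn = entry & PM_PFN_MASK;
	/* The PFN field reads as 0 without CAP_SYS_ADMIN. */
	return *pfn ? 0 : -1;
}

/* Look up the PFN backing 'addr' in the calling process. */
static int addr_to_pfn(const void *addr, uint64_t *pfn)
{
	long page_size = sysconf(_SC_PAGESIZE);
	/* pagemap has one 64-bit entry per base page, HugeTLB included. */
	off_t off = (off_t)((uintptr_t)addr / (uintptr_t)page_size) *
		    (off_t)sizeof(uint64_t);
	uint64_t entry;
	ssize_t n;
	int fd = open("/proc/self/pagemap", O_RDONLY);

	if (fd < 0)
		return -1;
	n = pread(fd, &entry, sizeof(entry), off);
	close(fd);
	if (n != (ssize_t)sizeof(entry))
		return -1;
	return pagemap_entry_to_pfn(entry, pfn);
}

/* Ask the hwpoison-inject debugfs interface to unpoison one PFN. */
static int unpoison_pfn(uint64_t pfn)
{
	FILE *f = fopen("/sys/kernel/debug/hwpoison/unpoison-pfn", "w");
	int ret;

	if (!f)
		return -1;
	ret = fprintf(f, "%llu\n", (unsigned long long)pfn) > 0 ? 0 : -1;
	if (fclose(f))
		ret = -1;	/* the kernel reports errors on write/flush */
	return ret;
}
```

A failure here should probably only produce a warning, since the test
itself would already have passed or failed by that point.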
Thanks for your feedback,
William.