Message-ID: <990ce9cf-0e48-432c-a29f-0bd1704eede4@redhat.com>
Date: Thu, 12 Jun 2025 09:18:53 +0200
From: David Hildenbrand <david@...hat.com>
To: Dan Williams <dan.j.williams@...el.com>,
Alistair Popple <apopple@...dia.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org, nvdimm@...ts.linux.dev,
linux-cxl@...r.kernel.org, Andrew Morton <akpm@...ux-foundation.org>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>, Vlastimil Babka
<vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
Zi Yan <ziy@...dia.com>, Baolin Wang <baolin.wang@...ux.alibaba.com>,
Nico Pache <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
Dev Jain <dev.jain@....com>, Oscar Salvador <osalvador@...e.de>,
marc.herbert@...ux.intel.com
Subject: Re: [PATCH v2 0/3] mm/huge_memory: vmf_insert_folio_*() and
vmf_insert_pfn_pud() fixes

On 12.06.25 06:20, Dan Williams wrote:
> Alistair Popple wrote:
>> On Wed, Jun 11, 2025 at 02:06:51PM +0200, David Hildenbrand wrote:
>>> This is v2 of
>>> "[PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special
>>> in vmf_insert_folio_*()"
>>> Now with one additional fix, based on mm/mm-unstable.
>>>
>>> While working on improving vm_normal_page() and friends, I stumbled
>>> over this issue: refcounted "normal" pages must not be marked
>>> using pmd_special() / pud_special().
>>>
>>> Fortunately, so far there doesn't seem to be serious damage.
>>>
>>> I spent too much time trying to get the ndctl tests mentioned by Dan
>>> running (.config tweaks, memmap= setup, ... ), without getting them to
>>> pass even without these patches. Some SKIP, some FAIL, some sometimes
>>> suddenly SKIP on first invocation, ... instructions unclear or the tests
>>> are shaky. This is how far I got:
>>
>> FWIW I had a similar experience, although I eventually got the FAIL cases below
>> to pass. I forget exactly what I needed to tweak for that though :-/
>
> Add Marc who has been working to clean the documentation up to solve the
> reproducibility problem with standing up new environments to run these
> tests.

I was about to send some doc improvements myself, but I didn't manage to
get the tests running in the first place ... even after trying hard :)

I think there is also one issue with a test that requires you to
actually install ndctl ... and some tests seem to temporarily fail with
weird issues regarding "file size problems with /proc/kallsyms",
whereby, ... there are no such file size problems :)

All a bit shaky. The "memmap=" setup, which I think some tests require,
is not documented anywhere for the tests. Maybe it should be
added, not sure how big of an area we actually need, though.

>
> http://lore.kernel.org/20250521002640.1700283-1-marc.herbert@linux.intel.com
>

I think I have CONFIG_XFS_FS=m (instead of y) and CONFIG_DAX=y (instead
of =m), and CONFIG_NFIT_SECURITY_DEBUG not set (instead of =y).

Let me try with these settings adjusted.
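For comparing a local .config against the three settings above, a
minimal sketch could look like the following. check_opt is a
hypothetical helper, not part of ndctl, and it assumes the usual .config
layout where a set option appears as CONFIG_FOO=y and an unset one is
absent or commented out.

```shell
# Hypothetical helper (not from ndctl): compare one option in a kernel
# .config against the value the tests are assumed to want.
#   $1 = config file, $2 = option name, $3 = expected value (y, m, or n)
check_opt() {
    # A set option looks like "CONFIG_FOO=y"; print what follows the "=".
    _have=$(sed -n "s/^$2=//p" "$1" | head -n 1)
    # No match means the option is unset ("# CONFIG_FOO is not set") or absent.
    [ -n "$_have" ] || _have=n
    if [ "$_have" = "$3" ]; then
        echo "ok $2=$_have"
    else
        echo "mismatch $2=$_have (wanted $3)"
        return 1
    fi
}
```

Calling check_opt .config CONFIG_XFS_FS y, check_opt .config CONFIG_DAX m
and check_opt .config CONFIG_NFIT_SECURITY_DEBUG y would then flag each of
the three differences listed above.
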
> There is also the run_qemu project that automates building an environment for this.
>
> https://github.com/pmem/run_qemu
>
> ...but comes with its own set of quirks.
>
> I have the following fixups applied to my environment to get this going on
> Fedora 42 with v6.16-rc1:
>
> diff --git a/README.md b/README.md
> index 37314db7a155..8e06908d5921 100644
> --- a/README.md
> +++ b/README.md
> @@ -84,6 +84,11 @@ loaded. To build and install nfit_test.ko:
> CONFIG_TRANSPARENT_HUGEPAGE=y
> ```
>
> +1. Install the following packages, (Fedora instructions):
> + ```
> + dnf install e2fsprogs xfsprogs parted jq trace-cmd hostname fio fio-engine-dev-dax
> + ```
> +
> 1. Build and install the unit test enabled libnvdimm modules in the
> following order. The unit test modules need to be in place prior to
> the `depmod` that runs during the final `modules_install`
> diff --git a/test/dax.sh b/test/dax.sh
> index 3ffbc8079eba..98faaf0eb9b2 100755
> --- a/test/dax.sh
> +++ b/test/dax.sh
> @@ -37,13 +37,14 @@ run_test() {
> rc=1
> while read -r p; do
> [[ $p ]] || continue
> + [[ $p == cpus=* ]] && continue
> if [ "$count" -lt 10 ]; then
> if [ "$p" != "0x100" ] && [ "$p" != "NOPAGE" ]; then
> cleanup "$1"
> fi
> fi
> count=$((count + 1))
> - done < <(trace-cmd report | awk '{ print $21 }')
> + done < <(trace-cmd report | awk '{ print $NF }')
>
> if [ $count -lt 10 ]; then
> cleanup "$1"
>
> In the meantime, do not hesitate to ask me to run these tests.

Yes, thanks, and thanks for running these tests.
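
To illustrate the dax.sh change quoted above: printing $NF takes the last
field of each line regardless of how many columns precede it, whereas a
fixed index such as $21 only works while the column count stays the
same. A minimal sketch, where last_fields is a hypothetical helper and
any sample input fed to it here is invented for illustration, not real
trace-cmd output:

```shell
# Hypothetical helper: print the last whitespace-separated field of each
# input line, skipping empty results and "cpus=" header lines, mirroring
# the filtering added to the loop in the quoted patch.
last_fields() {
    awk '{ print $NF }' | while read -r p; do
        case "$p" in
            ''|cpus=*) continue ;;   # blank line or header line: ignore
        esac
        echo "$p"
    done
}
```

Feeding it lines of different widths, for example
printf 'cpus=4\nt [000] 1.0: ev: a b 0x100\nt [001] 2.0: ev: a b c d NOPAGE\n' | last_fields,
yields only the trailing values 0x100 and NOPAGE.
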
>
> FWIW with these patches on top of -rc1 I get:
>
> ---
>
> [root@...t ndctl]# meson test -C build --suite ndctl:dax
> ninja: Entering directory `/root/git/ndctl/build'
> [168/168] Linking target ndctl/ndctl
>  1/13 ndctl:dax / daxdev-errors.sh     OK     12.60s
>  2/13 ndctl:dax / multi-dax.sh         OK      2.47s
>  3/13 ndctl:dax / sub-section.sh       OK      6.30s
>  4/13 ndctl:dax / dax-dev              OK      0.04s
>  5/13 ndctl:dax / dax-ext4.sh          OK      3.04s
>  6/13 ndctl:dax / dax-xfs.sh           OK      3.10s
>  7/13 ndctl:dax / device-dax           OK      9.66s
>  8/13 ndctl:dax / revoke-devmem        OK      0.22s
>  9/13 ndctl:dax / device-dax-fio.sh    OK     32.32s
> 10/13 ndctl:dax / daxctl-devices.sh    OK      2.31s
> 11/13 ndctl:dax / daxctl-create.sh     SKIP    0.25s   exit status 77
> 12/13 ndctl:dax / dm.sh                OK      1.00s
> 13/13 ndctl:dax / mmap.sh              OK     62.27s
>
> Ok: 12
> Fail: 0
> Skipped: 1
>
> Full log written to /root/git/ndctl/build/meson-logs/testlog.txt
>
> ---
>
> Note that the daxctl-create.sh skip is a known unrelated v6.16-rc1 regression
> fixed with this set:
>
> http://lore.kernel.org/20250607033228.1475625-1-dan.j.williams@intel.com
>
> You can add:
>
> Tested-by: Dan Williams <dan.j.williams@...el.com>
>

Thanks!

--
Cheers,
David / dhildenb