Message-ID: <684a5594eb21d_2491100de@dwillia2-xfh.jf.intel.com.notmuch>
Date: Wed, 11 Jun 2025 21:20:36 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Alistair Popple <apopple@...dia.com>, David Hildenbrand <david@...hat.com>
CC: <linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
<nvdimm@...ts.linux.dev>, <linux-cxl@...r.kernel.org>, Andrew Morton
<akpm@...ux-foundation.org>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>, Vlastimil Babka
<vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>, Suren Baghdasaryan
<surenb@...gle.com>, Michal Hocko <mhocko@...e.com>, Zi Yan <ziy@...dia.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>, Nico Pache <npache@...hat.com>,
Ryan Roberts <ryan.roberts@....com>, Dev Jain <dev.jain@....com>, "Dan
Williams" <dan.j.williams@...el.com>, Oscar Salvador <osalvador@...e.de>,
<marc.herbert@...ux.intel.com>
Subject: Re: [PATCH v2 0/3] mm/huge_memory: vmf_insert_folio_*() and
vmf_insert_pfn_pud() fixes
Alistair Popple wrote:
> On Wed, Jun 11, 2025 at 02:06:51PM +0200, David Hildenbrand wrote:
> > This is v2 of
> > "[PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special
> > in vmf_insert_folio_*()"
> > Now with one additional fix, based on mm/mm-unstable.
> >
> > While working on improving vm_normal_page() and friends, I stumbled
> > over this issue: refcounted "normal" pages must not be marked
> > using pmd_special() / pud_special().
> >
> > Fortunately, so far there doesn't seem to be serious damage.
> >
> > I spent too much time trying to get the ndctl tests mentioned by Dan
> > running (.config tweaks, memmap= setup, ... ), without getting them to
> > pass even without these patches. Some SKIP, some FAIL, some sometimes
> > suddenly SKIP on first invocation, ... instructions unclear or the tests
> > are shaky. This is how far I got:
>
> FWIW I had a similar experience, although I eventually got the FAIL cases below
> to pass. I forget exactly what I needed to tweak for that though :-/
Adding Marc, who has been working to clean up the documentation to solve
the reproducibility problem with standing up new environments to run these
tests.
http://lore.kernel.org/20250521002640.1700283-1-marc.herbert@linux.intel.com
There is also the run_qemu project that automates building an environment for this.
https://github.com/pmem/run_qemu
...but comes with its own set of quirks.
I have the following fixups applied to my environment to get this going on
Fedora 42 with v6.16-rc1:
diff --git a/README.md b/README.md
index 37314db7a155..8e06908d5921 100644
--- a/README.md
+++ b/README.md
@@ -84,6 +84,11 @@ loaded. To build and install nfit_test.ko:
CONFIG_TRANSPARENT_HUGEPAGE=y
```
+1. Install the following packages, (Fedora instructions):
+ ```
+ dnf install e2fsprogs xfsprogs parted jq trace-cmd hostname fio fio-engine-dev-dax
+ ```
+
1. Build and install the unit test enabled libnvdimm modules in the
following order. The unit test modules need to be in place prior to
the `depmod` that runs during the final `modules_install`
diff --git a/test/dax.sh b/test/dax.sh
index 3ffbc8079eba..98faaf0eb9b2 100755
--- a/test/dax.sh
+++ b/test/dax.sh
@@ -37,13 +37,14 @@ run_test() {
rc=1
while read -r p; do
[[ $p ]] || continue
+ [[ $p == cpus=* ]] && continue
if [ "$count" -lt 10 ]; then
if [ "$p" != "0x100" ] && [ "$p" != "NOPAGE" ]; then
cleanup "$1"
fi
fi
count=$((count + 1))
- done < <(trace-cmd report | awk '{ print $21 }')
+ done < <(trace-cmd report | awk '{ print $NF }')
if [ $count -lt 10 ]; then
cleanup "$1"
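To illustrate the test/dax.sh change above, here is a minimal sketch of the same filter run against canned input. The sample lines are hypothetical stand-ins for `trace-cmd report` output (the real field layout differs and can shift between versions, which is why a fixed `$21` is fragile), and `sample_report` / `last_fields` are made-up helper names, not part of ndctl:

```shell
#!/bin/bash
# Hypothetical stand-in for `trace-cmd report`; real output looks different.
sample_report() {
    printf '%s\n' \
        'cpus=4' \
        'task-1 [000] 1.0: dax_pmd_fault_done: dev 259:0 result 0x100' \
        '' \
        'task-1 [001] 2.0: dax_pmd_fault_done: dev 259:0 extra result NOPAGE'
}

# Print the last field of every event line. Blank lines and the leading
# "cpus=" header are skipped, mirroring the checks added to run_test().
last_fields() {
    while read -r p; do
        [[ $p ]] || continue
        [[ $p == cpus=* ]] && continue
        echo "$p"
    done < <(sample_report | awk '{ print $NF }')
}

last_fields
```

Because `$NF` always selects the final field, the second sample line still yields its result value even though it carries one more column than the first.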
In the meantime, do not hesitate to ask me to run these tests.
FWIW with these patches on top of -rc1 I get:
---
[root@...t ndctl]# meson test -C build --suite ndctl:dax
ninja: Entering directory `/root/git/ndctl/build'
[168/168] Linking target ndctl/ndctl
1/13 ndctl:dax / daxdev-errors.sh OK 12.60s
2/13 ndctl:dax / multi-dax.sh OK 2.47s
3/13 ndctl:dax / sub-section.sh OK 6.30s
4/13 ndctl:dax / dax-dev OK 0.04s
5/13 ndctl:dax / dax-ext4.sh OK 3.04s
6/13 ndctl:dax / dax-xfs.sh OK 3.10s
7/13 ndctl:dax / device-dax OK 9.66s
8/13 ndctl:dax / revoke-devmem OK 0.22s
9/13 ndctl:dax / device-dax-fio.sh OK 32.32s
10/13 ndctl:dax / daxctl-devices.sh OK 2.31s
11/13 ndctl:dax / daxctl-create.sh SKIP 0.25s exit status 77
12/13 ndctl:dax / dm.sh OK 1.00s
13/13 ndctl:dax / mmap.sh OK 62.27s
Ok: 12
Fail: 0
Skipped: 1
Full log written to /root/git/ndctl/build/meson-logs/testlog.txt
---
Note that the daxctl-create.sh skip is a known unrelated v6.16-rc1 regression
fixed with this set:
http://lore.kernel.org/20250607033228.1475625-1-dan.j.williams@intel.com
You can add:
Tested-by: Dan Williams <dan.j.williams@...el.com>