linux-kernel - Re: [PATCH v2 0/3] mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn

Message-ID: <684a5594eb21d_2491100de@dwillia2-xfh.jf.intel.com.notmuch>
Date: Wed, 11 Jun 2025 21:20:36 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Alistair Popple <apopple@...dia.com>, David Hildenbrand <david@...hat.com>
CC: <linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
	<nvdimm@...ts.linux.dev>, <linux-cxl@...r.kernel.org>, Andrew Morton
	<akpm@...ux-foundation.org>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
	"Liam R. Howlett" <Liam.Howlett@...cle.com>, Vlastimil Babka
	<vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>, Suren Baghdasaryan
	<surenb@...gle.com>, Michal Hocko <mhocko@...e.com>, Zi Yan <ziy@...dia.com>,
	Baolin Wang <baolin.wang@...ux.alibaba.com>, Nico Pache <npache@...hat.com>,
	Ryan Roberts <ryan.roberts@....com>, Dev Jain <dev.jain@....com>, "Dan
 Williams" <dan.j.williams@...el.com>, Oscar Salvador <osalvador@...e.de>,
	<marc.herbert@...ux.intel.com>
Subject: Re: [PATCH v2 0/3] mm/huge_memory: vmf_insert_folio_*() and
 vmf_insert_pfn_pud() fixes

Alistair Popple wrote:
> On Wed, Jun 11, 2025 at 02:06:51PM +0200, David Hildenbrand wrote:
> > This is v2 of
> > 	"[PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special
> > 	 in vmf_insert_folio_*()"
> > Now with one additional fix, based on mm/mm-unstable.
> > 
> > While working on improving vm_normal_page() and friends, I stumbled
> > over this issues: refcounted "normal" pages must not be marked
> > using pmd_special() / pud_special().
> > 
> > Fortunately, so far there doesn't seem to be serious damage.
> > 
> > I spent too much time trying to get the ndctl tests mentioned by Dan
> > running (.config tweaks, memmap= setup, ... ), without getting them to
> > pass even without these patches. Some SKIP, some FAIL, some sometimes
> > suddenly SKIP on first invocation, ... instructions unclear or the tests
> > are shaky. This is how far I got:
> 
> FWIW I had a similar experience, although I eventually got the FAIL cases below
> to pass. I forget exactly what I needed to tweak for that though :-/

Adding Marc, who has been working to clean up the documentation to solve
the reproducibility problem with standing up new environments to run these
tests.

http://lore.kernel.org/20250521002640.1700283-1-marc.herbert@linux.intel.com

There is also the run_qemu project that automates building an environment for this.

https://github.com/pmem/run_qemu

...but comes with its own set of quirks.

I have the following fixups applied to my environment to get this going on
Fedora 42 with v6.16-rc1:

diff --git a/README.md b/README.md
index 37314db7a155..8e06908d5921 100644
--- a/README.md
+++ b/README.md
@@ -84,6 +84,11 @@ loaded.  To build and install nfit_test.ko:
    CONFIG_TRANSPARENT_HUGEPAGE=y
    ```
 
+1. Install the following packages, (Fedora instructions):
+   ```
+   dnf install e2fsprogs xfsprogs parted jq trace-cmd hostname fio fio-engine-dev-dax
+   ```
+
 1. Build and install the unit test enabled libnvdimm modules in the
    following order.  The unit test modules need to be in place prior to
    the `depmod` that runs during the final `modules_install`  
diff --git a/test/dax.sh b/test/dax.sh
index 3ffbc8079eba..98faaf0eb9b2 100755
--- a/test/dax.sh
+++ b/test/dax.sh
@@ -37,13 +37,14 @@ run_test() {
 	rc=1
 	while read -r p; do
 		[[ $p ]] || continue
+		[[ $p == cpus=* ]] && continue
 		if [ "$count" -lt 10 ]; then
 			if [ "$p" != "0x100" ] && [ "$p" != "NOPAGE" ]; then
 				cleanup "$1"
 			fi
 		fi
 		count=$((count + 1))
-	done < <(trace-cmd report | awk '{ print $21 }')
+	done < <(trace-cmd report | awk '{ print $NF }')
 
 	if [ $count -lt 10 ]; then
 		cleanup "$1"
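
For anyone wondering what the test/dax.sh hunk changes in practice, here is a
minimal standalone sketch of the same filtering. It assumes the point of the
change is that the value the test checks is the last field of each event line,
so a fixed column such as $21 breaks when the number of fields varies. The
sample_report and filter_fields names and the sample lines are made up for
illustration; they are not real trace-cmd output. The sketch uses a plain
pipe and `case` so it also runs under a POSIX sh, whereas the test itself
uses bash process substitution.

```shell
#!/bin/sh
# Hypothetical stand-in for `trace-cmd report`; the lines below are invented
# and only mimic the shape: a "cpus=N" header, events with differing field
# counts, and a blank line.
sample_report() {
	cat <<'EOF'
cpus=4
task-1 [000] 1.0: dax_pmd_fault_done: a b c 0x100
task-1 [001] 2.0: dax_pmd_fault_done: a b c d e NOPAGE

EOF
}

# Same filtering as the patched loop: skip empty lines and the "cpus=" header,
# pass everything else through.
filter_fields() {
	while read -r p; do
		[ -n "$p" ] || continue
		case $p in cpus=*) continue ;; esac
		printf '%s\n' "$p"
	done
}

# $NF selects the last field regardless of how many fields precede it.
sample_report | awk '{ print $NF }' | filter_fields
```

With the sample input above this prints the last field of the two event lines
and drops both the header and the blank line.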

In the meantime, do not hesitate to ask me to run these tests.

FWIW with these patches on top of -rc1 I get:

---

[root@...t ndctl]# meson test -C build --suite ndctl:dax
ninja: Entering directory `/root/git/ndctl/build'
[168/168] Linking target ndctl/ndctl
 1/13 ndctl:dax / daxdev-errors.sh          OK              12.60s
 2/13 ndctl:dax / multi-dax.sh              OK               2.47s
 3/13 ndctl:dax / sub-section.sh            OK               6.30s
 4/13 ndctl:dax / dax-dev                   OK               0.04s
 5/13 ndctl:dax / dax-ext4.sh               OK               3.04s
 6/13 ndctl:dax / dax-xfs.sh                OK               3.10s
 7/13 ndctl:dax / device-dax                OK               9.66s
 8/13 ndctl:dax / revoke-devmem             OK               0.22s
 9/13 ndctl:dax / device-dax-fio.sh         OK              32.32s
10/13 ndctl:dax / daxctl-devices.sh         OK               2.31s
11/13 ndctl:dax / daxctl-create.sh          SKIP             0.25s   exit status 77
12/13 ndctl:dax / dm.sh                     OK               1.00s
13/13 ndctl:dax / mmap.sh                   OK              62.27s

Ok:                12  
Fail:              0   
Skipped:           1   

Full log written to /root/git/ndctl/build/meson-logs/testlog.txt

---

Note that the daxctl-create.sh skip is a known, unrelated v6.16-rc1 regression
that is fixed by this set:

http://lore.kernel.org/20250607033228.1475625-1-dan.j.williams@intel.com

You can add:

Tested-by: Dan Williams <dan.j.williams@...el.com>