Message-ID: <990ce9cf-0e48-432c-a29f-0bd1704eede4@redhat.com>
Date: Thu, 12 Jun 2025 09:18:53 +0200
From: David Hildenbrand <david@...hat.com>
To: Dan Williams <dan.j.williams@...el.com>,
 Alistair Popple <apopple@...dia.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org, nvdimm@...ts.linux.dev,
 linux-cxl@...r.kernel.org, Andrew Morton <akpm@...ux-foundation.org>,
 Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
 "Liam R. Howlett" <Liam.Howlett@...cle.com>, Vlastimil Babka
 <vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
 Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
 Zi Yan <ziy@...dia.com>, Baolin Wang <baolin.wang@...ux.alibaba.com>,
 Nico Pache <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
 Dev Jain <dev.jain@....com>, Oscar Salvador <osalvador@...e.de>,
 marc.herbert@...ux.intel.com
Subject: Re: [PATCH v2 0/3] mm/huge_memory: vmf_insert_folio_*() and
 vmf_insert_pfn_pud() fixes

On 12.06.25 06:20, Dan Williams wrote:
> Alistair Popple wrote:
>> On Wed, Jun 11, 2025 at 02:06:51PM +0200, David Hildenbrand wrote:
>>> This is v2 of
>>> 	"[PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special
>>> 	 in vmf_insert_folio_*()"
>>> Now with one additional fix, based on mm/mm-unstable.
>>>
>>> While working on improving vm_normal_page() and friends, I stumbled
>>> over this issue: refcounted "normal" pages must not be marked
>>> using pmd_special() / pud_special().
>>>
>>> Fortunately, so far there doesn't seem to be serious damage.
>>>
>>> I spent too much time trying to get the ndctl tests mentioned by Dan
>>> running (.config tweaks, memmap= setup, ... ), without getting them to
>>> pass even without these patches. Some SKIP, some FAIL, some sometimes
>>> suddenly SKIP on first invocation, ... instructions unclear or the tests
>>> are shaky. This is how far I got:
>>
>> FWIW I had a similar experience, although I eventually got the FAIL cases below
>> to pass. I forget exactly what I needed to tweak for that though :-/
> 
> Add Marc who has been working to clean the documentation up to solve the
> reproducibility problem with standing up new environments to run these
> tests.

I was about to send some doc improvements myself, but I didn't manage to 
get the tests running in the first place ... even after trying hard :)

I think there is also one issue with a test that requires you to 
actually install ndctl ... and some tests seem to temporarily fail with 
weird complaints about "file size problems with /proc/kallsyms", 
even though there are no such file size problems :)

All a bit shaky. The "memmap=" setup, which I think some tests require, 
is not documented anywhere for the tests. Maybe it should be added, 
although I am not sure how big of an area we actually need.

> 
> http://lore.kernel.org/20250521002640.1700283-1-marc.herbert@linux.intel.com
> 

I think I have CONFIG_XFS_FS=m (instead of y) and CONFIG_DAX=y (instead 
of =m), and CONFIG_NFIT_SECURITY_DEBUG not set (instead of =y).

Let me try with these settings adjusted.
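
A minimal sketch of how the three options mentioned above could be checked
against a kernel .config before rebuilding. The helper name check_opt and
the .config path are hypothetical, and the wanted values (CONFIG_XFS_FS=y,
CONFIG_DAX=m, CONFIG_NFIT_SECURITY_DEBUG=y) are only read off the
"instead of" wording above; they are not a confirmed list of requirements.

```shell
# Hypothetical helper, not part of ndctl or the kernel tree.
# $1: path to a kernel .config
# $2: option name, e.g. CONFIG_DAX
# $3: wanted value: y, m, or n (n meaning "not set")
check_opt() {
	# Print the value of "OPTION=value" lines; options that are not set
	# only appear as "# OPTION is not set" comments and yield nothing.
	have=$(sed -n "s/^$2=//p" "$1")
	[ -n "$have" ] || have=n
	if [ "$have" = "$3" ]; then
		echo "ok: $2=$have"
	else
		echo "mismatch: $2=$have (want $3)"
		return 1
	fi
}

# Example use, assuming .config sits in the current directory:
#   check_opt .config CONFIG_XFS_FS y
#   check_opt .config CONFIG_DAX m
#   check_opt .config CONFIG_NFIT_SECURITY_DEBUG y
```

The function returns non-zero on a mismatch, so the three calls can be
chained with && in a setup script.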

> There is also the run_qemu project that automates building an environment for this.
> 
> https://github.com/pmem/run_qemu
> 
> ...but comes with its own set of quirks.
> 
> I have the following fixups applied to my environment to get this going on
> Fedora 42 with v6.16-rc1:
> 
> diff --git a/README.md b/README.md
> index 37314db7a155..8e06908d5921 100644
> --- a/README.md
> +++ b/README.md
> @@ -84,6 +84,11 @@ loaded.  To build and install nfit_test.ko:
>      CONFIG_TRANSPARENT_HUGEPAGE=y
>      ```
>   
> +1. Install the following packages, (Fedora instructions):
> +   ```
> +   dnf install e2fsprogs xfsprogs parted jq trace-cmd hostname fio fio-engine-dev-dax
> +   ```
> +
>   1. Build and install the unit test enabled libnvdimm modules in the
>      following order.  The unit test modules need to be in place prior to
>      the `depmod` that runs during the final `modules_install`
> diff --git a/test/dax.sh b/test/dax.sh
> index 3ffbc8079eba..98faaf0eb9b2 100755
> --- a/test/dax.sh
> +++ b/test/dax.sh
> @@ -37,13 +37,14 @@ run_test() {
>   	rc=1
>   	while read -r p; do
>   		[[ $p ]] || continue
> +		[[ $p == cpus=* ]] && continue
>   		if [ "$count" -lt 10 ]; then
>   			if [ "$p" != "0x100" ] && [ "$p" != "NOPAGE" ]; then
>   				cleanup "$1"
>   			fi
>   		fi
>   		count=$((count + 1))
> -	done < <(trace-cmd report | awk '{ print $21 }')
> +	done < <(trace-cmd report | awk '{ print $NF }')
>   
>   	if [ $count -lt 10 ]; then
>   		cleanup "$1"
> 
> In the meantime, do not hesitate to ask me to run these tests.

Yes, thanks, and thanks for running these tests.
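
For reference, the effect of the test/dax.sh change quoted above can be
sketched in isolation: taking the last field ($NF) instead of a fixed
column ($21) keeps working when the number of fields per line varies, and
the extra check drops "cpus=" lines. The function name filter_results and
the sample input in the comment are made up for illustration; they are not
real trace-cmd output.

```shell
# Hypothetical stand-alone version of the filtering done in run_test():
# read report lines on stdin, print the last field of each line, and skip
# empty results and "cpus=..." lines.
filter_results() {
	awk '{ print $NF }' | while read -r p; do
		[ -n "$p" ] || continue
		case $p in cpus=*) continue ;; esac
		printf '%s\n' "$p"
	done
}

# Example with made-up input lines of different lengths:
#   printf 'cpus=4\na b c 0x100\na b NOPAGE\n' | filter_results
# prints:
#   0x100
#   NOPAGE
```

The sketch uses a POSIX "case" where the patch uses the bash-only
[[ $p == cpus=* ]] test; the behaviour for these inputs is the same.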

> 
> FWIW with these patches on top of -rc1 I get:
> 
> ---
> 
> [root@...t ndctl]# meson test -C build --suite ndctl:dax
> ninja: Entering directory `/root/git/ndctl/build'
> [168/168] Linking target ndctl/ndctl
>   1/13 ndctl:dax / daxdev-errors.sh          OK              12.60s
>   2/13 ndctl:dax / multi-dax.sh              OK               2.47s
>   3/13 ndctl:dax / sub-section.sh            OK               6.30s
>   4/13 ndctl:dax / dax-dev                   OK               0.04s
>   5/13 ndctl:dax / dax-ext4.sh               OK               3.04s
>   6/13 ndctl:dax / dax-xfs.sh                OK               3.10s
>   7/13 ndctl:dax / device-dax                OK               9.66s
>   8/13 ndctl:dax / revoke-devmem             OK               0.22s
>   9/13 ndctl:dax / device-dax-fio.sh         OK              32.32s
> 10/13 ndctl:dax / daxctl-devices.sh         OK               2.31s
> 11/13 ndctl:dax / daxctl-create.sh          SKIP             0.25s   exit status 77
> 12/13 ndctl:dax / dm.sh                     OK               1.00s
> 13/13 ndctl:dax / mmap.sh                   OK              62.27s
> 
> Ok:                12
> Fail:              0
> Skipped:           1
> 
> Full log written to /root/git/ndctl/build/meson-logs/testlog.txt
> 
> ---
> 
> Note that the daxctl-create.sh skip is a known unrelated v6.16-rc1 regression
> fixed with this set:
> 
> http://lore.kernel.org/20250607033228.1475625-1-dan.j.williams@intel.com
> 
> You can add:
> 
> Tested-by: Dan Williams <dan.j.williams@...el.com>
> 

Thanks!

-- 
Cheers,

David / dhildenb

