Message-ID: <e11ba418-4184-4f4f-add5-18a5edaa0f34@redhat.com>
Date: Thu, 12 Jun 2025 10:27:11 +0200
From: David Hildenbrand <david@...hat.com>
To: Dan Williams <dan.j.williams@...el.com>,
Alistair Popple <apopple@...dia.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org, nvdimm@...ts.linux.dev,
linux-cxl@...r.kernel.org, Andrew Morton <akpm@...ux-foundation.org>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>, Vlastimil Babka
<vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
Zi Yan <ziy@...dia.com>, Baolin Wang <baolin.wang@...ux.alibaba.com>,
Nico Pache <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
Dev Jain <dev.jain@....com>, Oscar Salvador <osalvador@...e.de>,
marc.herbert@...ux.intel.com
Subject: Re: [PATCH v2 0/3] mm/huge_memory: vmf_insert_folio_*() and
vmf_insert_pfn_pud() fixes
On 12.06.25 09:18, David Hildenbrand wrote:
> On 12.06.25 06:20, Dan Williams wrote:
>> Alistair Popple wrote:
>>> On Wed, Jun 11, 2025 at 02:06:51PM +0200, David Hildenbrand wrote:
>>>> This is v2 of
>>>> "[PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special
>>>> in vmf_insert_folio_*()"
>>>> Now with one additional fix, based on mm/mm-unstable.
>>>>
>>>> While working on improving vm_normal_page() and friends, I stumbled
>>>> over this issue: refcounted "normal" pages must not be marked
>>>> using pmd_special() / pud_special().
>>>>
>>>> Fortunately, so far there doesn't seem to be serious damage.
>>>>
>>>> I spent too much time trying to get the ndctl tests mentioned by Dan
>>>> running (.config tweaks, memmap= setup, ... ), without getting them to
>>>> pass even without these patches. Some SKIP, some FAIL, some sometimes
>>>> suddenly SKIP on first invocation, ... instructions unclear or the tests
>>>> are shaky. This is how far I got:
>>>
>>> FWIW I had a similar experience, although I eventually got the FAIL cases below
>>> to pass. I forget exactly what I needed to tweak for that though :-/
>>
>> Add Marc who has been working to clean the documentation up to solve the
>> reproducibility problem with standing up new environments to run these
>> tests.
>
> I was about to send some doc improvements myself, but I didn't manage to
> get the tests running in the first place ... even after trying hard :)
>
> I think there is also one issue with a test that requires you to
> actually install ndctl ... and some tests seem to temporarily fail with
> weird issues regarding "file size problems with /proc/kallsyms",
> whereby, ... there are no such file size problems :)
>
> All a bit shaky. The "memmap=" stuff is not documented anywhere for the
> tests, which is required for some tests I think. Maybe it should be
> added, not sure how big of an area we actually need, though.
>
>>
>> http://lore.kernel.org/20250521002640.1700283-1-marc.herbert@linux.intel.com
>>
>
> I think I have CONFIG_XFS_FS=m (instead of y) and CONFIG_DAX=y (instead
> of =m), and CONFIG_NFIT_SECURITY_DEBUG not set (instead of =y).
>
> Let me try with these settings adjusted.
Yeah, no. Unfortunately, that doesn't make it work with my debug config.
Maybe it would work with the defconfig that Marc mentioned ... I might try
that later.
# meson test -C build --suite ndctl:dax
ninja: Entering directory `/root/ndctl/build'
[1/70] Generating version.h with a custom command
1/13  ndctl:dax / daxdev-errors.sh    OK     14.60s
2/13  ndctl:dax / multi-dax.sh        OK      4.28s
3/13  ndctl:dax / sub-section.sh      SKIP    0.25s   exit status 77
4/13  ndctl:dax / dax-dev             OK      1.00s
5/13  ndctl:dax / dax-ext4.sh         OK     23.60s
6/13  ndctl:dax / dax-xfs.sh          OK     23.74s
7/13  ndctl:dax / device-dax          OK     40.61s
8/13  ndctl:dax / revoke-devmem       OK      0.98s
9/13  ndctl:dax / device-dax-fio.sh   SKIP    0.10s   exit status 77
10/13 ndctl:dax / daxctl-devices.sh   SKIP    0.16s   exit status 77
11/13 ndctl:dax / daxctl-create.sh    FAIL    2.53s   exit status 1
>>> DAXCTL=/root/ndctl/build/daxctl/daxctl DATA_PATH=/root/ndctl/test MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 MALLOC_PERTURB_=167 LD_LIBRARY_PATH=/root/ndctl/build/cxl/lib:/root/ndctl/build/daxctl/lib:/root/ndctl/build/ndctl/lib TEST_PATH=/root/ndctl/build/test UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 NDCTL=/root/ndctl/build/ndctl/ndctl ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 /root/ndctl/test/daxctl-create.sh
12/13 ndctl:dax / dm.sh               FAIL    0.24s   exit status 1
>>> DAXCTL=/root/ndctl/build/daxctl/daxctl DATA_PATH=/root/ndctl/test MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 LD_LIBRARY_PATH=/root/ndctl/build/cxl/lib:/root/ndctl/build/daxctl/lib:/root/ndctl/build/ndctl/lib TEST_PATH=/root/ndctl/build/test MALLOC_PERTURB_=27 NDCTL=/root/ndctl/build/ndctl/ndctl ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 /root/ndctl/test/dm.sh
13/13 ndctl:dax / mmap.sh             OK    343.67s
Ok:                 8
Expected Fail:      0
Fail:               2
Unexpected Pass:    0
Skipped:            3
Timeout:            0
Full log written to /root/ndctl/build/meson-logs/testlog.txt
After compilation, I can see that I again have "CONFIG_DAX=y" in my config.
And for the DAX setting in "make menuconfig" I can see:
Symbol: DAX [=y]
...
Selected by [y]:
- FS_DAX [=y] && MMU [=y] && (ZONE_DEVICE [=y] || FS_DAX_LIMITED [=n])
Selected by [m]:
- BLK_DEV_PMEM [=m] && LIBNVDIMM [=m]
So I guess the combination requested in the doc, "CONFIG_FS_DAX=y" together
with "CONFIG_DAX=m", is impossible to achieve, because FS_DAX=y selects
DAX=y?
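A minimal sketch of why that combination cannot stick, assuming the usual
Kconfig rule that "select" forces a lower limit on the selected symbol
(n < m < y). The helper name effective_value() is hypothetical and only
models the two selectors shown in the menuconfig output above; it is not
how Kconfig itself is implemented.

```python
# Hypothetical model of Kconfig "select" for tristate symbols (n < m < y).
# Assumption: a selected symbol can never end up lower than any selector.
ORDER = {"n": 0, "m": 1, "y": 2}


def effective_value(requested, selectors):
    """Return the value a selected symbol ends up with.

    requested: value asked for in .config ("n", "m" or "y")
    selectors: values of the expressions that select the symbol
    """
    # The strongest selector sets the lower limit ("n" if nothing selects).
    floor = max(selectors, key=ORDER.get, default="n")
    # The final value is whichever is higher: the request or the floor.
    return max(requested, floor, key=ORDER.get)


# DAX requested as =m, but selected by FS_DAX (=y) and BLK_DEV_PMEM (=m).
dax = effective_value("m", ["y", "m"])
print(f"CONFIG_DAX={dax}")  # prints "CONFIG_DAX=y"
```

Under that assumption, CONFIG_DAX=m would only survive if every selector,
including FS_DAX, were at most =m.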
===
sub-section.sh complains about
++ /root/ndctl/build/ndctl/ndctl list -R -b ACPI.NFIT
+ json=
++ echo
++ jq -r '[.[] | select(.available_size >= 67108864)][0].dev'
+ region=
++ echo
++ jq -r '[.[] | select(.available_size >= 67108864)][0].available_size'
+ avail=
+ '[' -z ']'
+ exit 77
Not sure what the problem is in my environment: "ndctl list" prints nothing
for the ACPI.NFIT bus, although I thought we would be emulating ACPI.NFIT.
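For reference, a rough Python equivalent of the two jq filters in the trace
above, to show what the test expects from "ndctl list". This is only a
sketch: pick_region() is a hypothetical helper, the sample regions are made
up, and only the "dev" and "available_size" fields are taken from the
filters themselves.

```python
import json

MIN_AVAILABLE = 67108864  # 64 MiB, the threshold used in the jq filters


def pick_region(ndctl_json):
    """Return (dev, available_size) of the first region that is big enough.

    Returns None when the input is empty or no region qualifies.
    """
    if not ndctl_json.strip():
        return None  # empty output, as in the trace above ("json=")
    regions = json.loads(ndctl_json)
    for region in regions:
        if region.get("available_size", 0) >= MIN_AVAILABLE:
            return region["dev"], region["available_size"]
    return None


# Made-up sample data; the region names and sizes are illustrative only.
sample = json.dumps([
    {"dev": "region0", "available_size": 0},
    {"dev": "region1", "available_size": 134217728},
])

print(pick_region(""))      # what happens with empty "ndctl list" output
print(pick_region(sample))  # first region with >= 64 MiB available
```

With empty input there is nothing to select from, which matches the empty
"region=" and "avail=" in the trace, followed by the exit with status 77.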
===
device-dax-fio.sh complains about
kernel 6.16.0-rc1-00069-g0ede5baa0b46: missing fio, skipping...
So I guess I just need to install "fio" to make it fly.
Yes, with that the test is passing now.
===
daxctl-devices.sh complains about
++ reset_dev
++ /root/ndctl/build/ndctl/ndctl destroy-namespace -f -b ACPI.NFIT 'Error at line 33'
error destroying namespaces: No such device or address
destroyed 0 namespaces
++ exit 77
No idea.
===
daxctl-create.sh complains about
+ /root/ndctl/build/daxctl/daxctl reconfigure-device -m devdax -f dax1.0
libdaxctl: daxctl_dev_enable: dax1.0: failed to enable
error reconfiguring devices: Invalid argument
reconfigured 0 devices
++ cleanup 54
++ printf 'Error at line %d\n' 54
++ [[ -n dax1.0 ]]
++ reset_dax
++ test -n dax1.0
++ /root/ndctl/build/daxctl/daxctl disable-device -r 1 all
disabled 1 device
++ /root/ndctl/build/daxctl/daxctl destroy-device -r 1 all
destroyed 1 device
++ /root/ndctl/build/daxctl/daxctl reconfigure-device -s '' dax1.0
reconfigured 1 device
++ exit 1
Again, no idea ... :(
--
Cheers,
David / dhildenb