Message-ID: <a4566c51-7a4e-4371-9922-b819cf2b11dc@redhat.com>
Date: Thu, 12 Jun 2025 09:34:31 +0200
From: David Hildenbrand <david@...hat.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org, nvdimm@...ts.linux.dev,
linux-cxl@...r.kernel.org, Alistair Popple <apopple@...dia.com>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>, Vlastimil Babka
<vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
Zi Yan <ziy@...dia.com>, Baolin Wang <baolin.wang@...ux.alibaba.com>,
Nico Pache <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
Dev Jain <dev.jain@....com>, Dan Williams <dan.j.williams@...el.com>,
Oscar Salvador <osalvador@...e.de>
Subject: Re: [PATCH v2 0/3] mm/huge_memory: vmf_insert_folio_*() and
vmf_insert_pfn_pud() fixes
On 12.06.25 01:08, Andrew Morton wrote:
> On Wed, 11 Jun 2025 14:06:51 +0200 David Hildenbrand <david@...hat.com> wrote:
>
>> While working on improving vm_normal_page() and friends, I stumbled
>> over these issues: refcounted "normal" pages must not be marked
>> using pmd_special() / pud_special().
>
> Why is this?
The two patches for that refer to the rules documented for
vm_normal_page(), how it could mislead pmd_special()/pud_special()
users, and how the harm so far is fortunately still limited.

It's all about how we identify refcounted folios vs. pfn mappings /
decide what's normal and what's special.
>
>>
>> ...
>>
>> I spent too much time trying to get the ndctl tests mentioned by Dan
>> running (.config tweaks, memmap= setup, ... ), without getting them to
>> pass even without these patches. Some SKIP, some FAIL, some sometimes
>> suddenly SKIP on first invocation, ... instructions unclear or the tests
>> are shaky. This is how far I got:
>
> I won't include this in the [0/N] - it doesn't seem helpful for future
> readers of the patchset.
Yes, trim it down to "ran ndctl tests, tests are shaky and hard to run,
but the results indicate that the relevant stuff seems to keep working".
... combined with the Tested-by by Dan.
>
> I'll give the patchset a run in mm-new, but it feels like some more
> baking is needed?
Fortunately Dan and Alistair managed to get the tests run properly. So I
don't have to waste another valuable 4 hours of my life on testing some
simple fixes that only stand in between me and doing the actual work in
that area I want to get done.
>
> The [1/N] has cc:stable but there's nothing in there to explain this
> decision. How do the issues affect userspace?
My reasoning was: Getting cachemodes in page table entries wrong sounds
... bad? At least to me :)

PAT code is confusing (when/how could we actually mess up the
cachemode?), so it's hard to decide when this actually hits, and what
the exact results in which scenario would be. I tried to find out, but
cannot spend another hour digging through that horrible code.

So if someone has a problem with "stable" here, we can drop it. But the
fix is simple.
--
Cheers,
David / dhildenb