Message-ID: <dvstixx4pey6euns6xttep5bbc4jhz6smtgheijviwkbawnqbm@tqhbg4hzeiog>
Date: Thu, 12 Jun 2025 11:56:22 +1000
From: Alistair Popple <apopple@...dia.com>
To: David Hildenbrand <david@...hat.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
nvdimm@...ts.linux.dev, linux-cxl@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>, Vlastimil Babka <vbabka@...e.cz>,
Mike Rapoport <rppt@...nel.org>, Suren Baghdasaryan <surenb@...gle.com>,
Michal Hocko <mhocko@...e.com>, Zi Yan <ziy@...dia.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>, Nico Pache <npache@...hat.com>,
Ryan Roberts <ryan.roberts@....com>, Dev Jain <dev.jain@....com>,
Dan Williams <dan.j.williams@...el.com>, Oscar Salvador <osalvador@...e.de>, stable@...r.kernel.org
Subject: Re: [PATCH v2 1/3] mm/huge_memory: don't ignore queried cachemode in
vmf_insert_pfn_pud()
On Wed, Jun 11, 2025 at 02:06:52PM +0200, David Hildenbrand wrote:
> We setup the cache mode but ... don't forward the updated pgprot to
> insert_pfn_pud().
>
> Only a problem on x86-64 PAT when mapping PFNs using PUDs that
> require a special cachemode.
>
> Fix it by using the proper pgprot where the cachemode was setup.
>
> Identified by code inspection.
>
> Fixes: 7b806d229ef1 ("mm: remove vmf_insert_pfn_xxx_prot() for huge page-table entries")
> Cc: <stable@...r.kernel.org>
> Signed-off-by: David Hildenbrand <david@...hat.com>
> ---
> mm/huge_memory.c | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index d3e66136e41a3..49b98082c5401 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1516,10 +1516,9 @@ static pud_t maybe_pud_mkwrite(pud_t pud, struct vm_area_struct *vma)
> }
>
> static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr,
> - pud_t *pud, pfn_t pfn, bool write)
> + pud_t *pud, pfn_t pfn, pgprot_t prot, bool write)
> {
> struct mm_struct *mm = vma->vm_mm;
> - pgprot_t prot = vma->vm_page_prot;
> pud_t entry;
>
> if (!pud_none(*pud)) {
> @@ -1581,7 +1580,7 @@ vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write)
> pfnmap_setup_cachemode_pfn(pfn_t_to_pfn(pfn), &pgprot);
>
> ptl = pud_lock(vma->vm_mm, vmf->pud);
> - insert_pfn_pud(vma, addr, vmf->pud, pfn, write);
> + insert_pfn_pud(vma, addr, vmf->pud, pfn, pgprot, write);
> spin_unlock(ptl);
>
> return VM_FAULT_NOPAGE;
> @@ -1625,7 +1624,7 @@ vm_fault_t vmf_insert_folio_pud(struct vm_fault *vmf, struct folio *folio,
> add_mm_counter(mm, mm_counter_file(folio), HPAGE_PUD_NR);
> }
> insert_pfn_pud(vma, addr, vmf->pud, pfn_to_pfn_t(folio_pfn(folio)),
> - write);
> + vma->vm_page_prot, write);
Actually it's not immediately obvious to me why we don't call track_pfn_insert()
and forward the pgprot here as well. Prior to me adding vmf_insert_folio_pud(),
device DAX would call vmf_insert_pfn_pud(), and the intent at least seems to
have been to adjust the pgprot for that (and we did for the PTE/PMD versions).
However, now that ZONE_DEVICE folios are refcounted normally, I switched
device DAX to using vmf_insert_folio_*(), which never adjusts the pgprot based
on x86 PAT. So I think we probably need to either add that to
vmf_insert_folio_*() (or a new variant), or make it the callers' responsibility
to figure out the correct pgprot.
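
For example, something along these lines in the folio path would mirror what
vmf_insert_pfn_pud() does after your change. This is just an untested sketch of
the insert call site, reusing the helpers visible in the hunk above, not a
worked-out patch:

	/*
	 * Untested sketch: query the cachemode for the folio's PFN the
	 * same way vmf_insert_pfn_pud() does, instead of passing
	 * vma->vm_page_prot through unchanged.
	 */
	unsigned long pfn = folio_pfn(folio);
	pgprot_t pgprot = vma->vm_page_prot;

	pfnmap_setup_cachemode_pfn(pfn, &pgprot);
	insert_pfn_pud(vma, addr, vmf->pud, pfn_to_pfn_t(pfn), pgprot, write);

Whether that belongs in vmf_insert_folio_pud() itself or in a separate variant
is the open question above.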
> spin_unlock(ptl);
>
> return VM_FAULT_NOPAGE;
> --
> 2.49.0
>