Message-ID: <6afc2e67-3ecb-41a5-9c8f-00ecd64f035a@redhat.com>
Date: Tue, 17 Jun 2025 11:25:14 +0200
From: David Hildenbrand <david@...hat.com>
To: Alistair Popple <apopple@...dia.com>, akpm@...ux-foundation.org
Cc: linux-mm@...ck.org, gerald.schaefer@...ux.ibm.com,
dan.j.williams@...el.com, jgg@...pe.ca, willy@...radead.org,
linux-kernel@...r.kernel.org, nvdimm@...ts.linux.dev,
linux-fsdevel@...r.kernel.org, linux-ext4@...r.kernel.org,
linux-xfs@...r.kernel.org, jhubbard@...dia.com, hch@....de,
zhang.lyra@...il.com, debug@...osinc.com, bjorn@...nel.org,
balbirs@...dia.com, lorenzo.stoakes@...cle.com,
linux-arm-kernel@...ts.infradead.org, loongarch@...ts.linux.dev,
linuxppc-dev@...ts.ozlabs.org, linux-riscv@...ts.infradead.org,
linux-cxl@...r.kernel.org, dri-devel@...ts.freedesktop.org, John@...ves.net,
m.szyprowski@...sung.com
Subject: Re: [PATCH v2 02/14] mm: Filter zone device pages returned from
folio_walk_start()
On 16.06.25 13:58, Alistair Popple wrote:
> Previously dax pages were skipped by the pagewalk code as pud_special() or
> vm_normal_page{_pmd}() would be false for DAX pages. Now that dax pages are
> refcounted normally that is no longer the case, so the pagewalk code will
> start returning them.
>
> Most callers already explicitly filter for DAX or zone device pages so
> don't need updating. However some don't, so add checks to those callers.
>
> Signed-off-by: Alistair Popple <apopple@...dia.com>
>
> ---
>
> Changes since v1:
>
> - Dropped "mm/pagewalk: Skip dax pages in pagewalk" and replaced it
> with this new patch for v2
>
> - As suggested by David and Jason we can filter the folios in the
> callers instead of doing it in folio_walk_start(). Most callers
> already do this (see below).
>
> I audited all callers of folio_walk_start() and found the following:
>
> mm/ksm.c:
>
> break_ksm() - doesn't need to filter zone_device pages because they can
> never be KSM pages.
>
> get_mergeable_page() - already filters out zone_device pages.
> scan_get_next_rmap_item() - already filters out zone_device pages.
>
> mm/huge_memory.c:
>
> split_huge_pages_pid() - already checks for DAX with
> vma_not_suitable_for_thp_split()
>
> mm/rmap.c:
>
> make_device_exclusive() - only works on anonymous pages, although
> there'd be no issue with finding a DAX page even if support was extended
> to file-backed pages.
>
> mm/migrate.c:
>
> add_folio_for_migration() - already checks the vma with vma_migratable()
> do_pages_stat_array() - explicitly checks for zone_device folios
>
> kernel/events/uprobes.c:
>
> uprobe_write_opcode() - only works on anonymous pages, not sure if
> zone_device could ever work so add an explicit check
>
> arch/s390/mm/fault.c:
>
> do_secure_storage_access() - not sure so be conservative and add a check
>
> arch/s390/kernel/uv.c:
>
> make_hva_secure() - not sure so be conservative and add a check
> ---
> arch/s390/kernel/uv.c | 2 +-
> arch/s390/mm/fault.c | 2 +-
> kernel/events/uprobes.c | 2 +-
> 3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
> index b99478e..55aa280 100644
> --- a/arch/s390/kernel/uv.c
> +++ b/arch/s390/kernel/uv.c
> @@ -424,7 +424,7 @@ int make_hva_secure(struct mm_struct *mm, unsigned long hva, struct uv_cb_header
> return -EFAULT;
> }
> folio = folio_walk_start(&fw, vma, hva, 0);
> - if (!folio) {
> + if (!folio || folio_is_zone_device(folio)) {
> mmap_read_unlock(mm);
> return -ENXIO;
> }
> diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
> index e1ad05b..df1a067 100644
> --- a/arch/s390/mm/fault.c
> +++ b/arch/s390/mm/fault.c
> @@ -449,7 +449,7 @@ void do_secure_storage_access(struct pt_regs *regs)
> if (!vma)
> return handle_fault_error(regs, SEGV_MAPERR);
> folio = folio_walk_start(&fw, vma, addr, 0);
> - if (!folio) {
> + if (!folio || folio_is_zone_device(folio)) {
> mmap_read_unlock(mm);
> return;
> }
Curious: does s390 even support ZONE_DEVICE, so that this could actually be triggered?
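For reference, a minimal userspace sketch of the caller-side filter used in the two s390 hunks above. Everything prefixed with mock_ is a hypothetical stand-in and not a kernel API; only the shape of the check (!folio || folio_is_zone_device(folio)) and the -ENXIO return in make_hva_secure() are taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct folio; not the real kernel layout. */
struct mock_folio {
	bool zone_device;	/* models the result of folio_is_zone_device() */
};

/* Models folio_walk_start(): returns NULL when nothing is mapped. */
struct mock_folio *mock_walk_start(struct mock_folio *mapped)
{
	return mapped;
}

/*
 * Caller-side filter: a zone device folio is handled exactly like
 * "no folio found", mirroring the patched make_hva_secure() check.
 */
int mock_make_secure(struct mock_folio *mapped)
{
	struct mock_folio *folio = mock_walk_start(mapped);

	if (!folio || folio->zone_device)
		return -ENXIO;
	return 0;
}

/* Small usage example covering the three cases. */
void mock_make_secure_check(void)
{
	struct mock_folio normal = { .zone_device = false };
	struct mock_folio device = { .zone_device = true };

	assert(mock_make_secure(NULL) == -ENXIO);
	assert(mock_make_secure(&device) == -ENXIO);
	assert(mock_make_secure(&normal) == 0);
}
```

The point of filtering in the caller is that the walk itself stays generic; whether the check can ever fire on s390 is exactly the open question above.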
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 8a601df..f774367 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -539,7 +539,7 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct vm_area_struct *vma,
> }
>
> ret = 0;
> - if (unlikely(!folio_test_anon(folio))) {
> + if (unlikely(!folio_test_anon(folio) || folio_is_zone_device(folio))) {
> VM_WARN_ON_ONCE(is_register);
> folio_put(folio);
> goto out;
I wonder if __uprobe_write_opcode() would just work with anon device folios?
We only modify page content, and conditionally zap the page. Would there
be a problem with anon device folios?
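
To make the question concrete, here is a minimal userspace sketch of the condition as patched. struct mock_folio and mock_uprobe_must_skip() are hypothetical names for illustration only; the expression itself mirrors !folio_test_anon(folio) || folio_is_zone_device(folio) from the hunk above. It only shows which folios the check rejects and says nothing about whether anon device folios would actually work.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct folio; not the real kernel layout. */
struct mock_folio {
	bool anon;		/* models the result of folio_test_anon() */
	bool zone_device;	/* models the result of folio_is_zone_device() */
};

/*
 * Mirrors the patched condition: skip the folio unless it is
 * anonymous and not a zone device folio.
 */
bool mock_uprobe_must_skip(const struct mock_folio *folio)
{
	return !folio->anon || folio->zone_device;
}

/* Small usage example walking through all four combinations. */
void mock_uprobe_check(void)
{
	struct mock_folio anon_normal = { .anon = true,  .zone_device = false };
	struct mock_folio anon_device = { .anon = true,  .zone_device = true  };
	struct mock_folio file_normal = { .anon = false, .zone_device = false };
	struct mock_folio file_device = { .anon = false, .zone_device = true  };

	assert(!mock_uprobe_must_skip(&anon_normal));
	assert(mock_uprobe_must_skip(&anon_device));
	assert(mock_uprobe_must_skip(&file_normal));
	assert(mock_uprobe_must_skip(&file_device));
}
```

Only the anon_device case changes behaviour with the patch; dropping the extra check would let that one case through.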
--
Cheers,
David / dhildenb