Message-ID: <20250821200701.1329277-1-david@redhat.com>
Date: Thu, 21 Aug 2025 22:06:26 +0200
From: David Hildenbrand <david@...hat.com>
To: linux-kernel@...r.kernel.org
Cc: David Hildenbrand <david@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Jason Gunthorpe <jgg@...dia.com>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>,
Vlastimil Babka <vbabka@...e.cz>,
Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>,
Michal Hocko <mhocko@...e.com>,
Jens Axboe <axboe@...nel.dk>,
Marek Szyprowski <m.szyprowski@...sung.com>,
Robin Murphy <robin.murphy@....com>,
John Hubbard <jhubbard@...dia.com>,
Peter Xu <peterx@...hat.com>,
Alexander Potapenko <glider@...gle.com>,
Marco Elver <elver@...gle.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
Brendan Jackman <jackmanb@...gle.com>,
Johannes Weiner <hannes@...xchg.org>,
Zi Yan <ziy@...dia.com>,
Dennis Zhou <dennis@...nel.org>,
Tejun Heo <tj@...nel.org>,
Christoph Lameter <cl@...two.org>,
Muchun Song <muchun.song@...ux.dev>,
Oscar Salvador <osalvador@...e.de>,
x86@...nel.org,
linux-arm-kernel@...ts.infradead.org,
linux-mips@...r.kernel.org,
linux-s390@...r.kernel.org,
linux-crypto@...r.kernel.org,
linux-ide@...r.kernel.org,
intel-gfx@...ts.freedesktop.org,
dri-devel@...ts.freedesktop.org,
linux-mmc@...r.kernel.org,
linux-arm-kernel@...s.com,
linux-scsi@...r.kernel.org,
kvm@...r.kernel.org,
virtualization@...ts.linux.dev,
linux-mm@...ck.org,
io-uring@...r.kernel.org,
iommu@...ts.linux.dev,
kasan-dev@...glegroups.com,
wireguard@...ts.zx2c4.com,
netdev@...r.kernel.org,
linux-kselftest@...r.kernel.org,
linux-riscv@...ts.infradead.org,
Albert Ou <aou@...s.berkeley.edu>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
Alexandre Ghiti <alex@...ti.fr>,
Alex Dubov <oakad@...oo.com>,
Alex Williamson <alex.williamson@...hat.com>,
Andreas Larsson <andreas@...sler.com>,
Borislav Petkov <bp@...en8.de>,
Brett Creeley <brett.creeley@....com>,
Catalin Marinas <catalin.marinas@....com>,
Christian Borntraeger <borntraeger@...ux.ibm.com>,
Christophe Leroy <christophe.leroy@...roup.eu>,
Damien Le Moal <dlemoal@...nel.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
David Airlie <airlied@...il.com>,
"David S. Miller" <davem@...emloft.net>,
Doug Gilbert <dgilbert@...erlog.com>,
Heiko Carstens <hca@...ux.ibm.com>,
Herbert Xu <herbert@...dor.apana.org.au>,
Huacai Chen <chenhuacai@...nel.org>,
Ingo Molnar <mingo@...hat.com>,
"James E.J. Bottomley" <James.Bottomley@...senPartnership.com>,
Jani Nikula <jani.nikula@...ux.intel.com>,
"Jason A. Donenfeld" <Jason@...c4.com>,
Jason Gunthorpe <jgg@...pe.ca>,
Jesper Nilsson <jesper.nilsson@...s.com>,
Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>,
Kevin Tian <kevin.tian@...el.com>,
Lars Persson <lars.persson@...s.com>,
Madhavan Srinivasan <maddy@...ux.ibm.com>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
Maxim Levitsky <maximlevitsky@...il.com>,
Michael Ellerman <mpe@...erman.id.au>,
Nicholas Piggin <npiggin@...il.com>,
Niklas Cassel <cassel@...nel.org>,
Palmer Dabbelt <palmer@...belt.com>,
Paul Walmsley <paul.walmsley@...ive.com>,
Rodrigo Vivi <rodrigo.vivi@...el.com>,
Shameer Kolothum <shameerali.kolothum.thodi@...wei.com>,
Shuah Khan <shuah@...nel.org>,
Simona Vetter <simona@...ll.ch>,
Sven Schnelle <svens@...ux.ibm.com>,
Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
Thomas Gleixner <tglx@...utronix.de>,
Tvrtko Ursulin <tursulin@...ulin.net>,
Ulf Hansson <ulf.hansson@...aro.org>,
Vasily Gorbik <gor@...ux.ibm.com>,
WANG Xuerui <kernel@...0n.name>,
Will Deacon <will@...nel.org>,
Yishai Hadas <yishaih@...dia.com>
Subject: [PATCH RFC 00/35] mm: remove nth_page()
This is based on mm-unstable and was cross-compiled heavily.
I should probably have already dropped the RFC label but I want to hear
first if I ignored some corner case (SG entries?) and I need to do
at least a bit more testing.
I will only CC non-MM folks on the cover letter and the respective patch
to not flood too many inboxes (the lists receive all patches).
---
As discussed recently with Linus, nth_page() is just nasty and we would
like to remove it.
To recap, the reason we currently need nth_page() within a folio is that
on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the
memmap is allocated per memory section.
While buddy allocations cannot cross memory section boundaries, hugetlb
and dax folios can.
So crossing a memory section means that "page++" could do the wrong thing.
Instead, nth_page() on these problematic configs always goes from
page->pfn, to then go from (++pfn)->page, which is rather nasty.
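To make the failure mode concrete, here is a minimal user-space sketch, not
kernel code. It models two memory sections whose memmaps live at unrelated
addresses and compares plain pointer arithmetic with the page->pfn->page
round trip. All toy_* names, the struct page layout, the explicit section
field and the tiny section size are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: sizes and names are illustrative, not the kernel's. */
#define PAGES_PER_SECTION 4UL
#define NR_SECTIONS       2UL

struct page {
	unsigned long section;	/* which section this page belongs to */
};

/*
 * One backing array with an unused gap in the middle, so the two
 * per-section memmaps are deliberately NOT adjacent in memory.
 */
static struct page backing[3 * PAGES_PER_SECTION];
static struct page *section_memmap[NR_SECTIONS];

static void toy_memmap_init(void)
{
	section_memmap[0] = &backing[0];
	section_memmap[1] = &backing[2 * PAGES_PER_SECTION];

	for (unsigned long s = 0; s < NR_SECTIONS; s++)
		for (unsigned long i = 0; i < PAGES_PER_SECTION; i++)
			section_memmap[s][i].section = s;
}

/* pfn -> page: pick the section's memmap, then index into it. */
static struct page *toy_pfn_to_page(unsigned long pfn)
{
	assert(pfn < NR_SECTIONS * PAGES_PER_SECTION);
	return &section_memmap[pfn / PAGES_PER_SECTION][pfn % PAGES_PER_SECTION];
}

/* page -> pfn: section base plus the offset within that section's memmap. */
static unsigned long toy_page_to_pfn(const struct page *page)
{
	return page->section * PAGES_PER_SECTION +
	       (unsigned long)(page - section_memmap[page->section]);
}

/* The nth_page() idea: always go through the PFN instead of "page + n". */
static struct page *toy_nth_page(const struct page *page, unsigned long n)
{
	return toy_pfn_to_page(toy_page_to_pfn(page) + n);
}
```

In this model, after calling toy_memmap_init(), "toy_pfn_to_page(3) + 1"
points into the gap rather than at the page for PFN 4, while
toy_nth_page(toy_pfn_to_page(3), 1) returns the right page. Within a single
section both approaches agree, which is why the series can drop nth_page()
once crossing sections is ruled out on the problematic configs.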
Likely, many people have no idea when nth_page() is required and when
it might be dropped.
We refer to such problematic PFN ranges as "non-contiguous pages".
If we only deal with "contiguous pages", there is no need for nth_page().
Besides that "obvious" folio case, we might end up using nth_page()
within CMA allocations (again, could span memory sections), and in
one corner case (kfence) when processing memblock allocations (again,
could span memory sections).
So let's handle all that, add sanity checks, and remove nth_page().
Patch #1 -> #5   : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups
Patch #6 -> #12  : disallow folios to have non-contiguous pages
Patch #13 -> #20 : remove nth_page() usage within folios
Patch #21        : disallow CMA allocations of non-contiguous pages
Patch #22 -> #31 : sanity-check + remove nth_page() usage within SG entry
Patch #32        : sanity-check + remove nth_page() usage in
                   unpin_user_page_range_dirty_lock()
Patch #33        : remove nth_page() in kfence
Patch #34        : adjust stale comment regarding nth_page
Patch #35        : mm: remove nth_page()
A lot of this is inspired by the discussion at [1] between Linus, Jason
and me, so kudos to them.
[1] https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u
Cc: Andrew Morton <akpm@...ux-foundation.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Jason Gunthorpe <jgg@...dia.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: "Liam R. Howlett" <Liam.Howlett@...cle.com>
Cc: Vlastimil Babka <vbabka@...e.cz>
Cc: Mike Rapoport <rppt@...nel.org>
Cc: Suren Baghdasaryan <surenb@...gle.com>
Cc: Michal Hocko <mhocko@...e.com>
Cc: Jens Axboe <axboe@...nel.dk>
Cc: Marek Szyprowski <m.szyprowski@...sung.com>
Cc: Robin Murphy <robin.murphy@....com>
Cc: John Hubbard <jhubbard@...dia.com>
Cc: Peter Xu <peterx@...hat.com>
Cc: Alexander Potapenko <glider@...gle.com>
Cc: Marco Elver <elver@...gle.com>
Cc: Dmitry Vyukov <dvyukov@...gle.com>
Cc: Brendan Jackman <jackmanb@...gle.com>
Cc: Johannes Weiner <hannes@...xchg.org>
Cc: Zi Yan <ziy@...dia.com>
Cc: Dennis Zhou <dennis@...nel.org>
Cc: Tejun Heo <tj@...nel.org>
Cc: Christoph Lameter <cl@...two.org>
Cc: Muchun Song <muchun.song@...ux.dev>
Cc: Oscar Salvador <osalvador@...e.de>
Cc: x86@...nel.org
Cc: linux-arm-kernel@...ts.infradead.org
Cc: linux-mips@...r.kernel.org
Cc: linux-s390@...r.kernel.org
Cc: linux-crypto@...r.kernel.org
Cc: linux-ide@...r.kernel.org
Cc: intel-gfx@...ts.freedesktop.org
Cc: dri-devel@...ts.freedesktop.org
Cc: linux-mmc@...r.kernel.org
Cc: linux-arm-kernel@...s.com
Cc: linux-scsi@...r.kernel.org
Cc: kvm@...r.kernel.org
Cc: virtualization@...ts.linux.dev
Cc: linux-mm@...ck.org
Cc: io-uring@...r.kernel.org
Cc: iommu@...ts.linux.dev
Cc: kasan-dev@...glegroups.com
Cc: wireguard@...ts.zx2c4.com
Cc: netdev@...r.kernel.org
Cc: linux-kselftest@...r.kernel.org
Cc: linux-riscv@...ts.infradead.org
David Hildenbrand (35):
mm: stop making SPARSEMEM_VMEMMAP user-selectable
arm64: Kconfig: drop superfluous "select SPARSEMEM_VMEMMAP"
s390/Kconfig: drop superfluous "select SPARSEMEM_VMEMMAP"
x86/Kconfig: drop superfluous "select SPARSEMEM_VMEMMAP"
wireguard: selftests: remove CONFIG_SPARSEMEM_VMEMMAP=y from qemu
kernel config
mm/page_alloc: reject unreasonable folio/compound page sizes in
alloc_contig_range_noprof()
mm/memremap: reject unreasonable folio/compound page sizes in
memremap_pages()
mm/hugetlb: check for unreasonable folio sizes when registering hstate
mm/mm_init: make memmap_init_compound() look more like
prep_compound_page()
mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap()
mm: sanity-check maximum folio size in folio_set_order()
mm: limit folio/compound page sizes in problematic kernel configs
mm: simplify folio_page() and folio_page_idx()
mm/mm/percpu-km: drop nth_page() usage within single allocation
fs: hugetlbfs: remove nth_page() usage within folio in
adjust_range_hwpoison()
mm/pagewalk: drop nth_page() usage within folio in folio_walk_start()
mm/gup: drop nth_page() usage within folio when recording subpages
io_uring/zcrx: remove "struct io_copy_cache" and one nth_page() usage
io_uring/zcrx: remove nth_page() usage within folio
mips: mm: convert __flush_dcache_pages() to
__flush_dcache_folio_pages()
mm/cma: refuse handing out non-contiguous page ranges
dma-remap: drop nth_page() in dma_common_contiguous_remap()
scatterlist: disallow non-contigous page ranges in a single SG entry
ata: libata-eh: drop nth_page() usage within SG entry
drm/i915/gem: drop nth_page() usage within SG entry
mspro_block: drop nth_page() usage within SG entry
memstick: drop nth_page() usage within SG entry
mmc: drop nth_page() usage within SG entry
scsi: core: drop nth_page() usage within SG entry
vfio/pci: drop nth_page() usage within SG entry
crypto: remove nth_page() usage within SG entry
mm/gup: drop nth_page() usage in unpin_user_page_range_dirty_lock()
kfence: drop nth_page() usage
block: update comment of "struct bio_vec" regarding nth_page()
mm: remove nth_page()
arch/arm64/Kconfig | 1 -
arch/mips/include/asm/cacheflush.h | 11 +++--
arch/mips/mm/cache.c | 8 ++--
arch/s390/Kconfig | 1 -
arch/x86/Kconfig | 1 -
crypto/ahash.c | 4 +-
crypto/scompress.c | 8 ++--
drivers/ata/libata-sff.c | 6 +--
drivers/gpu/drm/i915/gem/i915_gem_pages.c | 2 +-
drivers/memstick/core/mspro_block.c | 3 +-
drivers/memstick/host/jmb38x_ms.c | 3 +-
drivers/memstick/host/tifm_ms.c | 3 +-
drivers/mmc/host/tifm_sd.c | 4 +-
drivers/mmc/host/usdhi6rol0.c | 4 +-
drivers/scsi/scsi_lib.c | 3 +-
drivers/scsi/sg.c | 3 +-
drivers/vfio/pci/pds/lm.c | 3 +-
drivers/vfio/pci/virtio/migrate.c | 3 +-
fs/hugetlbfs/inode.c | 25 ++++------
include/crypto/scatterwalk.h | 4 +-
include/linux/bvec.h | 7 +--
include/linux/mm.h | 48 +++++++++++++++----
include/linux/page-flags.h | 5 +-
include/linux/scatterlist.h | 4 +-
io_uring/zcrx.c | 34 ++++---------
kernel/dma/remap.c | 2 +-
mm/Kconfig | 3 +-
mm/cma.c | 36 +++++++++-----
mm/gup.c | 13 +++--
mm/hugetlb.c | 23 ++++-----
mm/internal.h | 1 +
mm/kfence/core.c | 17 ++++---
mm/memremap.c | 3 ++
mm/mm_init.c | 13 ++---
mm/page_alloc.c | 5 +-
mm/pagewalk.c | 2 +-
mm/percpu-km.c | 2 +-
mm/util.c | 33 +++++++++++++
tools/testing/scatterlist/linux/mm.h | 1 -
.../selftests/wireguard/qemu/kernel.config | 1 -
40 files changed, 203 insertions(+), 150 deletions(-)
base-commit: c0e3b3f33ba7b767368de4afabaf7c1ddfdc3872
--
2.50.1