Message-ID: <20240704043132.28501-1-osalvador@suse.de>
Date: Thu,  4 Jul 2024 06:30:47 +0200
From: Oscar Salvador <osalvador@...e.de>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-kernel@...r.kernel.org,
	linux-mm@...ck.org,
	Peter Xu <peterx@...hat.com>,
	Muchun Song <muchun.song@...ux.dev>,
	David Hildenbrand <david@...hat.com>,
	SeongJae Park <sj@...nel.org>,
	Miaohe Lin <linmiaohe@...wei.com>,
	Michal Hocko <mhocko@...e.com>,
	Matthew Wilcox <willy@...radead.org>,
	Christophe Leroy <christophe.leroy@...roup.eu>,
	Oscar Salvador <osalvador@...e.de>
Subject: [PATCH 00/45] hugetlb pagewalk unification

Hi all,

During Peter's talk at LSFMM, it was agreed that one of the things
that needs to be done in order to further integrate hugetlb into mm core
is to unify the generic and hugetlb pagewalkers.
I started with this one: folding hugetlb into the generic pagewalk
instead of having it go through its own hugetlb_entry entries.
This means that the pmd_entry/pte_entry (for cont-pte) entries will now also
deal with hugetlb vmas, and so will the new pud_entry entries, since hugetlb
can be pud mapped (devm pages can as well, but we seem not to care about those,
with the exception of the hmm code).
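
To make the idea concrete, here is a minimal userspace sketch of the
dispatch model described above. It is not kernel code: struct mapping,
struct walk_stats, struct walk_ops, account() and walk_range() are
hypothetical names made up for illustration, and only the callback names
(pud_entry, pmd_entry, pte_entry) mirror the ones mentioned in this letter.
The point it tries to show is that the walker picks the callback from the
level of the leaf entry, and hugetlb-ness is just a property the callback
looks at, rather than a separate hugetlb_entry path.

```c
#include <stddef.h>

/* Userspace model only; these types are hypothetical simplifications. */
enum leaf_level { LEAF_PTE, LEAF_PMD, LEAF_PUD };

struct mapping {
	enum leaf_level level;	/* level at which the leaf entry lives */
	int is_hugetlb;		/* non-zero if the vma is a hugetlb vma */
	size_t size;		/* bytes covered by this leaf */
};

struct walk_stats {
	size_t hugetlb_bytes;
	size_t other_bytes;
};

typedef int (*entry_fn)(const struct mapping *m, struct walk_stats *st);

/* No hugetlb_entry here: every leaf goes through a per-level callback. */
struct walk_ops {
	entry_fn pud_entry;
	entry_fn pmd_entry;
	entry_fn pte_entry;
};

/* One callback can serve hugetlb and non-hugetlb leaves alike. */
static int account(const struct mapping *m, struct walk_stats *st)
{
	if (m->is_hugetlb)
		st->hugetlb_bytes += m->size;
	else
		st->other_bytes += m->size;
	return 0;
}

/* Dispatch on the leaf level only; stop at the first non-zero return. */
static int walk_range(const struct mapping *maps, size_t n,
		      const struct walk_ops *ops, struct walk_stats *st)
{
	for (size_t i = 0; i < n; i++) {
		entry_fn cb = NULL;
		int ret;

		switch (maps[i].level) {
		case LEAF_PUD:
			cb = ops->pud_entry;
			break;
		case LEAF_PMD:
			cb = ops->pmd_entry;
			break;
		case LEAF_PTE:
			cb = ops->pte_entry;
			break;
		}
		if (!cb)
			continue;
		ret = cb(&maps[i], st);
		if (ret)
			return ret;
	}
	return 0;
}
```

In this model, a caller fills all three callbacks with account() and feeds
walk_range() a mix of PUD-mapped hugetlb, PMD-mapped hugetlb and regular
pte leaves; all of them are accounted through the same set of entries.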

The outcome is this RFC.

Before you continue, let me clarify certain points:

This patchset is not yet finished, as 1) some things need more thought,
2) some are still broken (like the hmm bits, as I am clueless about those),
and 3) some paths have not been tested at all.

The things I tested were:

 - memory-failure
 - smaps/numa_maps/pagemap (the latter only for pud/pmd, not
   cont-{pmd,pte})
 - mempolicy

on arm64 (for 64KB and 32MB hugetlb pages) and on x86_64 (for 2MB and 1GB
hugetlb pages).
More tests need to be conducted, and I plan to borrow a ppc64le machine
to also carry out some tests there, but for now this is what my bandwidth
allowed me to do.

I am well aware that there are two things that might scare people, one
being the number of patches, and the other being the amount of code
added.

For the former, I will by no means ask anyone to review 45 patches, but
since this patchset touches isolated paths (damon, mincore, hmm,
task_mmu, memory-failure, mempolicy), I will point out some people
that might be able to help me out with those different bits:

 - Miaohe for memory-failure bits
 - David for task_mmu bits
 - SeongJae Park for damon bits
 - Jerome for hmm bits
 - feel free to join for the rest

I think that might be a good approach: instead of having
to review 45 patches, one only has to review 5 or 6 at most.

For the latter, there is an explanation: hugetlb operates on ptes
(although it allocates puds/pmds and the operations work on that level too),
which means that now that we handle PUD/PMD-mapped hugetlb with
{pud,pmd}_* operations, we need to introduce quite a few functions that
do not exist yet and that we will need from now on.

Although I am sending this out, this is not "RFC-ready material":
as I said, there are still things that need to be improved/fixed/tested,
but I wanted to make it public nevertheless so we can gather some constructive
feedback that helps us move in the right direction and also widens the discussion.

So take this more as a "Hey, let me show what I am doing and call me out on
things you consider wrong".

Thanks in advance

Oscar Salvador (45):
  arch/x86: Drop own definition of pgd,p4d_leaf
  mm: Add {pmd,pud}_huge_lock helper
  mm/pagewalk: Move vma_pgtable_walk_begin and vma_pgtable_walk_end
    upfront
  mm/pagewalk: Only call pud_entry when we have a pud leaf
  mm/pagewalk: Enable walk_pmd_range to handle cont-pmds
  mm/pagewalk: Do not try to split non-thp pud or pmd leafs
  arch/s390: Enable __s390_enable_skey_pmd to handle hugetlb vmas
  fs/proc: Enable smaps_pmd_entry to handle PMD-mapped hugetlb vmas
  mm: Implement pud-version functions for swap and vm_normal_page_pud
  fs/proc: Create smaps_pud_range to handle PUD-mapped hugetlb vmas
  fs/proc: Enable smaps_pte_entry to handle cont-pte mapped hugetlb vmas
  fs/proc: Enable pagemap_pmd_range to handle hugetlb vmas
  mm: Implement pud-version uffd functions
  fs/proc: Create pagemap_pud_range to handle PUD-mapped hugetlb vmas
  fs/proc: Adjust pte_to_pagemap_entry for hugetlb vmas
  fs/proc: Enable pagemap_scan_pmd_entry to handle hugetlb vmas
  mm: Implement pud-version for pud_mkinvalid and pudp_establish
  fs/proc: Create pagemap_scan_pud_entry to handle PUD-mapped hugetlb
    vmas
  fs/proc: Enable gather_pte_stats to handle hugetlb vmas
  fs/proc: Enable gather_pte_stats to handle cont-pte mapped hugetlb
    vmas
  fs/proc: Create gather_pud_stats to handle PUD-mapped hugetlb pages
  mm/mempolicy: Enable queue_folios_pmd to handle hugetlb vmas
  mm/mempolicy: Create queue_folios_pud to handle PUD-mapped hugetlb
    vmas
  mm/memory_failure: Enable check_hwpoisoned_pmd_entry to handle hugetlb
    vmas
  mm/memory-failure: Create check_hwpoisoned_pud_entry to handle
    PUD-mapped hugetlb vmas
  mm/damon: Enable damon_young_pmd_entry to handle hugetlb vmas
  mm/damon: Create damon_young_pud_entry to handle PUD-mapped hugetlb
    vmas
  mm/damon: Enable damon_mkold_pmd_entry to handle hugetlb vmas
  mm/damon: Create damon_mkold_pud_entry to handle PUD-mapped hugetlb
    vmas
  mm,mincore: Enable mincore_pte_range to handle hugetlb vmas
  mm/mincore: Create mincore_pud_range to handle PUD-mapped hugetlb vmas
  mm/hmm: Enable hmm_vma_walk_pmd, to handle hugetlb vmas
  mm/hmm: Enable hmm_vma_walk_pud to handle PUD-mapped hugetlb vmas
  arch/powerpc: Skip hugetlb vmas in subpage_mark_vma_nohuge
  arch/s390: Skip hugetlb vmas in thp_split_mm
  fs/proc: Make clear_refs_test_walk skip hugetlb vmas
  mm/lock: Make mlock_test_walk skip hugetlb vmas
  mm/madvise: Make swapin_test_walk skip hugetlb vmas
  mm/madvise: Make madvise_cold_test_walk skip hugetlb vmas
  mm/madvise: Make madvise_free_test_walk skip hugetlb vmas
  mm/migrate_device: Make migrate_vma_test_walk skip hugetlb vmas
  mm/memcontrol: Make mem_cgroup_move_test_walk skip hugetlb vmas
  mm/memcontrol: Make mem_cgroup_count_test_walk skip hugetlb vmas
  mm/hugetlb_vmemmap: Make vmemmap_test_walk skip hugetlb vmas
  mm: Delete all hugetlb_entry entries

 arch/arm64/include/asm/pgtable.h             |  19 +
 arch/loongarch/include/asm/pgtable.h         |   8 +
 arch/mips/include/asm/pgtable.h              |   7 +
 arch/powerpc/include/asm/book3s/64/pgtable.h |   8 +-
 arch/powerpc/mm/book3s64/pgtable.c           |  15 +-
 arch/powerpc/mm/book3s64/subpage_prot.c      |   2 +
 arch/riscv/include/asm/pgtable.h             |  15 +
 arch/s390/mm/gmap.c                          |  37 +-
 arch/x86/include/asm/pgtable.h               | 199 +++++----
 fs/proc/task_mmu.c                           | 434 ++++++++++++-------
 include/asm-generic/pgtable_uffd.h           |  30 ++
 include/linux/mm.h                           |   4 +
 include/linux/mm_inline.h                    |  34 ++
 include/linux/pagewalk.h                     |  10 -
 include/linux/pgtable.h                      |  77 +++-
 include/linux/swapops.h                      |  27 ++
 mm/damon/ops-common.c                        |  21 +-
 mm/damon/vaddr.c                             | 173 ++++----
 mm/hmm.c                                     |  69 +--
 mm/hugetlb_vmemmap.c                         |  12 +
 mm/madvise.c                                 |  36 ++
 mm/memcontrol-v1.c                           |  24 +
 mm/memory-failure.c                          |  99 +++--
 mm/memory.c                                  |  51 +++
 mm/mempolicy.c                               | 121 +++---
 mm/migrate_device.c                          |  12 +
 mm/mincore.c                                 |  46 +-
 mm/mlock.c                                   |  12 +
 mm/mprotect.c                                |  10 -
 mm/pagewalk.c                                |  73 +---
 mm/pgtable-generic.c                         |  21 +
 31 files changed, 1089 insertions(+), 617 deletions(-)

-- 
2.26.2
