lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <cover.1750433500.git.lorenzo.stoakes@oracle.com>
Date: Fri, 20 Jun 2025 16:33:00 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: David Hildenbrand <david@...hat.com>, Zi Yan <ziy@...dia.com>,
        Baolin Wang <baolin.wang@...ux.alibaba.com>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>,
        Nico Pache <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
        Dev Jain <dev.jain@....com>, Barry Song <baohua@...nel.org>,
        Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Lance Yang <ioworker0@...il.com>, SeongJae Park <sj@...nel.org>,
        Suren Baghdasaryan <surenb@...gle.com>
Subject: [PATCH v2 0/5] madvise cleanup

This is a series of patches that helps address a number of historic
problems in the madvise() implementation:

* Eliminate the visitor pattern and having the code which is implemented
  for both the anon_vma_name implementation and ordinary madvise()
  operations use the same madvise_vma_behavior() implementation.

* Thread state through the madvise_behavior state object - this object,
  very usefully introduced by SJ, is already used to transmit state through
  operations. This series extends this by having all madvise() operations
  use this, including anon_vma_name.

* Thread range, VMA state through madvise_behavior - This helps avoid a lot
  of the confusing code around range and VMA state and again keeps things
  consistent and with a single 'source of truth'.

* Addressing the very strange behaviour around the passed around struct
  vm_area_struct **prev pointer - all read-only users do absolutely nothing
  with the prev pointer. The only function that uses it is
  madvise_update_vma(), and in all cases prev is always reset to
  VMA.

  Fix this by no longer having aything but madvise_update_vma() reference
  prev, and having madvise_walk_vmas() update prev in each
  instance. Additionally make it clear that the meaningful change in vma
  state is when madvise_update_vma() potentially merges a VMA, so
  explicitly retrieve the VMA in this case.

* Update and clarify the madvise_walk_vmas() function - this is a source of
  a great deal of confusion, so simplify, stop using prev = NULL to signify
  that the mmap lock has been dropped (!) and make that explicit, and add
  some comments to explain what's going on.

v2:
* Propagated tags (thanks everyone!)
* Don't separate out __MADV_SET_ANON_VMA_NAME and __MADV_SET_CLEAR_VMA_NAME,
  just use __MADV_SET_ANON_VMA_NAME as per Zi.
* Eliminate is_anon_vma_name() as no longer necessary, addressing Zi's concern
  around naming another way :)
* Put mm_struct abstraction of try_vma_read_lock() into 2/5 from 3/5 as per Zi.
* Added comment about get/put anon_vma_name in madvise_vma_behavior() as per
  Vlastimil.
* Renamed have_anon_name to set_new_anon_name to make it clear why we make an
  exception to this get/put behaviour in madvise_vma_behavior().
* Reworded 1/4 commit message to make it clearer what's being done as per
  Vlastimil.
* Avoid comma-separated decls in struct madvise_behavior_range as per Zi and
  Vlastimil.
* Put fix for silly development bug (range->start comparison to end not
  range->end) in 3/5 rather than 4/5 so as to eliminate it altogether,
  having fixed it during development but having not put the fix in the
  correct place :) as per Vlastimil.
* Rename end to last_end in madvise_walk_vmas() and added a comment for
  clarity as per Vlastimil.
* Update madvise_walk_vmas() comment to no longer refer to a visitor
  function.
* Separated out prev, vma fields in struct madvise_behavior as per
  Vlastimil.
* Added assert on not holding VMA lock whenever mmap lock is dropped and
  abstracted to mark_mmap_lock_dropped() so we always assert when we do
  this, based on discussion with Vlastimil.
* Removed duplicate comment about weird -ENOMEM unmapped error behaviour.

v1:
https://lore.kernel.org/all/cover.1750363557.git.lorenzo.stoakes@oracle.com/

Lorenzo Stoakes (5):
  mm/madvise: remove the visitor pattern and thread anon_vma state
  mm/madvise: thread mm_struct through madvise_behavior
  mm/madvise: thread VMA range state through madvise_behavior
  mm/madvise: thread all madvise state through madv_behavior
  mm/madvise: eliminate very confusing manipulation of prev VMA

 include/linux/huge_mm.h |   9 +-
 mm/khugepaged.c         |   9 +-
 mm/madvise.c            | 585 +++++++++++++++++++++-------------------
 3 files changed, 313 insertions(+), 290 deletions(-)

--
2.49.0

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ