lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date: Fri, 31 May 2024 12:32:12 -0400
From: "Liam R. Howlett" <Liam.Howlett@...cle.com>
To: Suren Baghdasaryan <surenb@...gle.com>,
        Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc: Vlastimil Babka <vbabka@...e.cz>, sidhartha.kumar@...cle.com,
        Matthew Wilcox <willy@...radead.org>,
        Lorenzo Stoakes <lstoakes@...il.com>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>,
        linux-fsdevel@...r.kernel.org, bpf@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: [RFC PATCH 0/5] Avoid MAP_FIXED gap exposure

It is now possible to walk the vma tree using the rcu read locks and is
beneficial to do so to reduce lock contention.  Doing so while a
MAP_FIXED mapping is executing means that a reader may see a gap in the
vma tree that should never logically exist - and does not when using the
mmap lock in read mode.  The temporal gap exists because mmap_region()
calls munmap() prior to installing the new mapping.

This patch set stops rcu readers from seeing the temporal gap by
splitting up the munmap() function into two parts.  The first part
prepares the vma tree for modifications by doing the necessary splits
and tracks the vmas in a side tree.  The second part completes the
munmapping of the vmas after the vma tree has been overwritten (either
by a MAP_FIXED replacement vma or by a NULL in the munmap() case).

Please note that rcu walkers will still be able to see a temporary state
of split vmas that may be in the process of being removed, but the
temporal gap will not be exposed.  vma_start_write() are called on both
parts of the split vma, so this state is detectable.

I am sending this as an RFC as Andrii Nakryiko [1] and Suren
Baghdasaryan are both working on features that require the vma tree to
avoid exposing this temporal gap to rcu readers.

[1] https://lore.kernel.org/all/gkhzuurhqhtozk6u53ufkesbhtjse5ba6kovqm7mnzrqe3szma@3tpbspq7hxjl/

Liam R. Howlett (5):
  mm/mmap: Correctly position vma_iterator in __split_vma()
  mm/mmap: Split do_vmi_align_munmap() into a gather and complete
    operation
  mm/mmap: Introduce vma_munmap_struct for use in munmap operations
  mm/mmap: Change munmap to use vma_munmap_struct() for accounting and
    surrounding vmas
  mm/mmap: Use split munmap calls for MAP_FIXED

 mm/internal.h |  22 +++
 mm/mmap.c     | 382 +++++++++++++++++++++++++++++++-------------------
 2 files changed, 258 insertions(+), 146 deletions(-)

-- 
2.43.0


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ