Message-ID: <cover.1752586090.git.lorenzo.stoakes@oracle.com>
Date: Tue, 15 Jul 2025 14:37:37 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: "Liam R . Howlett" <Liam.Howlett@...cle.com>,
David Hildenbrand <david@...hat.com>, Vlastimil Babka <vbabka@...e.cz>,
Jann Horn <jannh@...gle.com>, Pedro Falcato <pfalcato@...e.de>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Jeff Xu <jeffxu@...omium.org>
Subject: [PATCH v2 0/5] mseal cleanups, fixup MAP_PRIVATE file-backed case

Perform a number of cleanups to the mseal logic. Firstly, VM_SEALED is
treated differently from every other VMA flag. It really doesn't make sense
to do this, so we start by making this consistent with everything else.
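To illustrate what "consistent with everything else" can look like: a common
convention for flags that are unavailable in some configurations is to define
them as an empty mask, so that callers can test them without #ifdef guards. The
sketch below is a userspace model of that convention, not the kernel's
definition. The names MODEL_VM_NONE, MODEL_VM_SEALED and model_is_sealed(), and
the bit position, are assumptions made purely for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model names; these are not the kernel's definitions. */
typedef uint64_t model_vm_flags_t;

#define MODEL_VM_NONE ((model_vm_flags_t)0)

#if UINTPTR_MAX > 0xffffffffu
/* 64-bit build: the flag occupies a real (illustrative) bit. */
#define MODEL_VM_SEALED ((model_vm_flags_t)1 << 42)
#else
/* 32-bit build: the flag is still defined, but as an empty mask. */
#define MODEL_VM_SEALED MODEL_VM_NONE
#endif

/* No #ifdef needed here: with an empty mask this is simply always false. */
static bool model_is_sealed(model_vm_flags_t flags)
{
	return (flags & MODEL_VM_SEALED) != 0;
}
```

With the flag always defined, every caller can use the same test and the
compiler discards the check where the mask is empty.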
Next we place the madvise logic where it belongs - in mm/madvise.c. It
really makes no sense to abstract this elsewhere. In doing so, we explain
clearly the previously confusing logic governing which sealed mappings are
affected here.
This also fixes an existing logical oversight - previously we permitted
an madvise() discard operation for a sealed, read-only MAP_PRIVATE
file-backed mapping.
However, this is incorrect. To see why, consider:
1. A MAP_PRIVATE R/W file-backed mapping is established.
2. The mapping is written to, which backs it with anonymous memory.
3. The mapping is mprotect()'d read-only.
4. The mapping is mseal()'d.
At this point you have data that, once sealed, a user cannot alter, but a
discard operation can unrecoverably remove. This contradicts the semantics
of mseal(), so should not be permitted.
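The four steps above can be reproduced from userspace. What follows is a
minimal sketch, not part of the series: seal_demo() and struct
seal_demo_result are hypothetical names, MADV_DONTNEED is used as one example
of a discard operation, the temporary file under /tmp is an assumption, and the
fallback syscall number 462 for mseal is an assumption for systems whose
headers do not define __NR_mseal. The function records what happened rather
than asserting any particular kernel behaviour.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: mseal is syscall 462 where the headers do not define it. */
#ifndef __NR_mseal
#define __NR_mseal 462
#endif

/* Hypothetical result record for this demo. */
struct seal_demo_result {
	int setup_ok;      /* 1 if the file and mapping were set up */
	int mseal_errno;   /* 0 if mseal() succeeded, else errno (e.g. ENOSYS) */
	int madvise_errno; /* 0 if the discard succeeded, else errno */
	char before;       /* first byte just before the discard attempt */
	char after;        /* first byte just after the discard attempt */
};

static struct seal_demo_result seal_demo(void)
{
	struct seal_demo_result res = { 0 };
	char path[] = "/tmp/mseal-demo-XXXXXX";
	long page = sysconf(_SC_PAGESIZE);
	char *map;
	int fd;

	if (page <= 0)
		return res;
	fd = mkstemp(path);
	if (fd < 0)
		return res;
	unlink(path);

	/* The file's first byte is 'F'; the rest of the page is zero. */
	if (pwrite(fd, "F", 1, 0) != 1 || ftruncate(fd, page) != 0) {
		close(fd);
		return res;
	}

	/* 1. Establish a MAP_PRIVATE R/W file-backed mapping. */
	map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	close(fd);
	if (map == MAP_FAILED)
		return res;

	/* 2. Write to it, so the page is now backed by anonymous memory. */
	map[0] = 'A';

	/* 3. Make the mapping read-only. */
	if (mprotect(map, (size_t)page, PROT_READ) != 0) {
		munmap(map, (size_t)page);
		return res;
	}
	res.setup_ok = 1;

	/* 4. Seal it (this may fail, e.g. with ENOSYS on older kernels). */
	if (syscall(__NR_mseal, map, (size_t)page, 0UL) != 0)
		res.mseal_errno = errno;

	/* Attempt the discard and record whether the private data survived. */
	res.before = map[0];
	if (madvise(map, (size_t)page, MADV_DONTNEED) != 0)
		res.madvise_errno = errno;
	res.after = map[0];

	/* A sealed mapping cannot be unmapped; it is left for process exit. */
	return res;
}
```

If the discard succeeds, the private copy is dropped and `after` reads back the
file's 'F' rather than the 'A' written in step 2, which is the unrecoverable
loss described above. If the discard is refused, `after` is still 'A'. A kernel
with this series applied would be expected to refuse a discard on a
successfully sealed mapping, though the sketch itself does not depend on that.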
We then abstract out and explain the 'are there any gaps in this range
in the mm?' check being performed as a prerequisite to mseal being
performed.
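As a rough illustration of what such a gap check involves, here is a
self-contained userspace model, not the kernel code. It walks a sorted array of
address ranges standing in for VMAs; struct model_range and
model_range_contains_unmapped() are hypothetical names, and the sorted,
non-overlapping input is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA: a half-open range [start, end). */
struct model_range {
	unsigned long start;
	unsigned long end;
};

/*
 * Returns true if any part of [start, end) is not covered by maps[].
 * Assumes maps[] is sorted by address and non-overlapping.
 */
static bool model_range_contains_unmapped(const struct model_range *maps,
					  size_t count,
					  unsigned long start,
					  unsigned long end)
{
	unsigned long prev_end = start;
	size_t i;

	assert(start < end);

	for (i = 0; i < count; i++) {
		/* Skip mappings that end before the range begins. */
		if (maps[i].end <= start)
			continue;
		/* Stop once mappings begin beyond the range. */
		if (maps[i].start >= end)
			break;
		/* A hole between what is covered so far and this mapping. */
		if (maps[i].start > prev_end)
			return true;
		prev_end = maps[i].end;
	}

	/* Any uncovered tail is also a gap. */
	return prev_end < end;
}
```

The single prev_end cursor is what keeps the check simple: a gap exists either
when a mapping starts past the cursor or when the cursor stops short of the end
of the range.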
Finally, we simplify the actual mseal logic which is really quite
straightforward.
v2:
* Propagated tags, thanks everyone!
* Updated can_madvise_modify() to a more logical order re: the checks
performed, as per David.
* Replaced vma_is_anonymous() check (which was, in the original code, a
vma->vm_file or vma->vm_ops check) with a vma->vm_flags & VM_SHARED
check - to explicitly check for shared mappings vs private to preclude
MAP_PRIVATE file-backed mappings, as per David.
* Made range_contains_unmapped() static and placed in mm/mseal.c to avoid
encouraging any other internal users towards this rather silly pattern,
as per Pedro and Liam.
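For readers following the can_madvise_modify() changes listed above, the
decision can be modelled in userspace roughly as below. This is a sketch under
stated assumptions, not the patch itself: the SKETCH_VM_* bits, struct
sketch_vma and sketch_can_madvise_modify() are hypothetical, the ordering of
the checks is one plausible reading of "a more logical order", and any
architecture-specific permission checks are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits for this model only; not the kernel's values. */
#define SKETCH_VM_WRITE  (1UL << 0)
#define SKETCH_VM_SHARED (1UL << 1)
#define SKETCH_VM_SEALED (1UL << 2)

/* Hypothetical stand-in for a VMA. */
struct sketch_vma {
	unsigned long flags;
};

/* Model: may this madvise() operation proceed on this mapping? */
static bool sketch_can_madvise_modify(const struct sketch_vma *vma,
				      bool is_discard)
{
	assert(vma != NULL);

	/* Unsealed mappings are not restricted. */
	if (!(vma->flags & SKETCH_VM_SEALED))
		return true;

	/* For sealed mappings, only discard operations are of concern. */
	if (!is_discard)
		return true;

	/* Shared mappings hold no private copy for a discard to destroy. */
	if (vma->flags & SKETCH_VM_SHARED)
		return true;

	/* If the user could overwrite the data anyway, discarding is fine. */
	if (vma->flags & SKETCH_VM_WRITE)
		return true;

	/* Sealed, private, read-only: a discard would lose data. */
	return false;
}
```

The case the cover letter calls out, a sealed, read-only MAP_PRIVATE
file-backed mapping, falls through every early return and is refused, because
the model tests for shared versus private rather than file-backed versus
anonymous.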
v1:
https://lore.kernel.org/all/cover.1752497324.git.lorenzo.stoakes@oracle.com/
Lorenzo Stoakes (5):
  mm/mseal: always define VM_SEALED
  mm/mseal: update madvise() logic
  mm/mseal: small cleanups
  mm/mseal: Simplify and rename VMA gap check
  mm/mseal: rework mseal apply logic
 include/linux/mm.h                      |   6 +-
 mm/madvise.c                            |  63 ++++++++-
 mm/mseal.c                              | 169 ++++++------------------
 mm/vma.h                                |  23 +---
 tools/testing/selftests/mm/mseal_test.c |   3 +-
 tools/testing/vma/vma_internal.h        |   6 +-
 6 files changed, 110 insertions(+), 160 deletions(-)
--
2.50.1