[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAG48ez1OQWsaFGBs_EPz+9C=FwddyKG=r1PMbfDtwHNow2SYOA@mail.gmail.com>
Date: Thu, 28 Nov 2024 18:45:46 +0100
From: Jann Horn <jannh@...gle.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, Julian Orth <ju.orth@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, "Liam R . Howlett" <Liam.Howlett@...cle.com>,
Vlastimil Babka <vbabka@...e.cz>, Shuah Khan <shuah@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] mm: reinstate ability to map write-sealed memfd
mappings read-only
On Thu, Nov 28, 2024 at 4:06 PM Lorenzo Stoakes
<lorenzo.stoakes@...cle.com> wrote:
> In commit 158978945f31 ("mm: perform the mapping_map_writable() check after
> call_mmap()") (and preceding changes in the same series) it became possible
> to mmap() F_SEAL_WRITE sealed memfd mappings read-only.
>
> This was previously unnecessarily disallowed, despite the man page
> documentation indicating that it would be, thereby limiting the usefulness
> of F_SEAL_WRITE logic.
>
> We fixed this by adapting logic that existed for the F_SEAL_FUTURE_WRITE
> seal (one which disallows future writes to the memfd) to also be used for
> F_SEAL_WRITE.
>
> For background - the F_SEAL_FUTURE_WRITE seal clears VM_MAYWRITE for a
> read-only mapping to disallow mprotect() from overriding the seal - an
> operation performed by seal_check_write(), invoked from shmem_mmap(), the
> f_op->mmap() hook used by shmem mappings.
>
> By extending this to F_SEAL_WRITE and critically - checking
> mapping_map_writable() to determine if we may map the memfd AFTER we invoke
> shmem_mmap() - the desired logic becomes possible. This is because
> mapping_map_writable() explicitly checks for VM_MAYWRITE, which we will
> have cleared.
>
> Commit 5de195060b2e ("mm: resolve faulty mmap_region() error path
> behaviour") unintentionally undid this logic by moving the
> mapping_map_writable() check before the shmem_mmap() hook is invoked,
> thereby regressing this change.
>
> We reinstate this functionality by moving the check out of shmem_mmap() and
> instead performing it in do_mmap() at the point at which VMA flags are
> being determined, which seems in any case to be a more appropriate place in
> which to make this determination.
>
> In order to achieve this we rework memfd seal logic to allow us access to
> this information using existing logic and eliminate the clearing of
> VM_MAYWRITE from seal_check_write() which we are performing in do_mmap()
> instead.
If we already check is_readonly_sealed() and strip VM_MAYWRITE in
do_mmap(), without holding any kind of lock or counter on the file
yet, then this check is clearly racy somehow, right? I think we have a
race where we intermittently reject shared-readonly mmap() calls?
Like:
process 1: calls mmap(PROT_READ, MAP_PRIVATE), checks is_readonly_sealed()
process 2: adds a F_SEAL_WRITE seal
process 1: enters mmap_region(), is_shared_maywrite() is true,
mapping_map_writable() fails
But even if we fix that, the same scenario would result in
F_SEAL_WRITE randomly failing depending on the ordering, so it's not
like we can actually do anything particularly sensible if these two
operations race. Taking a step back, read-only shared mappings of
F_SEAL_WRITE-sealed files are just kind of a bad idea because if
someone first creates a read-only shared mapping and *then* tries to
apply F_SEAL_WRITE, that won't work because the existing mapping will
be VM_MAYWRITE.
And the manpage is just misleading on interaction with shared mappings
in general, it says "Using the F_ADD_SEALS operation to set the
F_SEAL_WRITE seal fails with EBUSY if any writable, shared mapping
exists" when actually, it more or less fails if any shared mapping
exists at all.
@Julian Orth: Did you report this regression because this change
caused issues with existing userspace code?
> Reported-by: Julian Orth <ju.orth@...il.com>
> Closes: https://lore.kernel.org/all/CAHijbEUMhvJTN9Xw1GmbM266FXXv=U7s4L_Jem5x3AaPZxrYpQ@mail.gmail.com/
> Fixes: 5de195060b2e ("mm: resolve faulty mmap_region() error path behaviour")
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Powered by blists - more mailing lists