linux-kernel - Re: [RFC PATCH RESEND v2 1/2] mm/memfd: Add support for F_SEAL_FUTURE

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CABi2SkVmdxuETrgucYA2RucV3D4UoaPkDrXZKvLGjfEGp1-v2A@mail.gmail.com>
Date: Mon, 6 Jan 2025 09:35:09 -0800
From: Jeff Xu <jeffxu@...omium.org>
To: Jann Horn <jannh@...gle.com>, Kees Cook <keescook@...omium.org>
Cc: "Isaac J. Manjarres" <isaacmanjarres@...gle.com>, lorenzo.stoakes@...cle.com, 
	Jeff Layton <jlayton@...nel.org>, Chuck Lever <chuck.lever@...cle.com>, 
	Alexander Aring <alex.aring@...il.com>, Andrew Morton <akpm@...ux-foundation.org>, 
	Shuah Khan <shuah@...nel.org>, surenb@...gle.com, kaleshsingh@...gle.com, 
	jstultz@...gle.com, aliceryhl@...gle.com, jeffxu@...gle.com, kees@...nel.org, 
	kernel-team@...roid.com, linux-fsdevel@...r.kernel.org, 
	linux-kernel@...r.kernel.org, linux-mm@...ck.org, 
	linux-kselftest@...r.kernel.org
Subject: Re: [RFC PATCH RESEND v2 1/2] mm/memfd: Add support for
 F_SEAL_FUTURE_EXEC to memfd

+ Kees because this is related to W^X memfd and security.

On Fri, Jan 3, 2025 at 7:04 AM Jann Horn <jannh@...gle.com> wrote:
>
> On Fri, Jan 3, 2025 at 12:32 AM Isaac J. Manjarres
> <isaacmanjarres@...gle.com> wrote:
> > Android currently uses the ashmem driver [1] for creating shared memory
> > regions between processes. Ashmem buffers can initially be mapped with
> > PROT_READ, PROT_WRITE, and PROT_EXEC. Processes can then use the
> > ASHMEM_SET_PROT_MASK ioctl command to restrict--never add--the
> > permissions that the buffer can be mapped with.
> >
> > Processes can remove the ability to map ashmem buffers as executable to
> > ensure that those buffers cannot be exploited to run unintended code.
>
> Is there really code out there that first maps an ashmem buffer with
> PROT_EXEC, then uses the ioctl to remove execute permission for future
> mappings? I don't see why anyone would do that.
>
> > For instance, suppose process A allocates a memfd that is meant to be
> > read and written by itself and another process, call it B.
> >
> > Process A shares the buffer with process B, but process B injects code
> > into the buffer, and compromises process A, such that it makes A map
> > the buffer with PROT_EXEC. This provides an opportunity for process A
> > to run the code that process B injected into the buffer.
> >
> > If process A had the ability to seal the buffer against future
> > executable mappings before sharing the buffer with process B, this
> > attack would not be possible.
>
> I think if you want to enforce such restrictions in a scenario where
> the attacker can already make the target process perform
> semi-arbitrary syscalls, it would probably be more reliable to enforce
> rules on executable mappings with something like SELinux policy and/or
> F_SEAL_EXEC.
>
I would like to second on the suggestion of  making this as part of F_SEAL_EXEC.

> > Android is currently trying to replace ashmem with memfd. However, memfd
> > does not have a provision to permanently remove the ability to map a
> > buffer as executable, and leaves itself open to the type of attack
> > described earlier. However, this should be something that can be
> > achieved via a new file seal.
> >
> > There are known usecases (e.g. CursorWindow [2]) where a process
> > maps a buffer with read/write permissions before restricting the buffer
> > to being mapped as read-only for future mappings.
>
> Here you're talking about write permission, but the patch is about
> execute permission?
>
> > The resulting VMA from the writable mapping has VM_MAYEXEC set, meaning
> > that mprotect() can change the mapping to be executable. Therefore,
> > implementing the seal similar to F_SEAL_WRITE would not be appropriate,
> > since it would not work with the CursorWindow usecase. This is because
> > the CursorWindow process restricts the mapping permissions to read-only
> > after the writable mapping is created. So, adding a file seal for
> > executable mappings that operates like F_SEAL_WRITE would fail.
> >
> > Therefore, add support for F_SEAL_FUTURE_EXEC, which is handled
> > similarly to F_SEAL_FUTURE_WRITE. This ensures that CursorWindow can
> > continue to create a writable mapping initially, and then restrict the
> > permissions on the buffer to be mappable as read-only by using both
> > F_SEAL_FUTURE_WRITE and F_SEAL_FUTURE_EXEC. After the seal is
> > applied, any calls to mmap() with PROT_EXEC will fail.
> >
> > [1] https://cs.android.com/android/kernel/superproject/+/common-android-mainline:common/drivers/staging/android/ashmem.c
> > [2] https://developer.android.com/reference/android/database/CursorWindow
> >
> > Signed-off-by: Isaac J. Manjarres <isaacmanjarres@...gle.com>
> > ---
> >  include/uapi/linux/fcntl.h |  1 +
> >  mm/memfd.c                 | 39 +++++++++++++++++++++++++++++++++++++-
> >  2 files changed, 39 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
> > index 6e6907e63bfc..ef066e524777 100644
> > --- a/include/uapi/linux/fcntl.h
> > +++ b/include/uapi/linux/fcntl.h
> > @@ -49,6 +49,7 @@
> >  #define F_SEAL_WRITE   0x0008  /* prevent writes */
> >  #define F_SEAL_FUTURE_WRITE    0x0010  /* prevent future writes while mapped */
> >  #define F_SEAL_EXEC    0x0020  /* prevent chmod modifying exec bits */
> > +#define F_SEAL_FUTURE_EXEC     0x0040 /* prevent future executable mappings */
> >  /* (1U << 31) is reserved for signed error codes */
> >
> >  /*
> > diff --git a/mm/memfd.c b/mm/memfd.c
> > index 5f5a23c9051d..cfd62454df5e 100644
> > --- a/mm/memfd.c
> > +++ b/mm/memfd.c
> > @@ -184,6 +184,7 @@ static unsigned int *memfd_file_seals_ptr(struct file *file)
> >  }
> >
> >  #define F_ALL_SEALS (F_SEAL_SEAL | \
> > +                    F_SEAL_FUTURE_EXEC |\
> >                      F_SEAL_EXEC | \
> >                      F_SEAL_SHRINK | \
> >                      F_SEAL_GROW | \
> > @@ -357,14 +358,50 @@ static int check_write_seal(unsigned long *vm_flags_ptr)
> >         return 0;
> >  }
> >
> > +static inline bool is_exec_sealed(unsigned int seals)
> > +{
> > +       return seals & F_SEAL_FUTURE_EXEC;
> > +}
> > +
> > +static int check_exec_seal(unsigned long *vm_flags_ptr)
> > +{
> > +       unsigned long vm_flags = *vm_flags_ptr;
> > +       unsigned long mask = vm_flags & (VM_SHARED | VM_EXEC);
> > +
> > +       /* Executability is not a concern for private mappings. */
> > +       if (!(mask & VM_SHARED))
> > +               return 0;
>
> Why is it not a concern for private mappings?
>
> > +       /*
> > +        * New PROT_EXEC and MAP_SHARED mmaps are not allowed when exec seal
> > +        * is active.
> > +        */
> > +       if (mask & VM_EXEC)
> > +               return -EPERM;
> > +
> > +       /*
> > +        * Prevent mprotect() from making an exec-sealed mapping executable in
> > +        * the future.
> > +        */
> > +       *vm_flags_ptr &= ~VM_MAYEXEC;
> > +
> > +       return 0;
> > +}
> > +
> >  int memfd_check_seals_mmap(struct file *file, unsigned long *vm_flags_ptr)
> >  {
> >         int err = 0;
> >         unsigned int *seals_ptr = memfd_file_seals_ptr(file);
> >         unsigned int seals = seals_ptr ? *seals_ptr : 0;
> >
> > -       if (is_write_sealed(seals))
> > +       if (is_write_sealed(seals)) {
> >                 err = check_write_seal(vm_flags_ptr);
> > +               if (err)
> > +                       return err;
> > +       }
> > +
> > +       if (is_exec_sealed(seals))
> > +               err = check_exec_seal(vm_flags_ptr);
> >
memfd_check_seals_mmap is only for mmap() path, right ?

How about the mprotect()  path ? i.e.  An attacker can first create a
RW VMA mapping for the memfd and later mprotect the VMA to be
executable.

Similar to the check_write_seal call , we might want to block mprotect
for write seal as well.

> >         return err;
> >  }
> > --
> > 2.47.1.613.gc27f4b7a9f-goog
> >
> >
> >
>