Message-ID: <20240524.160158-custard.odds.smutty.cuff-caukvmB4EWP9@cyphar.com>
Date: Fri, 24 May 2024 09:12:14 -0700
From: Aleksa Sarai <cyphar@...har.com>
To: Jeff Xu <jeffxu@...gle.com>
Cc: David Rheinsberg <david@...dahead.eu>, 
	Barnabás Pőcze <pobrn@...tonmail.com>, Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org, 
	linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org, dmitry.torokhov@...il.com, 
	Daniel Verkamp <dverkamp@...omium.org>, hughd@...gle.com, jorgelo@...omium.org, 
	skhan@...uxfoundation.org, Kees Cook <keescook@...omium.org>
Subject: Re: [PATCH v1] memfd: `MFD_NOEXEC_SEAL` should not imply
 `MFD_ALLOW_SEALING`

On 2024-05-23, Jeff Xu <jeffxu@...gle.com> wrote:
> On Thu, May 23, 2024 at 1:24 AM David Rheinsberg <david@...dahead.eu> wrote:
> >
> > Hi
> >
> > On Thu, May 23, 2024, at 4:25 AM, Barnabás Pőcze wrote:
> > > On Thursday, May 23, 2024 at 1:23, Andrew Morton
> > > <akpm@...ux-foundation.org> wrote:
> > >> It's a change to a userspace API, yes?  Please let's have a detailed
> > >> description of why this is OK.  Why it won't affect any existing users.
> > >
> > > Yes, it is a uAPI change. To trigger a user-visible change, a program has to
> > >
> > >  - create a memfd
> > >    - with MFD_NOEXEC_SEAL,
> > >    - without MFD_ALLOW_SEALING;
> > >  - try to add seals / check the seals.
> > >
> > > This change in essence reverts the kernel's behaviour to that of Linux <6.3,
> > > where only `MFD_ALLOW_SEALING` enabled sealing. If a program works correctly
> > > on those kernels, it will likely work correctly after this change.
> > >
> > > I have looked through Debian Code Search and GitHub, searching for
> > > `MFD_NOEXEC_SEAL`. And I could find only a single breakage that this change
> > > would cause: dbus-broker has its own memfd_create() wrapper that is aware of
> > > this implicit `MFD_ALLOW_SEALING` behaviour[0], and tries to work around it.
> > > This workaround will break. Luckily, however, as far as I could tell this
> > > only affects the test suite of dbus-broker, not its normal operations, so I
> > > believe it should be fine. I have prepared a PR with a fix[1].
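
For reference, the scenario in the list quoted above (a memfd created with
MFD_NOEXEC_SEAL but without MFD_ALLOW_SEALING, followed by checking and adding
seals) can be reproduced with a small probe. This is only a minimal sketch:
`probe_noexec_seal` and `struct seal_probe` are hypothetical names, not taken
from the patch or from dbus-broker, and the fallback flag values are copied
from the Linux uAPI headers in case the libc headers are older. On a kernel
without MFD_NOEXEC_SEAL support, memfd_create() is expected to fail.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallbacks for older libc headers; values are from the Linux uAPI headers. */
#ifndef MFD_NOEXEC_SEAL
#define MFD_NOEXEC_SEAL 0x0008U
#endif
#ifndef F_SEAL_EXEC
#define F_SEAL_EXEC 0x0020
#endif

/* Hypothetical result type for this sketch. */
struct seal_probe {
    int created;  /* 1 if memfd_create() succeeded */
    int err;      /* errno from memfd_create() on failure, else 0 */
    int seals;    /* F_GET_SEALS result, or -1 if that call failed */
    int can_add;  /* 1 if F_ADD_SEALS(F_SEAL_WRITE) succeeded */
};

/* Create a memfd with MFD_NOEXEC_SEAL only, then check and add seals. */
struct seal_probe probe_noexec_seal(void)
{
    struct seal_probe p = { 0, 0, -1, 0 };
    int fd = memfd_create("probe", MFD_CLOEXEC | MFD_NOEXEC_SEAL);

    if (fd < 0) {
        p.err = errno;
        return p;
    }
    p.created = 1;
    p.seals = fcntl(fd, F_GET_SEALS);
    p.can_add = (fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE) == 0);
    close(fd);
    return p;
}
```

With the implicit-sealing behaviour described above, `can_add` would be
expected to be 1; if sealing is again enabled only by MFD_ALLOW_SEALING, the
F_ADD_SEALS call would be expected to fail and `can_add` to be 0.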
> >
> > We asked for exactly this fix before, so I very much support this. Our test-suite in `dbus-broker` merely verifies what the current kernel behavior is (just like the kernel selftests). I am certainly ok if the kernel breaks it. I will gladly adapt the test-suite.
> >
> > Previous discussion was in:
> >
> >     [PATCH] memfd: support MFD_NOEXEC alongside MFD_EXEC
> >     https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
> >
> > Note that this fix is particularly important in combination with `vm.memfd_noexec=2`, since this breaks existing user-space by enabling sealing on all memfds unconditionally. I also encourage backporting to stable kernels.
> >
> Also with vm.memfd_noexec=1.
> I think that problem must be addressed either with this patch, or with
> a new flag.
> 
> On another topic, regarding vm.memfd_noexec:
> I think in addition to vm.memfd_noexec = 1 and 2, there could still
> be another state: 3
> 
> =0: Do nothing.
> =1: This will add MFD_NOEXEC_SEAL if the application didn't set MFD_EXEC or
> MFD_NOEXEC_SEAL (to help with the migration).
> =2: This will reject all calls without MFD_NOEXEC_SEAL (the whole
> system doesn't allow executable memfd).
> =3: The application must set MFD_EXEC or MFD_NOEXEC_SEAL explicitly, or
> else it will be rejected.
> 
> 3 is useful because it lets applications choose what to use, and
> forces applications to migrate to new semantics (this is what 2 did
> before 9876cfe8).
> The caveat is that 3 is less restrictive than 2, so it must be documented clearly.

As discussed at the time, "you must use this flag" is not a useful
setting for a general purpose operating system because it explicitly
disables backwards compatibility (breaking any application that was
written in the past 10 years!) for no reason other than "new is better".

As I suggested when we fixed the semantics of vm.memfd_noexec, if you
really want to require that a particular flag is set, seccomp lets
you do this incredibly easily without acting as a footgun for admins.
Yes, vm.memfd_noexec can break programs that use executable memfds, but
that is the point of the sysctl -- making vm.memfd_noexec break programs
that don't use executable memfds (they are only guilty of being written
before mid-2023) is not useful.
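
To illustrate the seccomp route, here is a minimal sketch of a classic-BPF
filter that makes memfd_create() fail with EACCES unless the caller passed
MFD_EXEC or MFD_NOEXEC_SEAL, which is roughly the "must set a flag explicitly"
policy. The names `memfd_filter`, `install_memfd_filter` and `SKETCH_ARCH` are
hypothetical. The sketch assumes a little-endian x86-64 or aarch64 build, does
not handle the x32 ABI, and kills the process on a foreign syscall
architecture; a real deployment would more likely use libseccomp.

```c
#include <errno.h>
#include <stddef.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>

/* Fallbacks for older headers; values are from the Linux uAPI headers. */
#ifndef MFD_NOEXEC_SEAL
#define MFD_NOEXEC_SEAL 0x0008U
#endif
#ifndef MFD_EXEC
#define MFD_EXEC 0x0010U
#endif

#if defined(__x86_64__)
#define SKETCH_ARCH AUDIT_ARCH_X86_64
#elif defined(__aarch64__)
#define SKETCH_ARCH AUDIT_ARCH_AARCH64
#else
#error "add the AUDIT_ARCH_* value for this architecture"
#endif

struct sock_filter memfd_filter[] = {
    /* 0-2: refuse to run under an unexpected syscall ABI. */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, arch)),
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SKETCH_ARCH, 1, 0),
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL_PROCESS),
    /* 3-4: anything other than memfd_create() jumps to the final allow. */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_memfd_create, 0, 3),
    /* 5-6: low 32 bits of the flags argument (little-endian layout);
     * allow if either MFD_EXEC or MFD_NOEXEC_SEAL is set. */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, args[1])),
    BPF_JUMP(BPF_JMP | BPF_JSET | BPF_K, MFD_EXEC | MFD_NOEXEC_SEAL, 1, 0),
    /* 7: neither flag was passed, so fail the call with EACCES. */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | (EACCES & SECCOMP_RET_DATA)),
    /* 8: allow. */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

/* Install the filter for the calling thread; returns 0 on success. */
int install_memfd_filter(void)
{
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(memfd_filter) / sizeof(memfd_filter[0])),
        .filter = memfd_filter,
    };

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

A service manager or container runtime could install such a filter before
exec'ing the workload, which confines the policy to the processes that opted
in rather than changing behaviour system-wide.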

In addition, making 3 less restrictive than 2 would make the original
restriction mechanism useless. A malicious process could raise the
setting to 3 and disable the "protection" (as discussed before, I really
don't understand the threat model here, but making it easy to disable is
pretty clearly a flaw). You could change the policy, but now
you're adding more complexity for a feature that IMO doesn't really make
sense in the first place.
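
The ordering problem can be made concrete with a toy model of the four states
exactly as they are worded in the quoted proposal. This is a sketch of that
wording only, not of the kernel's implementation, whose handling of individual
cases may differ; `memfd_noexec_model` and `enum memfd_verdict` are
hypothetical names.

```c
/* Flag values from the Linux uAPI headers. */
#ifndef MFD_NOEXEC_SEAL
#define MFD_NOEXEC_SEAL 0x0008U
#endif
#ifndef MFD_EXEC
#define MFD_EXEC 0x0010U
#endif

enum memfd_verdict {
    MEMFD_ALLOW,            /* call proceeds with the flags as given */
    MEMFD_ADD_NOEXEC_SEAL,  /* call proceeds, MFD_NOEXEC_SEAL is added */
    MEMFD_REJECT            /* call fails */
};

/* Model of the states as described in the quoted proposal (0-3). */
enum memfd_verdict memfd_noexec_model(int level, unsigned int flags)
{
    int has_exec = (flags & MFD_EXEC) != 0;
    int has_noexec = (flags & MFD_NOEXEC_SEAL) != 0;

    switch (level) {
    case 0: /* do nothing */
        return MEMFD_ALLOW;
    case 1: /* add MFD_NOEXEC_SEAL when neither flag was given */
        return (has_exec || has_noexec) ? MEMFD_ALLOW : MEMFD_ADD_NOEXEC_SEAL;
    case 2: /* reject every call without MFD_NOEXEC_SEAL */
        return has_noexec ? MEMFD_ALLOW : MEMFD_REJECT;
    case 3: /* proposed: one of the two flags must be given explicitly */
        return (has_exec || has_noexec) ? MEMFD_ALLOW : MEMFD_REJECT;
    default:
        return MEMFD_REJECT;
    }
}
```

In this model a request with MFD_EXEC is rejected at level 2 but allowed at
level 3, so moving the setting from 2 to 3 re-enables executable memfds even
though the number went up.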

> -Jeff
> 
> > Reviewed-by: David Rheinsberg <david@...dahead.eu>
> >
> > Thanks
> > David

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>
