Message-ID: <CALmYWFs18vUwXx5p-VxNO5BZ0wvaHE54cG8n_+UdAL5-etAK=w@mail.gmail.com>
Date: Tue, 28 May 2024 10:56:12 -0700
From: Jeff Xu <jeffxu@...gle.com>
To: Aleksa Sarai <cyphar@...har.com>, Jeff Xu <jeffxu@...omium.org>
Cc: David Rheinsberg <david@...dahead.eu>, Barnabás Pőcze <pobrn@...tonmail.com>, 
	Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org, 
	linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org, 
	dmitry.torokhov@...il.com, Daniel Verkamp <dverkamp@...omium.org>, hughd@...gle.com, 
	jorgelo@...omium.org, skhan@...uxfoundation.org, 
	Kees Cook <keescook@...omium.org>
Subject: Re: [PATCH v1] memfd: `MFD_NOEXEC_SEAL` should not imply `MFD_ALLOW_SEALING`

Hi Aleksa,

On Fri, May 24, 2024 at 9:12 AM Aleksa Sarai <cyphar@...har.com> wrote:
>
> On 2024-05-23, Jeff Xu <jeffxu@...gle.com> wrote:

> > Regarding vm.memfd_noexec, on another topic.
> > I think in addition to  vm.memfd_noexec = 1 and 2,  there still could
> > be another state: 3
> >
> > =0. Do nothing.
> > =1. This will add MFD_NOEXEC_SEAL if application didn't set MFD_EXEC or
> > MFD_NOEXEC_SEAL (to help with the migration)
> > =2: This will reject all calls without MFD_NOEXEC_SEAL (the whole
> > system doesn't allow executable memfd)
> > =3:  Application must set MFD_EXEC or MFD_NOEXEC_SEAL explicitly, or
> > else it will be rejected.
> >
> > 3 is useful because it lets applications choose what to use, and
> > forces applications to migrate to new semantics (this is what 2 did
> > before 9876cfe8).
> > The caveat is 3 is less restrictive than 2, so must document it clearly.
>
> As discussed at the time, "you must use this flag" is not a useful
> setting for a general purpose operating system because it explicitly
> disables backwards compatibility (breaking any application that was
> written in the past 10 years!) for no reason other than "new is better".
>
Are you referring to the ratcheting in the sysctl in my original patch,
or is this something else?
From the admin's point of view, I do not disagree with your change
removing the ratcheting.

> As I suggested when we fixed the semantics of vm.memfd_noexec, if you
> really want to block a particular flag from not being set, seccomp lets
> you do this incredibly easily without acting as a footgun for admins.

seccomp can do this, but it requires more work for the container, e.g.
the container needs to allow-list all the syscalls. I'm trying to point
out that seccomp might not cover all use cases.

"Ratcheting" in vm.memfd_noexec is lightweight and can be applied
to the container's sandbox in advance, but since the admin doesn't
like ratcheting in a sysctl, maybe prctl or an LSM is the way to
implement such a restriction.

> Yes, vm.memfd_noexec can break programs that use executable memfds, but
> that is the point of the sysctl -- making vm.memfd_noexec break programs
> that don't use executable memfds (they are only guilty of being written
> before mid-2023) is not useful.
>
> In addition, making 3 less restrictive than 2 would make the original
> restriction mechanism useless. A malicious process could raise the
> setting to 3 and disable the "protection" (as discussed before, I really
> don't understand the threat model here, but making it possible to
> disable easily is pretty clearly).
> You could change the policy, but now
> you're adding more complexity for a feature that IMO doesn't really make
> sense in the first place.
>
The reason for 3 is to help with migration (not the threat model), e.g.
a container can force every app running in the container to migrate its
memfd_create calls to use either MFD_EXEC or MFD_NOEXEC_SEAL.
But I understand what you mean: with the current code, adding 3 would
make vm.memfd_noexec more confusing. Perhaps a new sysctl or prctl
is the way to go if the app wants to force migration.
In hindsight, two sysctls would have worked better: the first to deal
with migration, the second to enforce MFD_NOEXEC_SEAL.
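
To make the four states discussed above concrete, here is a rough
Python model of how each vm.memfd_noexec value would treat the flags
passed to memfd_create. It is only a sketch of the descriptions in this
thread, not the kernel code: the function name apply_memfd_noexec is
made up, value 3 is the proposal (it does not exist in the kernel),
PermissionError stands in for whatever error the kernel would return,
and the case of both flags being set together is not handled.

```python
# Flag values from the Linux UAPI header <linux/memfd.h>.
MFD_NOEXEC_SEAL = 0x0008
MFD_EXEC = 0x0010


def apply_memfd_noexec(level: int, flags: int) -> int:
    """Return the flags memfd_create would effectively use, or raise.

    Hypothetical helper: levels 0-2 follow the descriptions quoted in
    this thread; level 3 models the proposed (non-existent) state.
    """
    explicit = flags & (MFD_EXEC | MFD_NOEXEC_SEAL)

    if level == 0:
        # "Do nothing": leave the caller's flags untouched.
        return flags
    if level == 1:
        # Migration aid: default to MFD_NOEXEC_SEAL when neither is set.
        return flags if explicit else flags | MFD_NOEXEC_SEAL
    if level == 2:
        # System-wide ban on executable memfds.
        if not flags & MFD_NOEXEC_SEAL:
            raise PermissionError("memfd without MFD_NOEXEC_SEAL rejected")
        return flags
    if level == 3:
        # Proposed: the caller must choose one flag explicitly.
        if not explicit:
            raise PermissionError("MFD_EXEC or MFD_NOEXEC_SEAL required")
        return flags
    raise ValueError(f"unknown vm.memfd_noexec level: {level}")
```

In this model a legacy call with flags == 0 passes at levels 0 and 1
but is rejected at levels 2 and 3, while a call with MFD_EXEC is
rejected only at level 2. That is the sense in which 3 is less
restrictive than 2.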

Thanks
-Jeff


> > -Jeff
> >
> > > Reviewed-by: David Rheinsberg <david@...dahead.eu>
> > >
> > > Thanks
> > > David
>
> --
> Aleksa Sarai
> Senior Software Engineer (Containers)
> SUSE Linux GmbH
> <https://www.cyphar.com/>
