linux-kernel - Re: [RFC PATCH v1 1/2] fs: Add O_DENY

Message-ID: <20250826.eWi6chuayae4@digikod.net>
Date: Tue, 26 Aug 2025 14:39:47 +0200
From: Mickaël Salaün <mic@...ikod.net>
To: Jeff Xu <jeffxu@...gle.com>
Cc: Andy Lutomirski <luto@...capital.net>, Jann Horn <jannh@...gle.com>, 
	Al Viro <viro@...iv.linux.org.uk>, Christian Brauner <brauner@...nel.org>, 
	Kees Cook <keescook@...omium.org>, Paul Moore <paul@...l-moore.com>, 
	Serge Hallyn <serge@...lyn.com>, Andy Lutomirski <luto@...nel.org>, Arnd Bergmann <arnd@...db.de>, 
	Christian Heimes <christian@...hon.org>, Dmitry Vyukov <dvyukov@...gle.com>, 
	Elliott Hughes <enh@...gle.com>, Fan Wu <wufan@...ux.microsoft.com>, 
	Florian Weimer <fweimer@...hat.com>, Jonathan Corbet <corbet@....net>, 
	Jordan R Abrahams <ajordanr@...gle.com>, Lakshmi Ramasubramanian <nramas@...ux.microsoft.com>, 
	Luca Boccassi <bluca@...ian.org>, Matt Bobrowski <mattbobrowski@...gle.com>, 
	Miklos Szeredi <mszeredi@...hat.com>, Mimi Zohar <zohar@...ux.ibm.com>, 
	Nicolas Bouchinet <nicolas.bouchinet@....cyber.gouv.fr>, Robert Waite <rowait@...rosoft.com>, 
	Roberto Sassu <roberto.sassu@...wei.com>, Scott Shell <scottsh@...rosoft.com>, 
	Steve Dower <steve.dower@...hon.org>, Steve Grubb <sgrubb@...hat.com>, 
	kernel-hardening@...ts.openwall.com, linux-api@...r.kernel.org, linux-fsdevel@...r.kernel.org, 
	linux-integrity@...r.kernel.org, linux-kernel@...r.kernel.org, 
	linux-security-module@...r.kernel.org, Jeff Xu <jeffxu@...omium.org>
Subject: Re: [RFC PATCH v1 1/2] fs: Add O_DENY_WRITE

On Mon, Aug 25, 2025 at 10:57:57AM -0700, Jeff Xu wrote:
> Hi Mickaël
> 
> On Mon, Aug 25, 2025 at 2:31 AM Mickaël Salaün <mic@...ikod.net> wrote:
> >
> > On Sun, Aug 24, 2025 at 11:04:03AM -0700, Andy Lutomirski wrote:
> > > On Sun, Aug 24, 2025 at 4:03 AM Mickaël Salaün <mic@...ikod.net> wrote:
> > > >
> > > > On Fri, Aug 22, 2025 at 09:45:32PM +0200, Jann Horn wrote:
> > > > > On Fri, Aug 22, 2025 at 7:08 PM Mickaël Salaün <mic@...ikod.net> wrote:
> > > > > > Add a new O_DENY_WRITE flag usable at open time and on opened file (e.g.
> > > > > > passed file descriptors).  This changes the state of the opened file by
> > > > > > making it read-only until it is closed.  The main use case is for script
> > > > > > interpreters to get the guarantee that a script's content cannot be altered
> > > > > > while being read and interpreted.  This is useful for generic distros
> > > > > > that may not have a write-xor-execute policy.  See commit a5874fde3c08
> > > > > > ("exec: Add a new AT_EXECVE_CHECK flag to execveat(2)")
> > > > > >
> > > > > > Both execve(2) and the IOCTL to enable fsverity can already set this
> > > > > > property on files with deny_write_access().  This new O_DENY_WRITE makes
> > > > >
> > > > > The kernel actually tried to get rid of this behavior on execve() in
> > > > > commit 2a010c41285345da60cece35575b4e0af7e7bf44; but sadly that had
> > > > > to be reverted in commit 3b832035387ff508fdcf0fba66701afc78f79e3d
> > > > > because it broke userspace assumptions.
> > > >
> > > > Oh, good to know.
> > > >
> > > > >
> > > > > > it widely available.  This is similar to what other OSs may provide
> > > > > > e.g., opening a file with only FILE_SHARE_READ on Windows.
> > > > >
> > > > > We used to have the analogous mmap() flag MAP_DENYWRITE, and that was
> > > > > removed for security reasons; as
> > > > > https://man7.org/linux/man-pages/man2/mmap.2.html says:
> > > > >
> > > > > |        MAP_DENYWRITE
> > > > > |               This flag is ignored.  (Long ago—Linux 2.0 and earlier—it
> > > > > |               signaled that attempts to write to the underlying file
> > > > > > |               should fail with ETXTBSY.  But this was a source of denial-
> > > > > > |               of-service attacks.)
> > > > >
> > > > > It seems to me that the same issue applies to your patch - it would
> > > > > allow unprivileged processes to essentially lock files such that other
> > > > > processes can't write to them anymore. This might allow unprivileged
> > > > > users to prevent root from updating config files or stuff like that if
> > > > > they're updated in-place.
> > > >
> > > > Yes, I agree, but since it is the case for executed files I thought it
> > > > was worth starting a discussion on this topic.  This new flag could be
> > > > restricted to executable files, but we should avoid system-wide locks
> > > > like this.  I'm not sure how Windows handles these issues though.
> > > >
> > > > Anyway, we should rely on the access control policy to control write and
> > > > execute access in a consistent way (e.g. write-xor-execute).  Thanks for
> > > > the references and the background!
> > >
> > > I'm confused.  I understand that there are many contexts in which one
> > > would want to prevent execution of unapproved content, which might
> > > include preventing a given process from modifying some code and then
> > > executing it.
> > >
> > > I don't understand what these deny-write features have to do with it.
> > > These features merely prevent someone from modifying code *that is
> > > currently in use*, which is not at all the same thing as preventing
> > > modifying code that might get executed -- one can often modify
> > > contents *before* executing those contents.
> >
> > The order of checks would be:
> > 1. open script with O_DENY_WRITE
> > 2. check executability with AT_EXECVE_CHECK
> > 3. read the content and interpret it
> >
> I'm not sure about the O_DENY_WRITE approach, but the problem is worth solving.
> 
> AT_EXECVE_CHECK is not just for scripting languages. It could also
> work with bytecodes like Java, for example. If we let the Java runtime
> call AT_EXECVE_CHECK before loading the bytecode, the LSM could
> develop a policy based on that.

Sure, I'm using "script" to make it simple, but this applies to other
use cases.

> 
> > The deny-write feature was to guarantee that there is no race condition
> > between step 2 and 3.  All these checks are supposed to be done by a
> > trusted interpreter (which is allowed to be executed).  The
> > AT_EXECVE_CHECK call enables the caller to know if the kernel (and
> > associated security policies) allowed the *current* content of the file
> > to be executed.  Whatever happens before or after that (wrt.
> > O_DENY_WRITE) should be covered by the security policy.
> >
> Agree, the race problem needs to be solved for AT_EXECVE_CHECK to be reliable.
> 
> Enforcing non-write for the path that stores scripts or bytecodes can
> be challenging due to historical or backward compatibility reasons.
> Since AT_EXECVE_CHECK provides a mechanism to check the file right
> before it is used, we can assume it will detect any "problem" that
> happened before that, (e.g. the file was overwritten). However, that
> also imposes two additional requirements:
> 1> the file doesn't change while AT_EXECVE_CHECK does the check.

This is already the case, so any kind of LSM check is fine.

> 2> The file content kept by the process remains unchanged after passing
> the AT_EXECVE_CHECK.

The goal of this patch was to avoid such a race condition in the case
where executable files can be updated.  In most cases it should not be
a security issue (because processes allowed to write to executable
files should be trusted), but it could still lead to bugs (because of
inconsistent, half-updated file content).
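
For illustration, the interpreter-side sequence discussed in this thread
(open the script, check it with AT_EXECVE_CHECK, then read from the same
file descriptor) could be sketched roughly as below.  This is a minimal
sketch under assumptions, not the patch's code: exec_check() and
load_script() are hypothetical helper names, the syscall numbers cover
only a few 64-bit architectures, and the RFC's O_DENY_WRITE flag is left
out because it is not in mainline.  Without it, a writer can still modify
the file between the check and the read.

```python
import ctypes
import errno
import os
import platform

# Values from the Linux UAPI headers (AT_EXECVE_CHECK needs Linux 6.14+).
AT_EMPTY_PATH = 0x1000
AT_EXECVE_CHECK = 0x10000
# execveat(2) syscall numbers for a few 64-bit architectures (assumption:
# extend this table for other targets).
SYS_EXECVEAT = {"x86_64": 322, "aarch64": 281, "riscv64": 281}


def exec_check(fd):
    """Return 0 if the kernel would allow executing fd, else an errno value.

    Kernels without AT_EXECVE_CHECK reject the unknown flag with EINVAL.
    """
    nr = SYS_EXECVEAT.get(platform.machine())
    if nr is None or ctypes.sizeof(ctypes.c_void_p) != 8:
        return errno.ENOSYS
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    ret = libc.syscall(
        ctypes.c_long(nr),
        ctypes.c_int(fd),
        ctypes.c_char_p(b""),   # empty path: operate on fd itself
        ctypes.c_void_p(None),  # argv
        ctypes.c_void_p(None),  # envp
        ctypes.c_int(AT_EMPTY_PATH | AT_EXECVE_CHECK),
    )
    return 0 if ret == 0 else ctypes.get_errno()


def load_script(path, check=exec_check):
    """Open path, run the check on the fd, then read that same fd."""
    # The RFC's O_DENY_WRITE would be OR-ed into these flags; omitted here.
    fd = os.open(path, os.O_RDONLY | os.O_CLOEXEC)
    try:
        err = check(fd)
        if err:
            raise OSError(err, os.strerror(err), path)
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

A caller would hand the returned bytes to the interpreter instead of
reopening the path.  How to treat EINVAL (a kernel without the flag) is a
policy decision that this sketch leaves to the caller.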

> 
> I imagine, the complete solution for AT_EXECVE_CHECK would include
> those two guarantees.

There is no issue directly with AT_EXECVE_CHECK, but depending on the
system configuration, interpreters could read a file that was updated
after the AT_EXECVE_CHECK.  This should not be an issue for secure
systems where executable files are only updated with trusted code,
except if the update mechanism is not atomic.  The main use case for
this patch series was for generic distros that may not have the
write-xor-execute guarantees, e.g. for developers.
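
As an illustration of what an atomic update mechanism can look like from
user space, here is a minimal sketch of the usual write-then-rename
pattern.  atomic_update() is a hypothetical helper, not something from
this patch series.  Because the new content goes to a new inode, an
interpreter that already opened the script keeps reading the old content
in full, and a later open sees the new content in full.

```python
import os
import stat
import tempfile


def atomic_update(path, data):
    """Replace path with data; readers see the old or the new file, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target for
    # rename(2) to be atomic, so create it in the same directory.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
            try:
                # Keep the permission bits (e.g. the exec bit) of the old file.
                os.fchmod(f.fileno(), stat.S_IMODE(os.stat(path).st_mode))
            except FileNotFoundError:
                pass
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)
        raise
```

An update that instead truncates and rewrites the file in place gives no
such guarantee: a reader can observe a half-written file.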

The only viable solution I see would be to have some kind of snapshot of
files, requested by interpreters, but I'm not sure if it is worth it.