linux-kernel - Re: [RFC PATCH v1 1/2] fs: Add O_DENY

Message-ID: <CABi2SkUJ1PDm_uri=4o+C13o5wFQD=xA7zVKU-we+unsEDm3dw@mail.gmail.com>
Date: Tue, 26 Aug 2025 13:29:55 -0700
From: Jeff Xu <jeffxu@...omium.org>
To: Mickaël Salaün <mic@...ikod.net>
Cc: Jeff Xu <jeffxu@...gle.com>, Andy Lutomirski <luto@...capital.net>, Jann Horn <jannh@...gle.com>, 
	Al Viro <viro@...iv.linux.org.uk>, Christian Brauner <brauner@...nel.org>, 
	Kees Cook <keescook@...omium.org>, Paul Moore <paul@...l-moore.com>, 
	Serge Hallyn <serge@...lyn.com>, Andy Lutomirski <luto@...nel.org>, Arnd Bergmann <arnd@...db.de>, 
	Christian Heimes <christian@...hon.org>, Dmitry Vyukov <dvyukov@...gle.com>, 
	Elliott Hughes <enh@...gle.com>, Fan Wu <wufan@...ux.microsoft.com>, 
	Florian Weimer <fweimer@...hat.com>, Jonathan Corbet <corbet@....net>, 
	Jordan R Abrahams <ajordanr@...gle.com>, Lakshmi Ramasubramanian <nramas@...ux.microsoft.com>, 
	Luca Boccassi <bluca@...ian.org>, Matt Bobrowski <mattbobrowski@...gle.com>, 
	Miklos Szeredi <mszeredi@...hat.com>, Mimi Zohar <zohar@...ux.ibm.com>, 
	Nicolas Bouchinet <nicolas.bouchinet@....cyber.gouv.fr>, Robert Waite <rowait@...rosoft.com>, 
	Roberto Sassu <roberto.sassu@...wei.com>, Scott Shell <scottsh@...rosoft.com>, 
	Steve Dower <steve.dower@...hon.org>, Steve Grubb <sgrubb@...hat.com>, 
	kernel-hardening@...ts.openwall.com, linux-api@...r.kernel.org, 
	linux-fsdevel@...r.kernel.org, linux-integrity@...r.kernel.org, 
	linux-kernel@...r.kernel.org, linux-security-module@...r.kernel.org
Subject: Re: [RFC PATCH v1 1/2] fs: Add O_DENY_WRITE

Hi Mickaël

On Tue, Aug 26, 2025 at 5:39 AM Mickaël Salaün <mic@...ikod.net> wrote:
>
> On Mon, Aug 25, 2025 at 10:57:57AM -0700, Jeff Xu wrote:
> > Hi Mickaël
> >
> > On Mon, Aug 25, 2025 at 2:31 AM Mickaël Salaün <mic@...ikod.net> wrote:
> > >
> > > On Sun, Aug 24, 2025 at 11:04:03AM -0700, Andy Lutomirski wrote:
> > > > On Sun, Aug 24, 2025 at 4:03 AM Mickaël Salaün <mic@...ikod.net> wrote:
> > > > >
> > > > > On Fri, Aug 22, 2025 at 09:45:32PM +0200, Jann Horn wrote:
> > > > > > On Fri, Aug 22, 2025 at 7:08 PM Mickaël Salaün <mic@...ikod.net> wrote:
> > > > > > > Add a new O_DENY_WRITE flag usable at open time and on opened file (e.g.
> > > > > > > passed file descriptors).  This changes the state of the opened file by
> > > > > > > making it read-only until it is closed.  The main use case is for script
> > > > > > > > interpreters to get the guarantee that a script's content cannot be altered
> > > > > > > while being read and interpreted.  This is useful for generic distros
> > > > > > > that may not have a write-xor-execute policy.  See commit a5874fde3c08
> > > > > > > ("exec: Add a new AT_EXECVE_CHECK flag to execveat(2)")
> > > > > > >
> > > > > > > Both execve(2) and the IOCTL to enable fsverity can already set this
> > > > > > > > property on files with deny_write_access().  This new O_DENY_WRITE makes
> > > > > >
> > > > > > The kernel actually tried to get rid of this behavior on execve() in
> > > > > > > commit 2a010c41285345da60cece35575b4e0af7e7bf44; but sadly that had
> > > > > > to be reverted in commit 3b832035387ff508fdcf0fba66701afc78f79e3d
> > > > > > because it broke userspace assumptions.
> > > > >
> > > > > Oh, good to know.
> > > > >
> > > > > >
> > > > > > > it widely available.  This is similar to what other OSs may provide
> > > > > > > e.g., opening a file with only FILE_SHARE_READ on Windows.
> > > > > >
> > > > > > We used to have the analogous mmap() flag MAP_DENYWRITE, and that was
> > > > > > removed for security reasons; as
> > > > > > https://man7.org/linux/man-pages/man2/mmap.2.html says:
> > > > > >
> > > > > > |        MAP_DENYWRITE
> > > > > > |               This flag is ignored.  (Long ago—Linux 2.0 and earlier—it
> > > > > > |               signaled that attempts to write to the underlying file
> > > > > > > |               should fail with ETXTBSY.  But this was a source of denial-
> > > > > > > |               of-service attacks.)
> > > > > >
> > > > > > It seems to me that the same issue applies to your patch - it would
> > > > > > allow unprivileged processes to essentially lock files such that other
> > > > > > processes can't write to them anymore. This might allow unprivileged
> > > > > > users to prevent root from updating config files or stuff like that if
> > > > > > they're updated in-place.
> > > > >
> > > > > > Yes, I agree, but since it is the case for executed files I thought it
> > > > > was worth starting a discussion on this topic.  This new flag could be
> > > > > restricted to executable files, but we should avoid system-wide locks
> > > > > > like this.  I'm not sure how Windows handles these issues though.
> > > > >
> > > > > Anyway, we should rely on the access control policy to control write and
> > > > > execute access in a consistent way (e.g. write-xor-execute).  Thanks for
> > > > > the references and the background!
> > > >
> > > > I'm confused.  I understand that there are many contexts in which one
> > > > would want to prevent execution of unapproved content, which might
> > > > include preventing a given process from modifying some code and then
> > > > executing it.
> > > >
> > > > I don't understand what these deny-write features have to do with it.
> > > > These features merely prevent someone from modifying code *that is
> > > > currently in use*, which is not at all the same thing as preventing
> > > > modifying code that might get executed -- one can often modify
> > > > contents *before* executing those contents.
> > >
> > > The order of checks would be:
> > > 1. open script with O_DENY_WRITE
> > > 2. check executability with AT_EXECVE_CHECK
> > > 3. read the content and interpret it
> > >
> > I'm not sure about the O_DENY_WRITE approach, but the problem is worth solving.
> >
> > AT_EXECVE_CHECK is not just for scripting languages. It could also
> > work with bytecodes like Java, for example. If we let the Java runtime
> > call AT_EXECVE_CHECK before loading the bytecode, the LSM could
> > develop a policy based on that.
>
> Sure, I'm using "script" to make it simple, but this applies to other
> use cases.
>
That makes sense.

> >
> > > The deny-write feature was to guarantee that there is no race condition
> > > between step 2 and 3.  All these checks are supposed to be done by a
> > > trusted interpreter (which is allowed to be executed).  The
> > > AT_EXECVE_CHECK call enables the caller to know if the kernel (and
> > > associated security policies) allowed the *current* content of the file
> > > to be executed.  Whatever happens before or after that (wrt.
> > > O_DENY_WRITE) should be covered by the security policy.
> > >
> > Agreed, the race problem needs to be solved in order for AT_EXECVE_CHECK to work.
> >
> > Enforcing non-write for the path that stores scripts or bytecodes can
> > be challenging due to historical or backward compatibility reasons.
> > Since AT_EXECVE_CHECK provides a mechanism to check the file right
> > before it is used, we can assume it will detect any "problem" that
> > happened before that, (e.g. the file was overwritten). However, that
> > also imposes two additional requirements:
> > 1> the file doesn't change while AT_EXECVE_CHECK does the check.
>
> This is already the case, so any kind of LSM checks are good.
>
May I ask how this is done? Does some code in do_open_execat() do this?
Apologies if this is a basic question.

> > 2> The file content kept by the process remains unchanged after passing
> > the AT_EXECVE_CHECK.
>
> The goal of this patch was to avoid such race condition in the case
> where executable files can be updated.  But in most cases it should not
> be a security issue (because processes allowed to write to executable
> files should be trusted), but this could still lead to bugs (because of
> inconsistent file content, half-updated).
>
There is also a time gap between:
a> the time of the AT_EXECVE_CHECK call, and
b> the time that the app opens the file for execution,
right? That would be another potential attack path (though this is not
the case I mentioned previously).

For the case I mentioned previously, I have to think more about whether
the race condition is a bug or a security issue.
IIUC, two solutions have been discussed so far:
1> The process could write to the fs to update the script.  However, for
execution, the process still uses the copy that passed the
AT_EXECVE_CHECK (the snapshot solution by Andy Lutomirski).
2> The process blocks writes while it opens the file read-only and
executes the script (this seems to be the approach of this patch).
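
To make solution 1> a bit more concrete, here is a rough userspace
sketch, not taken from any patch in this thread.  It assumes the
interpreter opens the file once, runs the check on that same fd (which
also avoids reopening by path between a> and b>), and then interprets
only an in-memory copy.  The helper names check_exec_allowed() and
snapshot_fd() are made up for illustration, and the fallback
AT_EXECVE_CHECK value assumes the definition in linux/fcntl.h from
kernels that carry the feature.  Note that a private copy only protects
against writes that happen after the read; by itself it does not prove
that the copied bytes are the ones the kernel checked.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef AT_EXECVE_CHECK
#define AT_EXECVE_CHECK 0x10000 /* assumed value, from linux/fcntl.h */
#endif

/*
 * Hypothetical helper: ask the kernel whether executing the file behind
 * fd would be allowed.  Returns 0 if allowed, -1 with errno set
 * otherwise (e.g. EINVAL on kernels without AT_EXECVE_CHECK).
 * argv/envp are passed as NULL here for brevity.
 */
int check_exec_allowed(int fd)
{
	return (int)syscall(SYS_execveat, fd, "", NULL, NULL,
			    AT_EMPTY_PATH | AT_EXECVE_CHECK);
}

/*
 * Hypothetical helper: read everything from the current offset of fd
 * into a heap buffer.  Later writes to the file do not affect this
 * private copy.  Returns NULL on error; the caller frees the buffer.
 */
char *snapshot_fd(int fd, size_t *len_out)
{
	size_t cap = 4096, len = 0;
	char *buf = malloc(cap);

	assert(len_out != NULL);
	if (!buf)
		return NULL;

	for (;;) {
		ssize_t n;

		if (len == cap) {
			/* Buffer full: double it before reading more. */
			char *tmp = realloc(buf, cap * 2);

			if (!tmp) {
				free(buf);
				return NULL;
			}
			buf = tmp;
			cap *= 2;
		}

		n = read(fd, buf + len, cap - len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			free(buf);
			return NULL;
		}
		if (n == 0)
			break; /* end of file */
		len += (size_t)n;
	}

	*len_out = len;
	return buf;
}
```

An interpreter would call check_exec_allowed(fd) and snapshot_fd(fd, &len)
on the same fd and then parse only the returned buffer, never the file
again.  Solution 2> would instead keep reading from the fd and rely on
the kernel refusing concurrent writers.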

I wonder if there are other ideas.

Thanks and regards,
-Jeff