linux-kernel - Re: [PATCH] exec: fix the racy usage of fs_struct->in

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20250429154944.GA18907@redhat.com>
Date: Tue, 29 Apr 2025 17:49:44 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: brauner@...nel.org, kees@...nel.org, viro@...iv.linux.org.uk
Cc: jack@...e.cz, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	syzkaller-bugs@...glegroups.com, mjguzik@...il.com,
	Michal Hocko <mhocko@...e.com>
Subject: Re: [PATCH] exec: fix the racy usage of fs_struct->in_exec

Damn, I am stupid.

On 03/24, Oleg Nesterov wrote:
>
> check_unsafe_exec() sets fs->in_exec under cred_guard_mutex, then execve()
> paths clear fs->in_exec lockless. This is fine if exec succeeds, but if it
> fails we have the following race:
>
> 	T1 sets fs->in_exec = 1, fails, drops cred_guard_mutex
>
> 	T2 sets fs->in_exec = 1
>
> 	T1 clears fs->in_exec

When I look at this code again, I think this race was not possible and thus
this patch (applied as af7bb0d2ca45) was not needed.

Yes, begin_new_exec() can drop cred_guard_mutex on failure, but only after
de_thread() succeeds, when we can't race with another sub-thread.

I hope this patch didn't make the things worse so we don't need to revert it.
Plus I think it makes this (confusing) logic a bit more clear. Just, unless
I am confused again, it wasn't really needed.

-----------------------------------------------------------------------------
But. I didn't read the original report from syzbot,
https://lore.kernel.org/all/67dc67f0.050a0220.25ae54.001f.GAE@google.com/#t
because I wasn't CC'ed. and then - sorry Kees!!! - I didn't bother to read
your first reply carefully.

So yes, with or without this patch the "if (fs->in_exec)" check in copy_fs()
can obviously hit the 1 -> 0 transition.

This is harmless, but should be probably fixed just to avoid another report
from KCSAN.

I do not want to add another spin_lock(fs->lock). We can change copy_fs() to
use data_race(), but I'd prefer the patch below. Yes, it needs the additional
comment(s) to explain READ_ONCE().

What do you think? Did I miss somthing again??? Quite possibly...

Mateusz, I hope you will cleanup this horror sooner or later ;)

Oleg.
---

diff --git a/fs/exec.c b/fs/exec.c
index 5d1c0d2dc403..42a7f9b43911 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1495,7 +1495,7 @@ static void free_bprm(struct linux_binprm *bprm)
 	free_arg_pages(bprm);
 	if (bprm->cred) {
 		/* in case exec fails before de_thread() succeeds */
-		current->fs->in_exec = 0;
+		WRITE_ONCE(current->fs->in_exec, 0);
 		mutex_unlock(&current->signal->cred_guard_mutex);
 		abort_creds(bprm->cred);
 	}
diff --git a/kernel/fork.c b/kernel/fork.c
index 4c2df3816728..381af8c8ece8 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1802,7 +1802,7 @@ static int copy_fs(unsigned long clone_flags, struct task_struct *tsk)
 		/* tsk->fs is already what we want */
 		spin_lock(&fs->lock);
 		/* "users" and "in_exec" locked for check_unsafe_exec() */
-		if (fs->in_exec) {
+		if (READ_ONCE(fs->in_exec)) {
 			spin_unlock(&fs->lock);
 			return -EAGAIN;
 		}