Message-ID: <ed418e43ad28b8688cfea2b7c90fce1c@ispras.ru>
Date:   Tue, 30 Aug 2022 22:49:43 +0300
From:   Alexey Izbyshev <izbyshev@...ras.ru>
To:     Andrei Vagin <avagin@...il.com>
Cc:     linux-kernel@...r.kernel.org,
        Dmitry Safonov <0x7f454c46@...il.com>,
        Christian Brauner <brauner@...nel.org>,
        Florian Weimer <fweimer@...hat.com>, linux-mm@...ck.org,
        Eric Biederman <ebiederm@...ssion.com>,
        Kees Cook <keescook@...omium.org>
Subject: Potentially undesirable interactions between vfork() and time
 namespaces

Hi,

I've looked at Andrei's patch[1] that permitted vfork() after 
unshare(CLONE_NEWTIME) and noticed a couple of odd things that I'd like 
to point out.

  	/*
  	 * If the new process will be in a different time namespace
  	 * do not allow it to share VM or a thread group with the forking task.
+	 *
+	 * On vfork, the child process enters the target time namespace only
+	 * after exec.
  	 */
-	if (clone_flags & (CLONE_THREAD | CLONE_VM)) {
+	if ((clone_flags & (CLONE_VM | CLONE_VFORK)) == CLONE_VM) {
  		if (nsp->time_ns != nsp->time_ns_for_children)
  			return ERR_PTR(-EINVAL);
  	}

This change permits not only a normal vfork(), but also 
clone(CLONE_VM|CLONE_VFORK|CLONE_SIGHAND|CLONE_THREAD). I'm not sure 
whether it can cause real harm, but it's pretty inconsistent to forbid 
creation of normal threads after unshare(CLONE_NEWTIME), but permit such 
weird ones, so maybe the check should be strengthened.
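
To make the difference concrete, here is a minimal user-space sketch of the 
two flag tests quoted above. It is only a model, not kernel code: the 
SKETCH_* constants mirror the CLONE_* values from the UAPI headers, the 
helper names are made up for illustration, and sketch_strict_rejects() is 
just one hypothetical way the check could be strengthened, not a proposed 
patch. Each helper answers "would the check return -EINVAL, assuming 
time_ns != time_ns_for_children?".

```c
#include <stdbool.h>

/* Same numeric values as the kernel's CLONE_* flags; renamed so the
 * sketch does not depend on (or clash with) <sched.h>. */
#define SKETCH_CLONE_VM      0x00000100UL
#define SKETCH_CLONE_SIGHAND 0x00000800UL
#define SKETCH_CLONE_VFORK   0x00004000UL
#define SKETCH_CLONE_THREAD  0x00010000UL

/* Check before the patch: any CLONE_THREAD or CLONE_VM is refused. */
static bool sketch_old_rejects(unsigned long flags)
{
	return (flags & (SKETCH_CLONE_THREAD | SKETCH_CLONE_VM)) != 0;
}

/* Check after the patch: only CLONE_VM without CLONE_VFORK is refused,
 * and CLONE_THREAD is no longer looked at. */
static bool sketch_new_rejects(unsigned long flags)
{
	return (flags & (SKETCH_CLONE_VM | SKETCH_CLONE_VFORK)) ==
	       SKETCH_CLONE_VM;
}

/* Hypothetical stricter variant (an assumption, not from the patch):
 * keep allowing vfork(), but always refuse CLONE_THREAD. */
static bool sketch_strict_rejects(unsigned long flags)
{
	if (flags & SKETCH_CLONE_THREAD)
		return true;
	return sketch_new_rejects(flags);
}
```

With flags = CLONE_VM|CLONE_VFORK|CLONE_SIGHAND|CLONE_THREAD, the old check 
refuses the clone, the patched check lets it through, and the hypothetical 
stricter variant refuses it again while still permitting a plain vfork().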

Also, if such a thread execs, no time namespace switch will happen 
because its vfork_done field will be cleared when its creator (a 
sibling thread) is killed by de_thread().

+       vfork = !!tsk->vfork_done;
         old_mm = current->mm;
         exec_mm_release(tsk, old_mm);
         if (old_mm)
@@ -1030,6 +1033,10 @@ static int exec_mmap(struct mm_struct *mm)
         tsk->mm->vmacache_seqnum = 0;
         vmacache_flush(tsk);
         task_unlock(tsk);
+
+       if (vfork)
+               timens_on_fork(tsk->nsproxy, tsk);
+

Similarly, even after a normal vfork(), the time namespace switch could be 
silently skipped if the parent dies before "tsk->vfork_done" is read. 
Again, I don't know whether anybody cares, but this behavior seems 
non-obvious and probably unintended to me.
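
The dependency on vfork_done can be shown with a toy user-space model. 
Everything here is an assumption made for illustration: struct toy_task, 
toy_parent_killed() and toy_exec() are invented names, the namespaces are 
plain integers, and the only behavior modeled is what is described above 
(vfork_done is cleared when the parent is killed, and exec switches the 
namespace only if vfork_done is still set when it is read).

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the few task fields that matter here. */
struct toy_task {
	const void *vfork_done;      /* non-NULL while the vfork parent waits */
	int time_ns;                 /* namespace the task is currently in */
	int time_ns_for_children;    /* namespace it should enter on exec */
};

/* Models the parent being killed before the child execs: the child's
 * vfork_done pointer is cleared. */
static void toy_parent_killed(struct toy_task *child)
{
	child->vfork_done = NULL;
}

/* Models the patched exec path: sample vfork_done once, then switch
 * the time namespace only if it was set. Returns whether the switch
 * happened. */
static bool toy_exec(struct toy_task *tsk)
{
	bool vfork = tsk->vfork_done != NULL;

	if (vfork)
		tsk->time_ns = tsk->time_ns_for_children;
	return vfork;
}
```

In this model, calling toy_exec() on a freshly vforked child moves it to 
time_ns_for_children, while calling toy_parent_killed() first leaves it in 
the old namespace, with nothing reported to the caller.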

Thanks,
Alexey

[1] 
https://lore.kernel.org/all/20220613060723.197407-1-avagin@gmail.com/
