linux-kernel - Re: Q: check_unsafe_exec() races (Was: [PATCH 2/4] fix setuid sometimes doesn't)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20090406155103.GB21220@redhat.com>
Date:	Mon, 6 Apr 2009 17:51:03 +0200
From:	Oleg Nesterov <oleg@...hat.com>
To:	Al Viro <viro@...IV.linux.org.uk>
Cc:	Hugh Dickins <hugh@...itas.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Joe Malicki <jmalicki@...acarta.com>,
	Michael Itz <mitz@...acarta.com>,
	Kenneth Baker <bakerk@...acarta.com>,
	Chris Wright <chrisw@...s-sol.org>,
	David Howells <dhowells@...hat.com>,
	Alexey Dobriyan <adobriyan@...il.com>,
	Greg Kroah-Hartman <gregkh@...e.de>,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: Q: check_unsafe_exec() races (Was: [PATCH 2/4] fix setuid
	sometimes doesn't)

On 04/01, Al Viro wrote:
>
> On Wed, Apr 01, 2009 at 01:28:01AM +0100, Hugh Dickins wrote:
>
> > Otherwise it looks good to me, except I keep worrying about those
> > EAGAINs.
>
> Frankly, -EAGAIN in situation when we have userland race is fine.  And
> we *do* have a userland race here - execve() will kill -9 those threads
> in case of success, so if they'd been doing something useful, they are
> about to be suddenly screwed.

Can't resist! I dislike the "in_exec && -EAGAIN" oddity too.

Yes sure, we can't break the "well written" applications. But imho this
looks strange. And a bit "assymetrical" wrt LSM_UNSAFE_SHARE, I mean
check_unsafe_exec() allows sub-threads to race or CLONE_FS but only if
LSM_UNSAFE_SHARE.

Another reason, we can have the "my test-case found something strange"
bug-reports.

So. Please feel free to nack or just ignore this message, but since I
personally dislike the current behaviour I should at least try to suggest
something else.

	- add "wait_queue_head_t in_exec_wait" to "struct linux_binprm".

	- kill fs->in_exec, add "wait_queue_head_t *in_exec_wait_ptr"
	  instead.

	- introduce the new helper,

		void fs_lock_check_exec(struct fs_struct *fs)
		{
			write_lock(&fs->lock);
			while (unlikely(fs->in_exec_wait_ptr)) {
				DECLARE_WAITQUEUE(wait, current);

				if (fatal_signal_pending(current))
					/*
					 * clone/exec can't succeed, and this
					 * thread won't return to the user-space
					 */
					break;

				__add_wait_queue(fs->in_exec_wait_ptr, &wait);
				__set_current_state(TASK_KILLABLE);
				write_unlock(&fs->lock);

				schedule();

				write_lock(&fs->lock);
				__remove_wait_queue(&wait);
			}
		}

	  Or we can return -EANYTHING when fatal_signal_pending(), this doesn't
	  matter.

	  Note that this helper can block only if we race with our sub-thread
	  in the middle of !LSM_UNSAFE_SHARE exec. Otherwise this is not slower
	  than write_lock(fs->lock) + if (fs->in_exec) we currently have.


	 - change copy_fs() to do

		if (clone_flags & CLONE_FS) {
			fs_lock_check_exec(fs);
			fs->users++;
			write_unlock(&fs->lock);
			return 0;
		}


	- change check_unsafe_exec:

		void check_unsafe_exec(struct linux_binprm *bprm)
		{
			struct task_struct *p = current, *t;
			unsigned n_fs;

			bprm->unsafe = tracehook_unsafe_exec(p);

			n_fs = 1;
			fs_lock_check_exec(&p->fs);
			if (p->fs->in_exec_wait_ptr)
				/* we are going to die */
				goto out;

			rcu_read_lock();
			for (t = next_thread(p); t != p; t = next_thread(t)) {
				if (t->fs == p->fs)
					n_fs++;
			}
			rcu_read_unlock();

			if (p->fs->users > n_fs) {
				bprm->unsafe |= LSM_UNSAFE_SHARE;
			} else {
				bprm->unsafe |= __LSM_EXEC_WAKE;
				init_waitqueue_head(&bprm->in_exec_wait);
				p->fs->in_exec_wait_ptr = &bprm->in_exec_wait;
			}
		out:
			write_unlock(&p->fs->lock);
		}



	 - and, finally, change do_execve()

			/* execve succeeded */
			current->fs->in_exec_wait_ptr = NULL;

			...

		out_unmark:
			if (bprm->unsafe & __LSM_EXEC_WAKE) {
				write_lock(&current->fs->lock);
				current->fs->in_exec_wait_ptr = NULL;
				wake_up_locked(&bprm->in_exec_wait);
				write_unlock(&current->fs->lock);
			}

Comments?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/