[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.64.0904191710320.9099@blonde.anvils>
Date: Sun, 19 Apr 2009 17:30:02 +0100 (BST)
From: Hugh Dickins <hugh@...itas.com>
To: Oleg Nesterov <oleg@...hat.com>
cc: Al Viro <viro@...IV.linux.org.uk>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Joe Malicki <jmalicki@...acarta.com>,
Michael Itz <mitz@...acarta.com>,
Kenneth Baker <bakerk@...acarta.com>,
Chris Wright <chrisw@...s-sol.org>,
David Howells <dhowells@...hat.com>,
Alexey Dobriyan <adobriyan@...il.com>,
Greg Kroah-Hartman <gregkh@...e.de>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: Q: check_unsafe_exec() races (Was: [PATCH 2/4] fix setuid
sometimes doesn't)
On Mon, 6 Apr 2009, Oleg Nesterov wrote:
> On 04/01, Al Viro wrote:
> >
> > Rebased and pushed (same tree, same branch; included into for-next, along
> > with related cleanups).
>
> Sorry for delay!
Please don't suppose that you can ever beat me at the slowness game!
>
> Afaics, the usage of fs->in_exec is not completely right. But firstly, a
> couple of minor nits.
>
>
> check_unsafe_exec() doesn't need ->siglock, we can iterate over sub-threads
> under rcu_read_lock(). Note that with RCU or ->siglock we can set the "wrong"
> LSM_UNSAFE_SHARE if we race with copy_process(CLONE_THREAD | CLONE_FS), but
> as it was already discussed we don't care. This means it is OK to miss the
> freshly cloned thread which has already passed copy_fs().
Yes, I agree.
And preferable not to have IRQs disabled over that next_thread() loop.
>
>
> do_execve:
>
> /* execve succeeded */
> write_lock(¤t->fs->lock);
> current->fs->in_exec = 0;
> write_unlock(¤t->fs->lock);
>
> afaics, fs->lock is not needed. If ->in_exec was set, it was set by this
> thread-group and we do not share ->fs with another process. Since we are
> the only thread now, we can clear ->in_exec lockless.
Right, given your fix below. I wondered for a moment if a barrier
would then be needed, but no, this is all racy (erring on the safe
side) if the userspace insists on being racy here.
>
>
> And now, what I think is wrong:
>
> do_execve:
>
> out_unmark:
> write_lock(¤t->fs->lock);
> current->fs->in_exec = 0;
> write_unlock(¤t->fs->lock);
>
> Two threads T1 and T2 and another process P, all share the same ->fs.
>
> T1 starts do_execve(BAD_FILE). It calls check_unsafe_exec(), since ->fs is
> shared, we set LSM_UNSAFE but not ->in_exec (actually, not very good name).
>
> P exits and decrements fs->users.
>
> T2 starts do_execve(), calls check_unsafe_exec(), now ->fs is not shared,
> we set fs->in_exec.
>
> T1 continues, open_exec(BAD_FILE) fails, we clear ->in_exec and return
> to the user-space.
>
> T1 does clone(CLONE_FS /* without CLONE_THREAD */).
>
> T1 continues without LSM_UNSAFE_SHARE while ->fs is shared with another
> process.
If I follow you correctly, you meant to say T2 not T1 in the last step.
>
>
> What do you think about the (uncompiled) patch below ? It doesn't change
> compat_do_execve(), just for discussion.
>
> But see also another message I am going to send...
>
> Oleg.
>
> do_execve() must not clear fs->in_exec if it was set by another thread,
> and we don't need fs->lock to clear.
>
> Also, s/lock_task_sighand/rcu_read_lock/ in check_unsafe_exec().
Yes, I think your clear_in_exec change is a necessary one,
and your rcu_read_lock well worth while.
One tiny change (aside from extending to compat_do_execve):
Al originally had check_unsafe_exec()'s write_lock(&p->fs->lock)
after the lock_task_sighand(p, &flags), but was forced to invert
that by the IRQ issue lockdep flagged. I think we'd all prefer
to think of fs->lock as an innermost lock, and would like it
now to go after your rcu_read_lock().
(You do rcu_read_unlock() earlier, but that's okay.)
Hugh
>
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1060,7 +1060,6 @@ EXPORT_SYMBOL(install_exec_creds);
> int check_unsafe_exec(struct linux_binprm *bprm)
> {
> struct task_struct *p = current, *t;
> - unsigned long flags;
> unsigned n_fs;
> int res = 0;
>
> @@ -1068,11 +1067,12 @@ int check_unsafe_exec(struct linux_binpr
>
> n_fs = 1;
> write_lock(&p->fs->lock);
> - lock_task_sighand(p, &flags);
> + rcu_read_lock();
> for (t = next_thread(p); t != p; t = next_thread(t)) {
> if (t->fs == p->fs)
> n_fs++;
> }
> + rcu_read_unlock();
>
> if (p->fs->users > n_fs) {
> bprm->unsafe |= LSM_UNSAFE_SHARE;
> @@ -1080,9 +1080,8 @@ int check_unsafe_exec(struct linux_binpr
> if (p->fs->in_exec)
> res = -EAGAIN;
> p->fs->in_exec = 1;
> + res = 1;
> }
> -
> - unlock_task_sighand(p, &flags);
> write_unlock(&p->fs->lock);
>
> return res;
> @@ -1284,6 +1283,7 @@ int do_execve(char * filename,
> struct linux_binprm *bprm;
> struct file *file;
> struct files_struct *displaced;
> + bool clear_in_exec;
> int retval;
>
> retval = unshare_files(&displaced);
> @@ -1306,8 +1306,9 @@ int do_execve(char * filename,
> goto out_unlock;
>
> retval = check_unsafe_exec(bprm);
> - if (retval)
> + if (retval < 0)
> goto out_unlock;
> + clear_in_exec = retval;
>
> file = open_exec(filename);
> retval = PTR_ERR(file);
> @@ -1355,9 +1356,7 @@ int do_execve(char * filename,
> goto out;
>
> /* execve succeeded */
> - write_lock(¤t->fs->lock);
> current->fs->in_exec = 0;
> - write_unlock(¤t->fs->lock);
> current->in_execve = 0;
> mutex_unlock(¤t->cred_exec_mutex);
> acct_update_integrals(current);
> @@ -1377,9 +1376,8 @@ out_file:
> }
>
> out_unmark:
> - write_lock(¤t->fs->lock);
> - current->fs->in_exec = 0;
> - write_unlock(¤t->fs->lock);
> + if (clear_in_exec)
> + current->fs->in_exec = 0;
>
> out_unlock:
> current->in_execve = 0;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists