Message-ID: <m1ehtkapn9.fsf@fess.ebiederm.org>
Date:	Thu, 23 Feb 2012 19:14:50 -0800
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Christoph Lameter <cl@...ux.com>
Cc:	Dave Hansen <dave@...ux.vnet.ibm.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [RFC][PATCH] fix move/migrate_pages() race on task struct

Christoph Lameter <cl@...ux.com> writes:

> On Thu, 23 Feb 2012, Dave Hansen wrote:
>
>> > We may at this point be getting a reference to a task struct from another
>> > process not only from the current process (where the above procedure is
>> > valid). You rightly pointed out that the slab rcu free mechanism allows a
>> > free and a reallocation within the RCU period.
>>
>> I didn't _mean_ to point that out, but I think I realize what you're
>> talking about.  What we have before this patch is this:
>>
>>         rcu_read_lock();
>>         task = pid ? find_task_by_vpid(pid) : current;
>
> We take a refcount here on the mm ... See the code. We could simply take a
> refcount on the task as well if this is considered safe enough. If we have
> a refcount on the task then we do not need the refcount on the mm. That
> was your approach...
>
>>         rcu_read_unlock();
>
>> > Is that a real difference or are you just playing with words?
>>
>> I think we're talking about two different things:
>> 1. does RCU protect the pid->task lookup sufficiently?
>
> I don't know

Yes.  See below.

>> 2. Can the task simply go away in the move/migrate_pages() calls?
>
> The task may go away but we need the mm to stay for migration.
> That is why a refcount is taken on the mm.
>
> The bug in migrate_pages() is that we do a rcu_unlock and a rcu_lock. If
> we drop those then we should be safe if the use of a task pointer within a
> rcu section is safe without taking a refcount.

Yes, the use of a task_struct pointer found via a userspace pid is valid
for the life of an rcu critical section, and the bug is indeed that we
drop the rcu_read_lock and somehow expect the task to remain valid.

The guarantee comes from release_task.  In release_task we call
__exit_signal which calls __unhash_process, and then we use call_rcu
to schedule delayed_put_task_struct, which guarantees that the task
lives until the end of the rcu grace period.
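
As a rough illustration of that guarantee, here is a toy userspace model,
not kernel code.  Every toy_* name is made up for this sketch, and the
grace period is simplified to end as soon as the last reader leaves.  It
only shows the ordering: unhash first, defer the final put, and take a
reference before leaving the critical section if the pointer is needed
afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for task_struct: only what the lifetime argument needs. */
struct toy_task {
	int pid;
	int usage;	/* reference count */
	bool freed;	/* marked instead of freed so it can be inspected */
};

static int toy_readers;			/* depth of read-side sections */
static struct toy_task *toy_hash;	/* single-slot "pid hash" */
static struct toy_task *toy_deferred;	/* task whose final put is deferred */

static void toy_put(struct toy_task *t)
{
	if (--t->usage == 0)
		t->freed = true;
}

static void toy_read_lock(void)
{
	toy_readers++;
}

/* Simplification: the grace period ends when the last reader leaves. */
static void toy_read_unlock(void)
{
	if (--toy_readers == 0 && toy_deferred) {
		struct toy_task *t = toy_deferred;

		toy_deferred = NULL;
		toy_put(t);
	}
}

static struct toy_task *toy_find(int pid)
{
	return (toy_hash && toy_hash->pid == pid) ? toy_hash : NULL;
}

/* Models the release path: unhash first, then defer the final put. */
static void toy_release(struct toy_task *t)
{
	toy_hash = NULL;
	if (toy_readers == 0)
		toy_put(t);
	else
		toy_deferred = t;
}

/* Bit 0: task live inside the section.  Bit 1: live after leaving it. */
int toy_scenario(bool take_ref)
{
	struct toy_task t = { .pid = 42, .usage = 1 };
	struct toy_task *p;
	int result = 0;

	toy_readers = 0;
	toy_deferred = NULL;
	toy_hash = &t;

	toy_read_lock();
	p = toy_find(42);
	assert(p != NULL);
	toy_release(&t);		/* the task exits while we hold p */
	if (!p->freed)
		result |= 1;
	if (take_ref)
		p->usage++;		/* analogous to taking a task reference */
	toy_read_unlock();
	if (!p->freed)
		result |= 2;
	if (take_ref)
		toy_put(p);
	return result;
}
```

In this model toy_scenario(false) reports the task as live only inside
the section, while toy_scenario(true) reports it live on both sides.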
In migrate_pages we have a lot of task accesses outside of the
rcu critical section, and without a reference count on task.

To tell you the truth, trying to figure out what that code needs in order
to be correct when task != current makes my head hurt.

I think we need to grab a reference on task_struct, to stop the task
from going away, and in addition we need to hold task_lock, to keep
task->mm from changing (see exec_mmap).  But we can't do that and sleep,
so I think the entire function needs to be rewritten, and the need for
task deep in the migrate_pages path needs to be removed, as even with the
reference count held we can race with someone calling exec.
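
To make the exec race concrete, here is another toy userspace model, not
kernel code, with hypothetical toy_* names throughout.  It shows that
re-reading task->mm across an exec yields a different mm, whereas an mm
pinned once up front stays the same object.  Pinning the mm once and
passing it down is only one possible direction, assumed here for the
sake of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_mm {
	int users;		/* reference count on the address space */
};

struct toy_proc {
	struct toy_mm *mm;	/* may be swapped by exec at any time */
};

/* Pin whatever mm the task has right now. */
static struct toy_mm *toy_get_mm(struct toy_proc *p)
{
	struct toy_mm *mm = p->mm;

	if (mm)
		mm->users++;
	return mm;
}

static void toy_mmput(struct toy_mm *mm)
{
	mm->users--;
}

/* Models exec: install a new mm and drop the task's ref on the old one. */
static void toy_exec(struct toy_proc *p, struct toy_mm *new_mm)
{
	struct toy_mm *old = p->mm;

	new_mm->users++;
	p->mm = new_mm;
	if (old)
		toy_mmput(old);
}

/* Re-reading p->mm after an exec does not give back the first mm. */
bool toy_rereads_same_mm(void)
{
	struct toy_mm a = { .users = 1 }, b = { .users = 0 };
	struct toy_proc p = { .mm = &a };
	struct toy_mm *first = p.mm;

	toy_exec(&p, &b);
	return p.mm == first;
}

/* A pinned mm stays held even though the task has moved to another mm. */
bool toy_pinned_mm_still_held(void)
{
	struct toy_mm a = { .users = 1 }, b = { .users = 0 };
	struct toy_proc p = { .mm = &a };
	struct toy_mm *mm = toy_get_mm(&p);
	bool ok;

	assert(mm != NULL);
	toy_exec(&p, &b);
	ok = (mm == &a) && (mm->users == 1) && (p.mm != mm);
	toy_mmput(mm);
	return ok;
}
```

Here toy_rereads_same_mm() returns false and toy_pinned_mm_still_held()
returns true, which is the reason a task reference alone is not enough
for code that keeps dereferencing task->mm.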

The only easy fix I see is to add:
if (pid)
	return -EINVAL;

Then we are working with current, and only current can change its mm,
making things much, much, much simpler.

Eric