lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <33631413674011@web7o.yandex.ru>
Date:	Sun, 19 Oct 2014 03:13:31 +0400
From:	Kirill Tkhai <tkhai@...dex.ru>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	Kirill Tkhai <ktkhai@...allels.com>,
	Peter Zijlstra <peterz@...radead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...hat.com>,
	Vladimir Davydov <vdavydov@...allels.com>
Subject: Re: [PATCH] sched/numa: fix unsafe get_task_struct() in task_numa_assign()

19.10.2014, 00:59, "Oleg Nesterov" <oleg@...hat.com>:
>  On 10/18, Kirill Tkhai wrote:
>>   18.10.2014, 01:40, "Oleg Nesterov" <oleg@...hat.com>:
>>>   ...
>>>   The
>>>   task_struct itself can't go away,
>>>   ...
>>>   --- a/kernel/sched/fair.c
>>>   +++ b/kernel/sched/fair.c
>>>   @@ -1158,7 +1158,13 @@ static void task_numa_compare(struct task_numa_env *env,
>>>
>>>            rcu_read_lock();
>>>            cur = ACCESS_ONCE(dst_rq->curr);
>>>   - if (cur->pid == 0) /* idle */
>>>   + /*
>>>   + * No need to move the exiting task, and this ensures that ->curr
>>>   + * wasn't reaped and thus get_task_struct() in task_numa_assign()
>>>   + * is safe; note that rcu_read_lock() can't protect from the final
>>>   + * put_task_struct() after the last schedule().
>>>   + */
>>>   + if (is_idle_task(cur) || (cur->flags & PF_EXITING))
>>>                    cur = NULL;
>>>
>>>            /*
>>   Oleg, I've looked once again, and now it's not good for me.
>  Ah. Thanks a lot Kirill for correcting me!
>
>  I was looking at this rcu_read_lock() and I didn't even try to think
>  what it can actually protect. Nothing.

<snip>

>  No, I don't think this can work. Let's look at the current code:
>
>          rcu_read_lock();
>          cur = ACCESS_ONCE(dst_rq->curr);
>          if (cur->pid == 0) /* idle */
>
>  And any dereference, even reading ->pid is not safe. This memory can be
>  freed, unmapped, reused, etc.
>
>  Looks like, task_numa_compare() needs to take dst_rq->lock and get the
>  refernce first.

Yeah, detection of idle is not save. If we reorder the checks almost all
problems will be gone. All except unmapping. JFI, is it possible with
such kernel structures as task_struct? I.e. do mem caches use high memory
in their bosoms?
Thanks!

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b78280c..114ec33 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1165,7 +1165,30 @@ static void task_numa_compare(struct task_numa_env *env,
 
 	rcu_read_lock();
 	cur = ACCESS_ONCE(dst_rq->curr);
-	if (cur->pid == 0) /* idle */
+	/*
+	 * No need to move the exiting task, and this ensures that ->curr
+	 * wasn't reaped and thus get_task_struct() in task_numa_assign()
+	 * is safe; note that rcu_read_lock() can't protect from the final
+	 * put_task_struct() after the last schedule().
+	 */
+	if (cur->flags & PF_EXITING)
+		cur = NULL;
+	smp_rmb(); /* Pairs with dst_rq->lock unlocking which implies smp_wmb */
+	/*
+	 * Check once again to be sure curr is still on dst_rq. Three situations
+	 * are possible here:
+	 * 1)cur has gone and been freed, and dst_rq->curr is pointing on other
+	 *   memory. In this case the check will fail;
+	 * 2)cur is pointing to a new task, which is using the memory of just
+	 *   freed cur (and it is new dst_rq->curr). It's OK, because we've
+	 *   locked RCU before the new task has been even created
+	 *   (so delayed_put_task_struct() hasn't been called);
+	 * 3)we've taken not exiting task (likely case). No need to worry.
+	 */
+	if (cur != ACCESS_ONCE(dst_rq->curr))
+		cur = NULL;
+
+	if (is_idle_task(cur))
 		cur = NULL;
 
 	/*


>  Or, perhaps, we need to change the rules to ensure that any "task_struct *"
>  pointer is rcu-safe. Perhaps we have more similar problems... I'd like to
>  avoid this if possible.

RT tree has:

https://git.kernel.org/cgit/linux/kernel/git/paulg/3.10-rt-patches.git/tree/patches/sched-delay-put-task.patch

But other problem was decided there..

>  Hmm. I'll try to think more.
>
>  Thanks!

Kirill
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ