[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4C3CB7A5.7010406@us.ibm.com>
Date: Tue, 13 Jul 2010 11:59:49 -0700
From: Darren Hart <dvhltc@...ibm.com>
To: Thomas Gleixner <tglx@...utronix.de>
CC: "lkml, " <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...e.hu>,
Eric Dumazet <eric.dumazet@...il.com>,
John Kacur <jkacur@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>,
Mike Galbraith <efault@....de>,
linux-rt-users <linux-rt-users@...r.kernel.org>
Subject: Re: [PATCH][RT] futex: protect against pi_blocked_on corruption during
requeue PI -V2
On 07/13/2010 04:52 AM, Thomas Gleixner wrote:
> On Tue, 13 Jul 2010, Thomas Gleixner wrote:
>>
>> This code causes braindamage. I really wonder whether we need to
>> remove it according to the "United Nations Convention against Torture
>> and Other Cruel, Inhuman or Degrading Treatment or Punishment".
>>
>
> Ok, finally managed to twist my brain around it. Mike, can you give it
> a test ride ?
Since Mike is out I built this version and ran it a few times. I saw a
100% reproduction rate previously. I haven't seen any errors in a
handful of runs (~10) with this patch.
I do still see the hrtimer latency message on the first run, so this is
likely unrelated to the issue we're addressing:
Jul 13 14:47:59 elm9m94 kernel: hrtimer: interrupt took 123924 ns
As for Thomas's changes, only a couple nitpics below:
> @@ -2255,18 +2265,51 @@ static int futex_wait_requeue_pi(u32 __u
> /* Queue the futex_q, drop the hb lock, wait for wakeup. */
> futex_wait_queue_me(hb,&q, to);
>
> - spin_lock(&hb->lock);
> - ret = handle_early_requeue_pi_wakeup(hb,&q,&key2, to);
> - spin_unlock(&hb->lock);
> - if (ret)
> - goto out_put_keys;
> + /*
> + * Avoid races with requeue and trying to block on two mutexes
> + * (hb->lock and uaddr2's rtmutex) by serializing access to
> + * pi_blocked_on with pi_lock.
> + */
> + raw_spin_lock_irq(¤t->pi_lock);
> + if (current->pi_blocked_on) {
> + /* Requeue happened already */
This comment isn't quite accurate. The requeue may be in progress, which
means the q.lock_ptr is not trustworthy as noted below. Consider:
/*
* We have been requeued, or are in the process
* of being requeued.
*/
> + raw_spin_unlock_irq(¤t->pi_lock);
> + } else {
> + /*
> + * Setting pi_blocked_on to PI_WAKEUP_INPROGRESS
> + * prevents a concurrent requeue from enqueuein us on
s/enqueuein/enqueueing/ not that my dictionary thinks either one of
them are words ;-)
--
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists