Message-ID: <4AD33A4D.4070006@us.ibm.com>
Date: Mon, 12 Oct 2009 07:16:45 -0700
From: Darren Hart <dvhltc@...ibm.com>
To: Jeremy Leibs <leibs@...lowgarage.com>
CC: Thomas Gleixner <tglx@...utronix.de>,
Blaise Gassend <blaise@...lowgarage.com>,
LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: ERESTARTSYS escaping from sem_wait with RTLinux patch

Jeremy Leibs wrote:
> On Sat, Oct 10, 2009 at 10:59 AM, Thomas Gleixner <tglx@...utronix.de> wrote:
>> Blaise,
>>
>> On Sat, 10 Oct 2009, Blaise Gassend wrote:
>>> 1) Where is the ERESTARTSYS being prevented from getting to user space?
>>>
>>> The only likely place I see for preventing ERESTARTSYS from escaping to
>>> user space is in arch/*/kernel/signal*.c. However, I don't see how the
>>> code there is being called if there is no signal pending. Is that a path
>>> for ERESTARTSYS to escape from the kernel?
>>>
>>> The following comment in kernel/futex.c in futex_wait makes me wonder if
>>> two threads are getting marked as ERESTARTSYS. The first one to leave
>>> the kernel processes the signal and restarts. The second one doesn't
>>> have a signal to handle, so it returns to user space without getting
>>> into signal*.c and wreaks havoc.
>>>
>>> (...)
>>>         /*
>>>          * We expect signal_pending(current), but another thread may
>>>          * have handled it for us already.
>>>          */
>>>         if (!abs_time)
>>>                 return -ERESTARTSYS;
>>> (...)
>> If the task is woken by a signal, then the task private flag
>> TIF_SIGPENDING is set, but in case of a process wide signal the signal
>> might have been handled by another thread of the same process before
>> that thread reaches the signal handling code, but then ERESTARTSYS is
>> handled gracefully. So you seem to trigger a code path which does not
>> go through do_signal.
>>
>>> 2) Why would this be happening only with RT kernels?
>> Slightly different timing and locking semantics.
>>
>>> 3) Any suggestions on the best place to patch/workaround this?
>>>
>>> My understanding is that if I was to treat ERESTARTSYS as an EAGAIN,
>>> most applications would be perfectly happy. Would bad things happen if I
>>> replaced the ERESTARTSYS in futex_wait with an EAGAIN?
>> No workarounds please. We really want to know what's wrong.
>>
>> Two things to look at:
>>
>> 1) Does that happen with 2.6.31.2-rt13 as well ?
>>
>> 2) Add a check to the code path where ERESTARTSYS is returned:
>>
>>      if (!signal_pending(current))
>>              printk(KERN_ERR ".....");
>>
>
> Ok, in 2.6.31.2-rt13, I modified futex.c as:
> -----
>         /*
>          * We expect signal_pending(current), but another thread may
>          * have handled it for us already.
>          */
>         ret = -ERESTARTSYS;
>         if (!abs_time)
>         {
>                 if (!signal_pending(current))
>                         printk(KERN_ERR ".....");
>                 goto out_put_key;
>         }
> -----
>
> Then when I cause the crash:
>
> leibs@c1:~$ python threadprocs8.py
> sem_wait: Unknown error 512
> Segmentation fault
>
> dmesg shows me the corresponding:
> [ 82.232999] .....
> [ 82.233177] python[2834]: segfault at 48 ip 00000000004b0177 sp
> 00007f9429788ad8 error 4 in python2.6[400000+216000]

OK, so I suspect one of two things.

1) Recent changes to futex.c have somehow created a wakeup race and
   unqueue_me() doesn't detect it was woken with FUTEX_WAKE, then falls
   out through the ERESTARTSYS path.

2) Recent changes have exposed an existing race in unqueue_me().

I'll do some runs on my 8-way systems and see if I can:

  o Identify the guilty patch
  o Identify the race in question

Thanks for the test case! Now... why is sem_wait() being used in a timer
call....

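For anyone reproducing this from user space, the symptom above
("sem_wait: Unknown error 512") can be made explicit with a small
diagnostic wrapper. This is only a sketch for spotting the leak, not a
fix or a workaround: the names classify_sem_errno(), checked_sem_wait()
and KERNEL_ERESTARTSYS are hypothetical, and the value 512 is assumed
from the kernel's include/linux/errno.h, since user-space <errno.h>
does not define ERESTARTSYS.

```c
#include <errno.h>
#include <semaphore.h>
#include <stdio.h>
#include <string.h>

/*
 * Assumption: 512 is the kernel-internal ERESTARTSYS value from
 * include/linux/errno.h. It should never be visible in user space,
 * so it is named locally here.
 */
#define KERNEL_ERESTARTSYS 512

enum wait_status {
	WAIT_RETRY,		/* EINTR: interrupted by a signal, retry */
	WAIT_LEAKED_RESTART,	/* kernel-internal restart code escaped */
	WAIT_OTHER_ERROR	/* any other errno value */
};

/* Hypothetical helper: classify an errno value seen after sem_wait(). */
static enum wait_status classify_sem_errno(int err)
{
	if (err == EINTR)
		return WAIT_RETRY;
	if (err == KERNEL_ERESTARTSYS)
		return WAIT_LEAKED_RESTART;
	return WAIT_OTHER_ERROR;
}

/*
 * Hypothetical wrapper: retry sem_wait() on EINTR and report loudly
 * if the kernel-internal code leaks out. Returns 0 on success and -1
 * otherwise, leaving errno as set by sem_wait().
 */
static int checked_sem_wait(sem_t *sem)
{
	for (;;) {
		if (sem_wait(sem) == 0)
			return 0;

		int saved = errno;	/* fprintf() may clobber errno */

		switch (classify_sem_errno(saved)) {
		case WAIT_RETRY:
			continue;
		case WAIT_LEAKED_RESTART:
			fprintf(stderr,
				"sem_wait: kernel leaked ERESTARTSYS (%d)\n",
				saved);
			break;
		case WAIT_OTHER_ERROR:
			fprintf(stderr, "sem_wait: %s\n", strerror(saved));
			break;
		}
		errno = saved;
		return -1;
	}
}
```

Calling checked_sem_wait() in place of sem_wait() in a C reproducer
should log the leak instead of letting the caller misinterpret the
return value; it does nothing about the underlying race.
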
--
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team