[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <4CBC42C2.4040604@kerlabs.com>
Date: Mon, 18 Oct 2010 14:51:14 +0200
From: Matthieu Fertré <matthieu.fertre@...labs.com>
To: Darren Hart <dvhart@...ux.intel.com>
CC: Louis Rilling <louis.rilling@...labs.com>,
linux-kernel@...r.kernel.org,
Rusty Russell <rusty@...tcorp.com.au>,
Ingo Molnar <mingo@...e.hu>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [RESEND PATCH] futex: fix key reference counter in case of requeue.
Hi Darren,
Le 15/10/2010 21:13, Darren Hart a écrit :
> On 10/14/2010 04:30 AM, Louis Rilling wrote:
>> From: Matthieu Fertré<matthieu.fertre@...labs.com>
>
> Hi Matthew,
>
>>
>> This patch ensures that we are referring to the right key when dropping
>> reference for the futex_wait operation.
>>
>> The following scenario explains a typical case where the bug was
>> happening:
>>
>> Process P calls futex_wait() on futex identified by 'key1'. 2 references
>> are taken on this key: one for the struct futex_q itself, and one for the
>> futex_wait operation.
>> If now, process P is requeued on a futex identified by 'key2', its
>> futex_q->key is updated from 'key1' to 'key2' and a reference is got
>> to 'key2' and one is dropped to 'key1'.
>> Later, another process calls futex_wake(): it gets a reference to
>> 'key2', wakes process P, and drops reference to 'key2'.
>> Once process P is woken up, it should unqueue, drop reference to 'key2'
>> (the one referring to the futex_q, this is done in unqueue_me())
>> and to 'key1' (the one referring to futex_wait operation). Without this
>> patch it drops reference to 'key2' instead of 'key1'.
>
> Nice catch. How did this manifest itself? Did you catch it just by code
> inspection?
I found it while testing the distributed implementation of futex in
Kerrighed (www.kerrighed.org).
After deeply looking, I noticed that the bug comes from vanilla linux
kernel 2.6.30 (the one on which current version of Kerrighed is based).
Then I checked if the bug still existed in latest linux rc or if there
were some bugfixes.
I have attached the test that reveals the bug on my system. The test
runs some basic wait/wake/requeue scenario on futex "hosted" in a sysv
shared memory segment. It is composed of one executable and 2 scripts
that are to be used with LTP. To run it without LPT, you can replace
calls to tst_resm/tst_brkm with echo in the shell scripts.
Without debugging facilities, it may BUG while destroying the shared
segment. As far as I remember, with some kernel hacking features
enabled, it was complaining in the kernel log, but there was no crash
and I don't remember exactly about what it complains.
>
> I've been trying to develop a futex test suite to catch issues with the
> futex implementation, as well as to test any changes made to avoid
> regressions. Mind having a look?
>
> http://git.kernel.org/?p=linux/kernel/git/dvhart/futextest.git;a=summary
I had already a git checkout for this futex test suite :)
It was not fitting to my tests since I was checking behavior of
distributed futex inside Kerrighed. (I need test done by separate
processes spreaded on different nodes accessing the futex through sysv
shared segments).
Regards,
Matthieu
>
>> Signed-off-by: Matthieu Fertré<matthieu.fertre@...labs.com>
>> Signed-off-by: Louis Rilling<louis.rilling@...labs.com>
>> ---
>> kernel/futex.c | 8 ++++++--
>> 1 files changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/futex.c b/kernel/futex.c
>> index 6a3a5fa..bed6717 100644
>> --- a/kernel/futex.c
>> +++ b/kernel/futex.c
>> @@ -1791,6 +1791,7 @@ static int futex_wait(u32 __user *uaddr, int
>> fshared,
>> struct restart_block *restart;
>> struct futex_hash_bucket *hb;
>> struct futex_q q;
>> + union futex_key key;
>
> We should be able to do this properly without requiring an additional
> key variable. I think tglx has proposed a suitable fix - but it needs
> testing to avoid any subtle regressions.
>
View attachment "futex-shm-tool.c" of type "text/x-csrc" (4890 bytes)
Download attachment "futex_shm01.sh" of type "application/x-shellscript" (3373 bytes)
Download attachment "lib_futex.sh" of type "application/x-shellscript" (3044 bytes)
Powered by blists - more mailing lists