Message-ID: <4E1C84B2.2020807@candelatech.com>
Date: Tue, 12 Jul 2011 10:30:26 -0700
From: Ben Greear <greearb@...delatech.com>
To: "Myklebust, Trond" <Trond.Myklebust@...app.com>
CC: linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC] sunrpc: Fix race between work-queue and rpc_killall_tasks.
On 07/12/2011 10:25 AM, Myklebust, Trond wrote:
>> -----Original Message-----
>> From: Ben Greear [mailto:greearb@...delatech.com]
>> Sent: Tuesday, July 12, 2011 1:15 PM
>> To: Myklebust, Trond
>> Cc: linux-nfs@...r.kernel.org; linux-kernel@...r.kernel.org
>> Subject: Re: [RFC] sunrpc: Fix race between work-queue and
>> rpc_killall_tasks.
>>
>> On 07/08/2011 03:14 PM, Myklebust, Trond wrote:
>>
>>>> [<ffffffff81105907>] print_trailer+0x131/0x13a
>>>> [<ffffffff81105945>] object_err+0x35/0x3e
>>>> [<ffffffff811077b3>] verify_mem_not_deleted+0x7a/0xb7
>>>> [<ffffffffa02891e5>] rpcb_getport_done+0x23/0x126 [sunrpc]
>>>> [<ffffffffa02810df>] rpc_exit_task+0x3f/0x6d [sunrpc]
>>>> [<ffffffffa02814d8>] __rpc_execute+0x80/0x253 [sunrpc]
>>>> [<ffffffffa02816ed>] ? rpc_execute+0x42/0x42 [sunrpc]
>>>> [<ffffffffa02816fd>] rpc_async_schedule+0x10/0x12 [sunrpc]
>>>> [<ffffffff81061343>] process_one_work+0x230/0x41d
>>>> [<ffffffff8106128e>] ? process_one_work+0x17b/0x41d
>>>> [<ffffffff8106379f>] worker_thread+0x133/0x217
>>>> [<ffffffff8106366c>] ? manage_workers+0x191/0x191
>>>> [<ffffffff81066f9c>] kthread+0x7d/0x85
>>>> [<ffffffff81485ee4>] kernel_thread_helper+0x4/0x10
>>>> [<ffffffff8147f0d8>] ? retint_restore_args+0x13/0x13
>>>> [<ffffffff81066f1f>] ? __init_kthread_worker+0x56/0x56
>>>> [<ffffffff81485ee0>] ? gs_change+0x13/0x13
>>>
>>> The calldata gets freed in rpc_final_put_task(), which shouldn't ever
>>> be run while the task is still referenced in __rpc_execute.
>>>
>>> IOW: it should be impossible to call rpc_exit_task() after
>>> rpc_final_put_task.
>>
>> I added lots of locking around the calldata, work-queue logic, and such,
>> and still the problem persists w/out hitting any of the debug warnings
>> or poisoned values I put in. It almost seems like tk_calldata is just
>> assigned to two different tasks.
>>
>> While poking through the code, I noticed that 'map' is static in
>> rpcb_getport_async.
>>
>> That would seem to cause problems if two threads called this method at
>> the same time, possibly causing tk_calldata to be assigned to two
>> different tasks???
>>
>> Any idea why it is static?
>
> Doh! That is clearly a typo dating all the way back to when Chuck wrote that function.
>
> Yes, that would definitely explain your problem.
Ok, patch sent. I assume someone will propagate this to stable
as desired?
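
For anyone following along, here is a minimal user-space sketch of the
pattern, not the actual sunrpc code. The struct names, the helper functions,
and the "window" hook are all hypothetical; the hook just stands in for a
second thread entering the function between the allocation and the point
where the pointer is handed to the task.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's rpcbind args and rpc_task. */
struct args { int port; };
struct task { struct args *calldata; };

typedef void (*hook_fn)(void);

static struct task task_a, task_b;

/* Buggy shape: 'map' is static, so every caller shares one pointer. */
static void getport_buggy(struct task *t, hook_fn window)
{
	static struct args *map;

	map = calloc(1, sizeof(*map));
	assert(map != NULL);
	if (window)
		window();	/* a second caller runs in this window */
	t->calldata = map;	/* may now be the second caller's buffer */
}

/* Fixed shape: 'map' is an ordinary automatic variable, one per call. */
static void getport_fixed(struct task *t, hook_fn window)
{
	struct args *map;

	map = calloc(1, sizeof(*map));
	assert(map != NULL);
	if (window)
		window();
	t->calldata = map;
}

static void second_caller_buggy(void) { getport_buggy(&task_b, NULL); }
static void second_caller_fixed(void) { getport_fixed(&task_b, NULL); }

/* Returns 1 if both tasks ended up with the same calldata pointer. */
int shares_calldata(int use_static)
{
	int shared;

	task_a.calldata = NULL;
	task_b.calldata = NULL;
	if (use_static)
		getport_buggy(&task_a, second_caller_buggy);
	else
		getport_fixed(&task_a, second_caller_fixed);

	shared = (task_a.calldata == task_b.calldata);

	/* In the buggy case the first allocation is leaked; free once. */
	free(task_a.calldata);
	if (!shared)
		free(task_b.calldata);
	return shared;
}
```

With the static pointer, shares_calldata(1) reports that both tasks hold the
same buffer, so whichever task finishes first frees memory the other still
uses. Without the static, shares_calldata(0) reports two separate buffers.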
And assuming this fixes it, can I get some brownie points towards
review of the ip-addr binding patches? :)
Thanks,
Ben
--
Ben Greear <greearb@...delatech.com>
Candela Technologies Inc http://www.candelatech.com