Message-ID: <4E1F1763.6090508@candelatech.com>
Date:	Thu, 14 Jul 2011 09:20:51 -0700
From:	Ben Greear <greearb@...delatech.com>
To:	"Myklebust, Trond" <Trond.Myklebust@...app.com>
CC:	linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC] sunrpc:  Fix race between work-queue and rpc_killall_tasks.

On 07/12/2011 10:30 AM, Ben Greear wrote:
> On 07/12/2011 10:25 AM, Myklebust, Trond wrote:
>>> -----Original Message-----
>>> From: Ben Greear [mailto:greearb@...delatech.com]
>>> Sent: Tuesday, July 12, 2011 1:15 PM
>>> To: Myklebust, Trond
>>> Cc: linux-nfs@...r.kernel.org; linux-kernel@...r.kernel.org
>>> Subject: Re: [RFC] sunrpc: Fix race between work-queue and
>>> rpc_killall_tasks.
>>>

>>> I added lots of locking around the calldata, work-queue logic, and such,
>>> and still the problem persists w/out hitting any of the debug warnings
>>> or poisoned values I put in. It almost seems like tk_calldata is just
>>> assigned to two different tasks.
>>>
>>> While poking through the code, I noticed that 'map' is static in
>>> rpcb_getport_async.
>>>
>>> That would seem to cause problems if two threads called this method at
>>> the same time, possibly causing tk_calldata to be assigned to two
>>> different tasks???
>>>
>>> Any idea why it is static?
>>
>> Doh! That is clearly a typo dating all the way back to when Chuck
>> wrote that function.
>>
>> Yes, that would definitely explain your problem.
>
> Ok, patch sent. I assume someone will propagate this to stable
> as desired?
>
> And assuming this fixes it, can I get some brownie points towards
> review of the ip-addr binding patches? :)

Just to close this issue:  We ran a clean 24+ hour test mounting and
unmounting 200 mounts every 30 seconds, and it ran with zero problems.

This was with 2.6.38.8+ with this fix applied.

3.0-rc7+ is still flaky in various other ways, but I see no more
NFS problems at least.

So, that was the problem I was hitting, and it appears to be the
last problem in this area.

Thanks,
Ben

-- 
Ben Greear <greearb@...delatech.com>
Candela Technologies Inc  http://www.candelatech.com
