linux-kernel - Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4950FA60.3060405@opengridcomputing.com>
Date:	Tue, 23 Dec 2008 08:49:04 -0600
From:	Tom Tucker <tom@...ngridcomputing.com>
To:	Trond Myklebust <Trond.Myklebust@...app.com>
CC:	Ian Campbell <ijc@...lion.org.uk>, linux-nfs@...r.kernel.org,
	Max Kellermann <mk@...all.com>, linux-kernel@...r.kernel.org,
	gcosta@...hat.com, Grant Coady <grant_lkml@...o.com.au>,
	"J. Bruce Fields" <bfields@...ldses.org>
Subject: Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue
 'XPT_DEAD' transports

Trond Myklebust wrote:
> On Wed, 2008-12-17 at 09:35 -0600, Tom Tucker wrote:
>   
>> Trond Myklebust wrote:
>>     
>>> Aside from being racy (there is nothing preventing someone setting XPT_DEAD
>>> after the test in svc_xprt_enqueue, and before XPT_BUSY is set), it is
>>> wrong to assume that transports which have called svc_delete_xprt() might
>>> not need to be re-enqueued.
>>>       
>> This is only true because now you allow transports with XPT_DEAD set to 
>> be enqueued -- yes?
>>
>>     
>>> See the list of deferred requests, which is currently never going to
>>> be cleared if the revisit call happens after svc_delete_xprt(). In this
>>> case, the deferred request will currently keep a reference to the transport
>>> forever.
>>>
>>>       
>> I agree this is a possibility and it needs to be fixed. I'm concerned 
>> that the root cause is still there though. I thought the test case was 
>> the client side timing out the connection. Why are there deferred 
>> requests sitting on what is presumably an idle connection?
>>     
>
> I haven't said that they are the cause of this test case. I've said that
> deferred requests hold references to the socket that can obviously
> deadlock. That needs to be fixed regardless of whether or not it is the
> cause here.
>
> There are plenty of situations in which the client may choose to close
> the TCP socket even if there are outstanding requests. One of the most
> common is when the user signals the process, so that an RPC call that
> was partially transmitted (ran out of buffer space) gets cancelled
> before it can finish transmitting. In that case the client has no choice
> but to disconnect and immediately reconnect.
>
>   
>>> The fix should be to allow dead transports to be enqueued in order to clear
>>> the deferred requests, then change the order of processing in svc_recv() so
>>> that we pick up deferred requests before we do the XPT_CLOSE processing.
>>>
>>>       
>> Wouldn't it be simpler to clean up any deferred requests in the close 
>> path instead of changing the meaning of XPT_DEAD and dispatching 
>> N-threads to do the same?
>>     
>
> AFAICS, deferred requests are the property of the cache until they
> expire or a downcall occurs. I'm not aware of any way to cancel only
> those deferred requests that hold a reference to this particular
> transport.
>
>   
Ok, I think you're right, and I think that this fix is correct and makes 
the symptom go away.

I may be completely confused here, but:

- The deferred requests should be getting cleaned up by timing out, and 
that does not not seem to be happening, (Is this true?)

- By releasing the underlying connection prior to releasing the 
transport that manages it, we've converted the visible resource leek to 
an invisible one.

- This has been around forever and changing the client side close 
behavior graceful exposed this bug,

So I'm wondering if what we want to do here is to provide a mechanism 
for canceling deferred requests for a particular transport. This would 
provide a mechanism for the generic transport driver to force 
cancellation of deferred requests when closing.

Tom


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/