Message-ID: <20130725190506.GA32375@nautica>
Date: Thu, 25 Jul 2013 21:05:06 +0200
From: Dominique Martinet <dominique.martinet@....fr>
To: Eric Van Hensbergen <ericvh@...il.com>
Cc: Dominique Martinet <dominique.martinet@....fr>,
Latchesar Ionkov <lucho@...kov.net>, pebolle@...cali.nl,
netdev@...r.kernel.org,
linux-kernel <linux-kernel@...r.kernel.org>, andi@...zian.org,
rminnich@...dia.gov,
V9FS Developers <v9fs-developer@...ts.sourceforge.net>,
David Miller <davem@...emloft.net>
Subject: Re: [V9fs-developer] [PATCH] net: trans_rdma: remove unused
function
Eric Van Hensbergen wrote on Thu, Jul 25, 2013 :
> So, the cancel function should be used to flush any pending requests that
> haven't actually been sent yet. Looking at the 9p RDMA code, it looks like
> the thought was that this wasn't going to be possible. Regardless of
> removing unsent requests, the flush will still be sent and if the server
> processes it before the original request and sends a flush response back
> then we need to clear the posted buffer. This is what rdma_cancelled is
> supposed to be doing. So, the fix is to hook it into the structure -- but
> looking at the code it seems like we probably need to do something more to
> reclaim the buffer rather than just incrementing a counter.
>
> To be clear this has less to do with recovery and more to do with the
> proper implementation of 9p flush semantics. By and large, those semantics
> won't impact static file system users -- but if anyone is using the
> transport to access synthetic filesystems or files then they'll definitely
> want to have a properly implemented flush setup. The way to test this is
> to get a blocking read on a remote named pipe or fifo and then ^C it.
Ok, I knew about the concept of flush but didn't think a ^C would cause
an -ERESTARTSYS, so I didn't think of that.
That said, reading from, say, a fifo is an entirely local operation: the
client does a walk and a getattr, nothing 9p-wise for the read itself,
and clunks the fid when it's done with it.
As for the function needing a bit more work: there is a race, but for
"normal" requests I think it is about right - the answer lies in a
comment in rdma_request:
/* When an error occurs between posting the recv and the send,
 * there will be a receive context posted without a pending request.
 * Since there is no way to "un-post" it, we remember it and skip
 * post_recv() for the next request.
 * So here, see if we are this `next request' and need to absorb an
 * excess rc. If yes, then drop and free our own, and do not
 * post_recv().
 */
Basically, receive buffers are posted to a queue, and we can't take one
back out, so we just don't post the next one.
There is one problem though - if the server handles the original request
before getting the flush, the receive buffer will be consumed and we
won't post a new one, so we'll starve the reception queue.
I'm afraid I don't have any bright idea there...
While we are on reception buffer issues, there is another problem with
the queue of receive buffers, even without flush, in the following
scenario:
- post a buffer for tag 0, on a hanging request
- post a buffer for tag 1
- the reply for tag 1 comes in on the buffer posted for tag 0
- post another request with tag 1: its buffer is already in the queue,
and we don't know that we can post the buffer associated with tag 0 back.
I haven't found how to reproduce this reliably yet, but a dd with a
block size of 1MB and one with a block size of 10B in parallel brought
the mountpoint down (and the whole server was completely unavailable for
the duration of the dd - TCP sessions timed out, and I even got I/O
errors on the local disk :D)
Regards,
--
Dominique Martinet