netdev - Re: [V9fs-developer] [PATCH] net: trans

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <20130726070138.GA28421@nautica>
Date:	Fri, 26 Jul 2013 09:01:38 +0200
From:	Dominique Martinet <dominique.martinet@....fr>
To:	Eric Van Hensbergen <ericvh@...il.com>
Cc:	Latchesar Ionkov <lucho@...kov.net>, pebolle@...cali.nl,
	netdev@...r.kernel.org,
	linux-kernel <linux-kernel@...r.kernel.org>, andi@...zian.org,
	rminnich@...dia.gov,
	V9FS Developers <v9fs-developer@...ts.sourceforge.net>,
	David Miller <davem@...emloft.net>
Subject: Re: [V9fs-developer] [PATCH] net: trans_rdma: remove unused
 function

I think I need to stop sending mails before triple-checking things!
So sorry for the multiple mails again.

Dominique Martinet wrote on Thu, Jul 25, 2013 :
> [rdma_cancelled]
> There is one problem though - if the server handles the original request
> before getting the flush, the receive buffer will be consumed and we
> won't send a new one, so we'll starve the reception queue.
> I'm afraid I don't have any bright idea there...

This still looks correct to me.


> While we are on reception buffer issues, there is another problem with
> the queue of receive buffers, even without flush, in the following
> scenario:
>  - post a buffer for tag 0, on a hanging request
>  - post a buffer for tag 1
>  - reply for tag 1 will come on buffer from tag 0
>  - post another request with tag 1.. the buffer already is in the queue,
> and we don't know we can post the buffer associated with tag 0 back.

It actually looks like the reply buffers are swapped properly - taken
out of the req struct into the context on send, then given back to the
appropriate req on reception, so on normal operation there's no problem
with what I described - sorry for crying wolf.
 
> I haven't found how to reproduce this perfectly yet, but a dd with
> blocksize 1MB and one with blocksize 10B in parallel brought the
> mountpoint down (and the whole server was completely unavailable for the
> duration of the dd - TCP sessions timed out, I even got IO errors on the
> local disk :D)

I need to run more tests to explain what happens with the two dds, but
it's easily reproductible with debugs on, I guess that helps with a race
somewhere.

Regards,
-- 
Dominique Martinet
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html