netdev - Re: [PATCH net-next v2 2/8] tipc: compensate for double accounting in socket rcv buffer

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1400068021.7973.78.camel@edumazet-glaptop2.roam.corp.google.com>
Date:	Wed, 14 May 2014 04:47:01 -0700
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Jon Maloy <jon.maloy@...csson.com>
Cc:	davem@...emloft.net, netdev@...r.kernel.org,
	Paul Gortmaker <paul.gortmaker@...driver.com>,
	erik.hugne@...csson.com, ying.xue@...driver.com, maloy@...jonn.com,
	tipc-discussion@...ts.sourceforge.net
Subject: Re: [PATCH net-next v2 2/8] tipc: compensate for double accounting
 in socket rcv buffer

On Wed, 2014-05-14 at 05:39 -0400, Jon Maloy wrote:
> The function net/core/sock.c::__release_sock() runs a tight loop
> to move buffers from the socket backlog queue to the receive queue.
> 
> As a security measure, sk_backlog.len of the receiving socket
> is not set to zero until after the loop is finished, i.e., until
> the whole backlog queue has been transferred to the receive queue.
> During this transfer, the data that has already been moved is counted
> both in the backlog queue and the receive queue, hence giving an
> incorrect picture of the available queue space for new arriving buffers.
> 
> This leads to unnecessary rejection of buffers by sk_add_backlog(),
> which in TIPC leads to unnecessarily broken connections.
> 
> In this commit, we compensate for this double accounting by adding
> a counter that keeps track of it. The function socket.c::backlog_rcv()
> receives buffers one by one from __release_sock(), and adds them to the
> socket receive queue. If the transfer is successful, it increases a new
> atomic counter 'tipc_sock::dupl_rcvcnt' with 'truesize' of the
> transferred buffer. If a new buffer arrives during this transfer and
> finds the socket busy (owned), we attempt to add it to the backlog.
> However, when sk_add_backlog() is called, we adjust the 'limit'
> parameter with the value of the new counter, so that the risk of
> inadvertent rejection is eliminated.
> 
> It should be noted that this change does not invalidate the original
> purpose of zeroing 'sk_backlog.len' after the full transfer. We set an
> upper limit for dupl_rcvcnt, so that if a 'wild' sender (i.e., one that
> doesn't respect the send window) keeps pumping in buffers to
> sk_add_backlog(), he will eventually reach an upper limit,
> (2 x TIPC_CONN_OVERLOAD_LIMIT). After that, no messages can be added
> to the backlog, and the connection will be broken. Ordinary, well-
> behaved senders will never reach this buffer limit at all.

This looks very complicated and hides some underlying problem.

Why TCP has no problem, but TIPC has it ?

It seems you have too low sk_rcvbuf, or you advertise too big windows to
senders.

If backlog content is so big it eventually makes TIPC drops incoming
packets, then you have a scheduling issue, or unexpected bufferbloat.

Maybe TIPC holds socket lock too long in some place ?



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html