Message-ID: <1575935566-18786-2-git-send-email-jon.maloy@ericsson.com>
Date: Tue, 10 Dec 2019 00:52:44 +0100
From: Jon Maloy <jon.maloy@...csson.com>
To: <davem@...emloft.net>, <netdev@...r.kernel.org>
CC: <tung.q.nguyen@...tech.com.au>, <hoang.h.le@...tech.com.au>,
<jon.maloy@...csson.com>, <lxin@...hat.com>, <shuali@...hat.com>,
<ying.xue@...driver.com>, <edumazet@...gle.com>,
<tipc-discussion@...ts.sourceforge.net>
Subject: [net-next 1/3] tipc: eliminate gap indicator from ACK messages
When we increase the link send window we sometimes observe the
following scenario:
1) A packet #N arrives out of order far ahead of a sequence of older
packets which are still under way. The packet is added to the
deferred queue.
2) The missing packets arrive in sequence, and for every 16th of them
an ACK is sent back to the sender, as it should be.
3) When building those ACK messages, we check whether there is a gap
between the link's 'rcv_nxt' and the first packet in the deferred
queue. This is always the case until packet #N-1 arrives, so a
'gap' indicator is added to each of them, effectively turning them
into NACK messages.
4) When those NACKs arrive at the sender, all the requested
retransmissions are done, since it is a first-time request.
This sometimes leads to a huge number of redundant retransmissions,
causing a drop in max throughput. The problem gets worse when a
later commit introduces variable window congestion control,
since it drops the link back to 'fast recovery' much more often
than necessary.
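To make the scenario concrete, here is a small user-space model in C
(not kernel code). It assumes one ACK per 16 in-order packets and
computes the gap as "first deferred seqno minus rcv_nxt" in 16-bit
arithmetic, as the diff in this patch does. The names count_gap_acks()
and ACK_INTERVAL are made up for illustration only.

```c
#include <stdint.h>

/* Assumption: one ACK is generated per 16 packets received in order. */
#define ACK_INTERVAL 16

/*
 * Hypothetical model: packet 'n' is already in the deferred queue and
 * the packets rcv_nxt .. n-1 now arrive in sequence. Returns how many
 * of the periodic ACKs would carry a non-zero gap, i.e. act as NACKs.
 */
static unsigned int count_gap_acks(uint16_t rcv_nxt, uint16_t n)
{
	unsigned int gap_acks = 0;
	unsigned int rcvd = 0;

	while (rcv_nxt != n) {
		rcv_nxt++;	/* one missing packet arrives in order */
		rcvd++;
		if (rcvd % ACK_INTERVAL == 0) {
			/* 16-bit subtraction tolerates seqno wrap-around */
			uint16_t gap = (uint16_t)(n - rcv_nxt);

			if (gap)
				gap_acks++;
		}
	}
	return gap_acks;
}
```

In this model, count_gap_acks(100, 200) counts six ACKs, and each of
them reports a gap even though nothing is actually lost.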
We now fix this by not sending any 'gap' indicator in regular ACK
messages. We already have a mechanism for sending explicit NACKs
in place, and this is sufficient to keep up the packet flow.
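The resulting rule for the gap field can be sketched as a stand-alone
C function. This is a simplified model under stated assumptions, not
the kernel implementation: state_msg_gap() and its parameters are
hypothetical, with 'explicit_gap' standing in for the value a caller
such as the explicit NACK path passes in.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper modelling which gap a STATE message reports.
 * A gap derived from the deferred queue is only used for probes and
 * probe replies; a regular ACK reports whatever the caller passed in
 * (0 unless it is an explicit NACK).
 */
static uint16_t state_msg_gap(bool probe, bool probe_reply,
			      uint16_t explicit_gap, bool defq_empty,
			      uint16_t first_deferred, uint16_t rcv_nxt)
{
	uint16_t rcvgap = explicit_gap;

	if ((probe || probe_reply) && !defq_empty)
		rcvgap = (uint16_t)(first_deferred - rcv_nxt);

	return rcvgap;
}
```

With a deferred packet #200 and rcv_nxt at 116, a regular ACK in this
model reports gap 0, while a probe reports 84.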
Acked-by: Ying Xue <ying.xue@...driver.com>
Signed-off-by: Jon Maloy <jon.maloy@...csson.com>
---
net/tipc/link.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/net/tipc/link.c b/net/tipc/link.c
index 24d4d10..6d86446 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1521,7 +1521,8 @@ static int tipc_link_build_nack_msg(struct tipc_link *l,
struct sk_buff_head *xmitq)
{
u32 def_cnt = ++l->stats.deferred_recv;
- u32 defq_len = skb_queue_len(&l->deferdq);
+ struct sk_buff_head *dfq = &l->deferdq;
+ u32 defq_len = skb_queue_len(dfq);
int match1, match2;
if (link_is_bc_rcvlink(l)) {
@@ -1532,8 +1533,12 @@ static int tipc_link_build_nack_msg(struct tipc_link *l,
return 0;
}
- if (defq_len >= 3 && !((defq_len - 3) % 16))
- tipc_link_build_proto_msg(l, STATE_MSG, 0, 0, 0, 0, 0, xmitq);
+ if (defq_len >= 3 && !((defq_len - 3) % 16)) {
+ u16 rcvgap = buf_seqno(skb_peek(dfq)) - l->rcv_nxt;
+
+ tipc_link_build_proto_msg(l, STATE_MSG, 0, 0,
+ rcvgap, 0, 0, xmitq);
+ }
return 0;
}
@@ -1631,7 +1636,7 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe,
if (!tipc_link_is_up(l) && (mtyp == STATE_MSG))
return;
- if (!skb_queue_empty(dfq))
+ if ((probe || probe_reply) && !skb_queue_empty(dfq))
rcvgap = buf_seqno(skb_peek(dfq)) - l->rcv_nxt;
skb = tipc_msg_create(LINK_PROTOCOL, mtyp, INT_H_SIZE,
@@ -2079,7 +2084,6 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb,
if (rcvgap || reply)
tipc_link_build_proto_msg(l, STATE_MSG, 0, reply,
rcvgap, 0, 0, xmitq);
-
rc |= tipc_link_advance_transmq(l, ack, gap, ga, xmitq);
/* If NACK, retransmit will now start at right position */
--
2.1.4