Message-ID: <1334698722.2472.71.camel@edumazet-glaptop>
Date: Tue, 17 Apr 2012 23:38:42 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Alexander Duyck <alexander.h.duyck@...el.com>
Cc: jeffrey.t.kirsher@...el.com,
"Skidmore, Donald C" <donald.c.skidmore@...el.com>,
Greg Rose <gregory.v.rose@...el.com>,
John Fastabend <john.r.fastabend@...el.com>,
Jesse Brandeburg <jesse.brandeburg@...el.com>,
netdev <netdev@...r.kernel.org>
Subject: TSO not 10G friendly if peer is close enough
After further analysis, I found we are hit badly by page refcount games:
when we transmit a full-size skb (64 KB), we can receive the ACK for
the first MSS of the frame before the NIC has finished sending the skb.
(It takes about 52 us to send a full TSO frame at 10 Gb/s, and maybe the
NIC delays the interrupt that triggers TX completion?)
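The 52 us figure can be checked with a quick back-of-the-envelope helper. This is only a sketch: wire_time_us() is a made-up name, and it assumes 64 KB means 65536 bytes of payload and ignores Ethernet/IP/TCP header overhead and inter-frame gaps.

```c
#include <assert.h>

/*
 * Hypothetical helper (not kernel code): time in microseconds needed to
 * serialize `bytes` of payload on a link running at `gbit_per_s` Gb/s.
 * Assumption: header overhead and inter-frame gaps are ignored.
 */
static double wire_time_us(double bytes, double gbit_per_s)
{
    assert(gbit_per_s > 0.0);
    /* bits / (bits per second), then converted from seconds to us */
    return bytes * 8.0 / (gbit_per_s * 1e9) * 1e6;
}
```

With these assumptions, wire_time_us(65536, 10.0) comes out a bit above 52 us, which is far longer than the round trip to a peer on the same switch, so an ACK for the first MSS can easily arrive first.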
In this case, tcp_trim_head() has to call pskb_expand_head(), because
the skb clone is still alive in the TX ring buffer.
pskb_expand_head() is really expensive: it has to perform about 32 atomic
operations on page refcounts.
Hmm... maybe tcp_trim_head() should not trim but only update an offset in
the skb... With some luck, the offset can reach skb->len when all data is
ACKnowledged...
Only on a retransmit would we really need to trim the skb, and by
that time the clone would have been freed too: no more pskb_expand_head()
calls.
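To make the idea concrete, here is a user-space toy model comparing eager trimming with the offset approach. It is a sketch under loud assumptions: struct toy_seg and all toy_* functions are invented for illustration, nothing here is struct sk_buff or existing kernel API, and the "expensive" path is reduced to a counter standing in for a pskb_expand_head() call.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a queued TCP segment. NOT struct sk_buff. */
struct toy_seg {
    size_t len;                 /* bytes physically held by the segment */
    size_t acked;               /* leading bytes ACKed but not yet trimmed */
    int shared_with_tx_clone;   /* nonzero while a clone sits in the TX ring */
    unsigned expensive_expands; /* simulated pskb_expand_head() calls */
};

/* Eager model: trim on every partial ACK, unsharing first if needed. */
static void toy_trim_eager(struct toy_seg *s, size_t n)
{
    assert(n <= s->len);
    if (s->shared_with_tx_clone) {
        s->expensive_expands++;      /* data is shared: pay for a copy */
        s->shared_with_tx_clone = 0; /* segment now has a private copy */
    }
    s->len -= n;
}

/* Lazy model: a partial ACK only advances an offset. */
static void toy_ack_lazy(struct toy_seg *s, size_t n)
{
    assert(s->acked + n <= s->len);
    s->acked += n;
}

/* Once the offset reaches len, the whole segment can simply be freed. */
static int toy_fully_acked(const struct toy_seg *s)
{
    return s->acked == s->len;
}

/* Retransmit path: only here is the pending trim applied for real. */
static void toy_trim_for_retransmit(struct toy_seg *s)
{
    if (s->acked == 0)
        return;
    if (s->shared_with_tx_clone) {
        s->expensive_expands++;      /* still shared: cannot avoid the cost */
        s->shared_with_tx_clone = 0;
    }
    s->len -= s->acked;
    s->acked = 0;
}
```

In this model, an early ACK on a 64 KB segment costs one simulated expand in the eager case and none in the lazy case. The retransmit path pays only if the clone is still in the TX ring, which should be rare by the time a retransmit timer fires.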