[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1343668602.2667.6.camel@bwh-desktop.uk.solarflarecom.com>
Date: Mon, 30 Jul 2012 18:16:42 +0100
From: Ben Hutchings <bhutchings@...arflare.com>
To: David Miller <davem@...emloft.net>
CC: <netdev@...r.kernel.org>, <linux-net-drivers@...arflare.com>
Subject: [PATCH net 1/2] tcp: Limit number of segments generated by GSO per
skb
A peer (or local user) may cause TCP to use a nominal MSS of as little
as 88 (actual MSS of 76 with timestamps). Given that we have a
sufficiently prodigious local sender and the peer ACKs quickly enough,
it is nevertheless possible to grow the window for such a connection
to the point that we will try to send just under 64K at once. This
results in a single skb that expands to 861 segments.
In some drivers with TSO support, such an skb will require hundreds of
DMA descriptors; a substantial fraction of a TX ring or even more than
a full ring. The TX queue selected for the skb may stall and trigger
the TX watchdog repeatedly (since the problem skb will be retried
after the TX reset). This particularly affects sfc, for which the
issue is designated as CVE-2012-3412. However it may be that some
hardware or firmware also fails to handle such an extreme TSO request
correctly.
Therefore, limit the number of segments per skb to 100. This should
make no difference to behaviour unless the actual MSS is less than
about 700.
Signed-off-by: Ben Hutchings <bhutchings@...arflare.com>
---
include/net/tcp.h | 3 +++
net/ipv4/tcp.c | 4 +++-
net/ipv4/tcp_output.c | 17 +++++++++--------
3 files changed, 15 insertions(+), 9 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index e19124b..098a2d0 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -70,6 +70,9 @@ extern void tcp_time_wait(struct sock *sk, int state, int timeo);
/* The least MTU to use for probing */
#define TCP_BASE_MSS 512
+/* Maximum number of segments we may require GSO to generate from an skb. */
+#define TCP_MAX_GSO_SEGS 100
+
/* After receiving this amount of duplicate ACKs fast retransmit starts. */
#define TCP_FASTRETRANS_THRESH 3
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index e7e6eea..51d8daf 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -811,7 +811,9 @@ static unsigned int tcp_xmit_size_goal(struct sock *sk, u32 mss_now,
old_size_goal + mss_now > xmit_size_goal)) {
xmit_size_goal = old_size_goal;
} else {
- tp->xmit_size_goal_segs = xmit_size_goal / mss_now;
+ tp->xmit_size_goal_segs =
+ min_t(u32, xmit_size_goal / mss_now,
+ TCP_MAX_GSO_SEGS);
xmit_size_goal = tp->xmit_size_goal_segs * mss_now;
}
}
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 33cd065..c86c288 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1522,21 +1522,21 @@ static void tcp_cwnd_validate(struct sock *sk)
* when we would be allowed to send the split-due-to-Nagle skb fully.
*/
static unsigned int tcp_mss_split_point(const struct sock *sk, const struct sk_buff *skb,
- unsigned int mss_now, unsigned int cwnd)
+ unsigned int mss_now, unsigned int max_segs)
{
const struct tcp_sock *tp = tcp_sk(sk);
- u32 needed, window, cwnd_len;
+ u32 needed, window, max_len;
window = tcp_wnd_end(tp) - TCP_SKB_CB(skb)->seq;
- cwnd_len = mss_now * cwnd;
+ max_len = mss_now * max_segs;
- if (likely(cwnd_len <= window && skb != tcp_write_queue_tail(sk)))
- return cwnd_len;
+ if (likely(max_len <= window && skb != tcp_write_queue_tail(sk)))
+ return max_len;
needed = min(skb->len, window);
- if (cwnd_len <= needed)
- return cwnd_len;
+ if (max_len <= needed)
+ return max_len;
return needed - needed % mss_now;
}
@@ -1999,7 +1999,8 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
limit = mss_now;
if (tso_segs > 1 && !tcp_urg_mode(tp))
limit = tcp_mss_split_point(sk, skb, mss_now,
- cwnd_quota);
+ min(cwnd_quota,
+ TCP_MAX_GSO_SEGS));
if (skb->len > limit &&
unlikely(tso_fragment(sk, skb, limit, mss_now, gfp)))
--
1.7.7.6
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists