Message-Id: <20220121011941.1123392-1-kuba@kernel.org>
Date: Thu, 20 Jan 2022 17:19:41 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: davem@...emloft.net, edumazet@...gle.com
Cc: dsahern@...il.com, pabeni@...hat.com, herbert@...dor.apana.org.au,
netdev@...r.kernel.org, Jakub Kicinski <kuba@...nel.org>
Subject: [PATCH net] ipv6: gro: flush instead of assuming different flows on hop_limit mismatch
IPv6 GRO considers packets to belong to different flows when their
hop_limit is different. This seems counter-intuitive: the flow is
the same. hop_limit may vary because of various bugs or hacks, but
that doesn't mean it's okay for GRO to reorder packets.

The practical impact of this problem on overall TCP performance
is unclear, but TCP itself detects this reordering and bumps
TCPSACKReorder, resulting in user complaints.

Note that the code plays an easy-to-miss trick: it casts the address
of nexthdr to a u16 pointer and so compares nexthdr and hop_limit in
one go.

Coalesce the flush setting to reduce the instruction count a touch.
Fixes: 787e92083601 ("ipv6: Add GRO support")
Signed-off-by: Jakub Kicinski <kuba@...nel.org>
---
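For readers unfamiliar with the trick mentioned above, here is a minimal
userspace sketch of the before/after semantics. It is not kernel code:
struct hdr_tail and the three helper functions are hypothetical names made
up for illustration, and the struct models only the two adjacent one-byte
fields (nexthdr, hop_limit) rather than the full struct ipv6hdr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the two adjacent one-byte header fields. */
struct hdr_tail {
	uint8_t nexthdr;
	uint8_t hop_limit;
};

/* The 16-bit compare only works because the fields are back to back. */
static_assert(offsetof(struct hdr_tail, hop_limit) ==
	      offsetof(struct hdr_tail, nexthdr) + 1,
	      "nexthdr and hop_limit must be adjacent");
static_assert(sizeof(struct hdr_tail) == sizeof(uint16_t),
	      "struct must be exactly two bytes");

/* Old behaviour: a single 16-bit compare covers both fields, so a
 * hop_limit mismatch alone makes the packets look like different flows. */
static int old_same_flow(const struct hdr_tail *a, const struct hdr_tail *b)
{
	uint16_t x, y;

	memcpy(&x, a, sizeof(x));
	memcpy(&y, b, sizeof(y));
	return x == y;
}

/* New behaviour: only nexthdr decides flow membership. */
static int new_same_flow(const struct hdr_tail *a, const struct hdr_tail *b)
{
	return a->nexthdr == b->nexthdr;
}

/* New behaviour: a hop_limit mismatch asks for a flush instead. */
static int new_flush(const struct hdr_tail *a, const struct hdr_tail *b)
{
	return !!(a->hop_limit ^ b->hop_limit);
}
```

With two headers that differ only in hop_limit, old_same_flow() reports a
different flow, while new_same_flow() keeps them in one flow and
new_flush() returns 1 so the held packet is flushed in order.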
net/ipv6/ip6_offload.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index b29e9ba5e113..570071a87e71 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -249,7 +249,7 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
 		if ((first_word & htonl(0xF00FFFFF)) ||
 		    !ipv6_addr_equal(&iph->saddr, &iph2->saddr) ||
 		    !ipv6_addr_equal(&iph->daddr, &iph2->daddr) ||
-		    *(u16 *)&iph->nexthdr != *(u16 *)&iph2->nexthdr) {
+		    iph->nexthdr != iph2->nexthdr) {
 not_same_flow:
 			NAPI_GRO_CB(p)->same_flow = 0;
 			continue;
@@ -260,8 +260,9 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
 				goto not_same_flow;
 		}
 		/* flush if Traffic Class fields are different */
-		NAPI_GRO_CB(p)->flush |= !!(first_word & htonl(0x0FF00000));
-		NAPI_GRO_CB(p)->flush |= flush;
+		NAPI_GRO_CB(p)->flush |= flush |
+					 !!((first_word & htonl(0x0FF00000)) |
+					    (iph->hop_limit ^ iph2->hop_limit));
 
 		/* If the previous IP ID value was based on an atomic
 		 * datagram we can overwrite the value and ignore it.
--
2.31.1