Message-Id: <20221218234801.579114-1-jmaxwell37@gmail.com>
Date: Mon, 19 Dec 2022 10:48:01 +1100
From: Jon Maxwell <jmaxwell37@...il.com>
To: davem@...emloft.net
Cc: edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
yoshfuji@...ux-ipv6.org, dsahern@...nel.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
Jon Maxwell <jmaxwell37@...il.com>
Subject: [net-next] ipv6: fix routing cache overflow for raw sockets
Sending IPv6 packets in a loop via a raw socket triggers an issue where a
route is cloned by ip6_rt_cache_alloc() for each packet sent. This quickly
exhausts the IPv6 max_size threshold, which defaults to 4096, resulting in
these warnings:
[1] 99.187805] dst_alloc: 7728 callbacks suppressed
[2] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
.
.
[300] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
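The threshold named in the warning can be inspected from userspace before reproducing the problem. The following is a minimal sketch, assuming procfs is mounted at /proc so that the sysctl net.ipv6.route.max_size is exposed as /proc/sys/net/ipv6/route/max_size; the helper name read_sysctl_long() is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: read one integer from a procfs sysctl file.
 * Returns 0 and stores the value in *out on success, -1 on any failure. */
static int read_sysctl_long(const char *path, long *out)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;              /* file missing or not readable */

    int ok = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

Calling read_sysctl_long("/proc/sys/net/ipv6/route/max_size", &v) on an affected system would be expected to report the 4096 default mentioned above, unless it has been tuned.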
When this happens, the packet is dropped and sendto() fails with ENETUNREACH
(errno 101, "Network is unreachable"):
# ./a.out -s
remaining pkt 200557 errno 101
remaining pkt 196462 errno 101
.
.
remaining pkt 126821 errno 101
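The source of the a.out reproducer is not included here. As a rough sketch of what such a sender might look like, the code below opens an AF_INET6 raw socket with IPPROTO_RAW (which, as far as I understand, enables header-include mode), sends a bare IPv6 header in a loop, and counts ENETUNREACH failures. The function names, the hop limit, and the empty payload are assumptions made for illustration. The socket needs CAP_NET_RAW, and the behaviour is only expected when the destination resolves to a route without a gateway.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <netinet/ip6.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill a bare 40-byte IPv6 header with no payload (next header 59). */
static void build_ipv6_hdr(struct ip6_hdr *h, const struct in6_addr *src,
                           const struct in6_addr *dst)
{
    memset(h, 0, sizeof(*h));
    h->ip6_flow = htonl(6u << 28);  /* version 6, tclass 0, flow label 0 */
    h->ip6_plen = htons(0);         /* no payload follows the header */
    h->ip6_nxt  = IPPROTO_NONE;     /* 59: no next header */
    h->ip6_hlim = 64;               /* arbitrary hop limit */
    h->ip6_src  = *src;
    h->ip6_dst  = *dst;
}

/* Send `count` packets; return how many sendto() calls failed with
 * ENETUNREACH, or -1 if the addresses or the socket could not be set up. */
static long send_raw6_loop(const char *src_str, const char *dst_str, long count)
{
    struct in6_addr src, dst;
    if (inet_pton(AF_INET6, src_str, &src) != 1 ||
        inet_pton(AF_INET6, dst_str, &dst) != 1)
        return -1;

    int fd = socket(AF_INET6, SOCK_RAW, IPPROTO_RAW);
    if (fd < 0)
        return -1;                  /* typically EPERM without CAP_NET_RAW */

    struct ip6_hdr hdr;
    build_ipv6_hdr(&hdr, &src, &dst);

    struct sockaddr_in6 to;
    memset(&to, 0, sizeof(to));     /* sin6_port stays 0 for raw sockets */
    to.sin6_family = AF_INET6;
    to.sin6_addr   = dst;

    long unreachable = 0;
    for (long i = 0; i < count; i++) {
        if (sendto(fd, &hdr, sizeof(hdr), 0,
                   (struct sockaddr *)&to, sizeof(to)) < 0 &&
            errno == ENETUNREACH)
            unreachable++;
    }
    close(fd);
    return unreachable;
}
```

On an unpatched kernel, a large enough count would be expected to return a non-zero value once the route cache fills.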
Fix this by adding a flag that prevents the cloning of routes for raw
sockets. The IPv6 routing code then uses per-cpu routes instead, which
prevents packet drops due to max_size overflow.
IPv4 is not affected because it has a very large default max_size.
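The effect of the new flag on the clone decision in ip6_pol_route() can be modelled in userspace. This is only a sketch of the branch condition, not kernel code: the function name wants_uncached_clone() is hypothetical, and has_gateway stands in for a non-zero res.nh->fib_nh_gw_family.

```c
#include <stdbool.h>

#define FLOWI_FLAG_ANYSRC   0x01
#define FLOWI_FLAG_KNOWN_NH 0x02
#define FLOWI_FLAG_SKIP_RAW 0x04    /* added by this patch */

/* Model of the condition guarding the RTF_CACHE clone: a clone is made
 * only when KNOWN_NH is set, SKIP_RAW is clear, and the nexthop has no
 * gateway. Otherwise the lookup falls through to the per-cpu route. */
static bool wants_uncached_clone(unsigned char flags, bool has_gateway)
{
    return (flags & FLOWI_FLAG_KNOWN_NH) &&
           !(flags & FLOWI_FLAG_SKIP_RAW) &&
           !has_gateway;
}
```

With the patch, rawv6_sendmsg() sets both flags in the hdrincl case, so the model returns false for it and no clone is allocated.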
Signed-off-by: Jon Maxwell <jmaxwell37@...il.com>
---
include/net/flow.h | 1 +
net/ipv6/raw.c | 2 +-
net/ipv6/route.c | 1 +
3 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/include/net/flow.h b/include/net/flow.h
index 2f0da4f0318b..30b8973ffb4b 100644
--- a/include/net/flow.h
+++ b/include/net/flow.h
@@ -37,6 +37,7 @@ struct flowi_common {
 	__u8	flowic_flags;
 #define FLOWI_FLAG_ANYSRC		0x01
 #define FLOWI_FLAG_KNOWN_NH		0x02
+#define FLOWI_FLAG_SKIP_RAW		0x04
 	__u32	flowic_secid;
 	kuid_t  flowic_uid;
 	struct flowi_tunnel flowic_tun_key;
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index a06a9f847db5..0b89a7e66d09 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -884,7 +884,7 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	security_sk_classify_flow(sk, flowi6_to_flowi_common(&fl6));
 
 	if (hdrincl)
-		fl6.flowi6_flags |= FLOWI_FLAG_KNOWN_NH;
+		fl6.flowi6_flags |= FLOWI_FLAG_KNOWN_NH | FLOWI_FLAG_SKIP_RAW;
 
 	if (ipc6.tclass < 0)
 		ipc6.tclass = np->tclass;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index e74e0361fd92..beae0bd61738 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2226,6 +2226,7 @@ struct rt6_info *ip6_pol_route(struct net *net, struct fib6_table *table,
 	if (rt) {
 		goto out;
 	} else if (unlikely((fl6->flowi6_flags & FLOWI_FLAG_KNOWN_NH) &&
+			    !(fl6->flowi6_flags & FLOWI_FLAG_SKIP_RAW) &&
 			    !res.nh->fib_nh_gw_family)) {
 		/* Create a RTF_CACHE clone which will not be
 		 * owned by the fib6 tree. It is for the special case where
--
2.31.1