Message-Id: <20190515004610.102519-1-tracywwnj@gmail.com>
Date: Tue, 14 May 2019 17:46:10 -0700
From: Wei Wang <tracywwnj@...il.com>
To: David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Cc: Martin KaFai Lau <kafai@...com>, Wei Wang <weiwan@...gle.com>,
Mikael Magnusson <mikael.kernel@...ts.m7n.se>,
David Ahern <dsahern@...il.com>,
Eric Dumazet <edumazet@...gle.com>
Subject: [PATCH net] ipv6: fix src addr routing with the exception table
From: Wei Wang <weiwan@...gle.com>
When a route cache entry is inserted into the exception table and
source address routing is in use, the key is generated from both
src_addr and dest_addr. However, the current lookup logic always assumes
that the src_addr used to generate the key is a /128 host address. This
is not true in the following scenarios:
1. When the route is a gateway route or does not have a next hop.
(rt6_is_gw_or_nonexthop() == true)
2. When calling ip6_rt_cache_alloc(), saddr is passed in as NULL.
This means that, when looking for a route cache entry in the exception
table, we have to do the lookup twice: first with the passed-in /128
host address, then with the src_addr stored in fib6_info.
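The two-step lookup described above can be sketched in userspace C. This
is only an illustrative model, not the kernel code: struct addr6,
struct ex_entry, struct route, find_ex() and find_cached() are all
hypothetical names, and a linear scan stands in for the hashed exception
buckets.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct in6_addr. */
struct addr6 { unsigned char b[16]; };

/* Hypothetical exception entry, keyed by (dst, src). */
struct ex_entry {
	struct addr6 dst;
	struct addr6 src;
	int has_src;		/* 0 when the entry was keyed on dst only */
	unsigned int mtu;
};

/* Hypothetical route: a source prefix (like fib6_src) plus exceptions. */
struct route {
	struct addr6 src_prefix;
	int src_plen;		/* non-zero: route lives in a subtree */
	struct ex_entry ex[4];
	size_t n_ex;
};

static int addr_eq(const struct addr6 *a, const struct addr6 *b)
{
	return memcmp(a->b, b->b, sizeof(a->b)) == 0;
}

/* One lookup with one fixed key (linear scan instead of a hash). */
static const struct ex_entry *find_ex(const struct route *rt,
				      const struct addr6 *daddr,
				      const struct addr6 *src_key)
{
	for (size_t i = 0; i < rt->n_ex; i++) {
		const struct ex_entry *e = &rt->ex[i];

		if (!addr_eq(&e->dst, daddr))
			continue;
		if (!src_key && !e->has_src)
			return e;
		if (src_key && e->has_src && addr_eq(&e->src, src_key))
			return e;
	}
	return NULL;
}

/* Try the flow's /128 saddr first, then the route's own source prefix. */
static const struct ex_entry *find_cached(const struct route *rt,
					  const struct addr6 *daddr,
					  const struct addr6 *saddr)
{
	const struct ex_entry *e;

	assert(rt && daddr);
	if (!rt->src_plen)
		return find_ex(rt, daddr, NULL);

	e = find_ex(rt, daddr, saddr);
	if (!e)
		e = find_ex(rt, daddr, &rt->src_prefix);
	return e;
}
```

In this model, an entry stored under the route's source prefix is missed
by a single find_ex() call keyed on the flow's host address, while
find_cached() reaches it on the second pass.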
This solves the PMTU discovery issue reported by Mikael Magnusson, where
a route cache entry with a lower MTU is created for a gateway route with
src addr, but the lookup code is not able to find that entry.
Fixes: 2b760fcf5cfb ("ipv6: hook up exception table to store dst cache")
Reported-by: Mikael Magnusson <mikael.kernel@...ts.m7n.se>
Bisected-by: David Ahern <dsahern@...il.com>
Signed-off-by: Wei Wang <weiwan@...gle.com>
Acked-by: Eric Dumazet <edumazet@...gle.com>
---
net/ipv6/route.c | 33 ++++++++++++++++++++++++++++-----
1 file changed, 28 insertions(+), 5 deletions(-)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 23a20d62daac..c36900a07a78 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1574,23 +1574,36 @@ static struct rt6_info *rt6_find_cached_rt(const struct fib6_result *res,
struct rt6_exception *rt6_ex;
struct rt6_info *ret = NULL;
- bucket = rcu_dereference(res->f6i->rt6i_exception_bucket);
-
#ifdef CONFIG_IPV6_SUBTREES
/* fib6i_src.plen != 0 indicates f6i is in subtree
* and exception table is indexed by a hash of
* both fib6_dst and fib6_src.
- * Otherwise, the exception table is indexed by
- * a hash of only fib6_dst.
+ * However, the src addr used to create the hash
+ * might not be exactly the passed in saddr which
+ * is a /128 addr from the flow.
+ * So we need to use f6i->fib6_src to redo lookup
+ * if the passed in saddr does not find anything.
+ * (See the logic in ip6_rt_cache_alloc() on how
+ * rt->rt6i_src is updated.)
*/
if (res->f6i->fib6_src.plen)
src_key = saddr;
+find_ex:
#endif
+ bucket = rcu_dereference(res->f6i->rt6i_exception_bucket);
rt6_ex = __rt6_find_exception_rcu(&bucket, daddr, src_key);
if (rt6_ex && !rt6_check_expired(rt6_ex->rt6i))
ret = rt6_ex->rt6i;
+#ifdef CONFIG_IPV6_SUBTREES
+ /* Use fib6_src as src_key and redo lookup */
+ if (!ret && src_key == saddr) {
+ src_key = &res->f6i->fib6_src.addr;
+ goto find_ex;
+ }
+#endif
+
return ret;
}
@@ -2683,12 +2696,22 @@ u32 ip6_mtu_from_fib6(const struct fib6_result *res,
#ifdef CONFIG_IPV6_SUBTREES
if (f6i->fib6_src.plen)
src_key = saddr;
+find_ex:
#endif
-
bucket = rcu_dereference(f6i->rt6i_exception_bucket);
rt6_ex = __rt6_find_exception_rcu(&bucket, daddr, src_key);
if (rt6_ex && !rt6_check_expired(rt6_ex->rt6i))
mtu = dst_metric_raw(&rt6_ex->rt6i->dst, RTAX_MTU);
+#ifdef CONFIG_IPV6_SUBTREES
+ /* Similar logic as in rt6_find_cached_rt().
+ * We need to use f6i->fib6_src to redo lookup in exception
+ * table if saddr did not yield any result.
+ */
+ else if (src_key == saddr) {
+ src_key = &f6i->fib6_src.addr;
+ goto find_ex;
+ }
+#endif
if (likely(!mtu)) {
struct net_device *dev = nh->fib_nh_dev;
--
2.21.0.1020.gf2820cf01a-goog