Message-Id: <1442385255-27014-1-git-send-email-wen.gang.wang@oracle.com>
Date: Wed, 16 Sep 2015 14:34:15 +0800
From: Wengang Wang <wen.gang.wang@...cle.com>
To: netdev@...r.kernel.org
Cc: wen.gang.wang@...cle.com
Subject: [PATCH] ip: find correct route for socket which is not bound to a device
For multicast, we should also find a valid route (and thus get a meaningful
PMTU) for packets sent on a socket that is not bound to a device
(sk_bound_dev_if being 0).
From the socket(7) man page:
SO_BINDTODEVICE
Bind this socket to a particular device like “eth0”, as
specified in the passed interface name. If the name is an
empty string or the option length is zero, the socket
device binding is removed. The passed option is a
variable-length null-terminated interface name string with
the maximum size of IFNAMSIZ. If a socket is bound to an
interface, only packets received from that particular
interface are processed by the socket. Note that this works
only for some socket types, particularly AF_INET sockets.
It is not supported for packet sockets (use normal bind(2)
there).
The man page does not say that packets sent on a socket that is not bound to
a device will not be routed.
The problem I hit is that all multicast packets are dropped by the kernel on
the sending host. The lower layer is IPoIB with an MTU of 7000, and I was
sending multicast packets of length 4096. Inside IPoIB the first send is
dropped because it exceeds the internal packet size limit, mcast_mtu, which
is 2044. IPoIB therefore calls ip_rt_update_pmtu (indirectly) to set the
path MTU. A correct route is configured for the multicast destination, so
setting the PMTU succeeds, and the next multicast packet (to the same target)
is expected to succeed, since it would be fragmented according to the PMTU
that was just set. In fact, the second and later multicast packets are
dropped too. The reason is that the route lookup (fib_lookup) is skipped
because the socket is not bound to a device (sk_bound_dev_if being 0). With
the patch proposed here applied, it works fine.
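
To make the expected fragmentation concrete, here is a small standalone
sketch of the arithmetic, not kernel code. It assumes the 4096 bytes are UDP
payload, that the IPv4 header carries no options (20 bytes), and that the
2044-byte mcast_mtu is used directly as the path MTU. The function names are
invented for this example.

```c
#include <assert.h>

#define IPV4_HDR_LEN 20u /* assumption: IPv4 header without options */
#define UDP_HDR_LEN   8u

/*
 * Largest IP payload a single fragment can carry for a given path MTU.
 * Every fragment except the last must carry a multiple of 8 bytes,
 * because the fragment offset field counts 8-byte units.
 */
static unsigned int frag_payload_max(unsigned int pmtu)
{
	assert(pmtu >= IPV4_HDR_LEN + 8u);
	return (pmtu - IPV4_HDR_LEN) & ~7u;
}

/*
 * Number of IPv4 packets needed to carry one UDP datagram with
 * udp_payload bytes of data over a path with the given MTU.
 */
static unsigned int udp_fragment_count(unsigned int udp_payload,
				       unsigned int pmtu)
{
	unsigned int ip_payload = udp_payload + UDP_HDR_LEN;
	unsigned int per_frag;

	if (ip_payload + IPV4_HDR_LEN <= pmtu)
		return 1; /* fits without fragmentation */

	per_frag = frag_payload_max(pmtu);
	return (ip_payload + per_frag - 1) / per_frag;
}
```

Under these assumptions, udp_fragment_count(4096, 2044) gives 3 fragments
(2024 + 2024 + 56 bytes of IP payload), while with the 7000-byte device MTU
the datagram is sent as a single packet, which is the one IPoIB drops.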
Signed-off-by: Wengang Wang <wen.gang.wang@...cle.com>
---
net/ipv4/route.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 5f4a556..032481a 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2097,7 +2097,7 @@ struct rtable *__ip_route_output_key(struct net *net, struct flowi4 *fl4)
 			 */
 
 			fl4->flowi4_oif = dev_out->ifindex;
-			goto make_route;
+			goto lookup;
 		}
 
 		if (!(fl4->flowi4_flags & FLOWI_FLAG_ANYSRC)) {
@@ -2153,6 +2153,7 @@ struct rtable *__ip_route_output_key(struct net *net, struct flowi4 *fl4)
 		goto make_route;
 	}
 
+lookup:
 	if (fib_lookup(net, fl4, &res, 0)) {
 		res.fi = NULL;
 		res.table = NULL;
--
2.1.0