netdev - Re: [PATCH net-next] veth: extend features to support tunneling

Message-ID: <1384638027.8604.22.camel@edumazet-glaptop2.roam.corp.google.com>
Date:	Sat, 16 Nov 2013 13:40:27 -0800
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Or Gerlitz <or.gerlitz@...il.com>
Cc:	Alexei Starovoitov <ast@...mgrid.com>,
	David Miller <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Stephen Hemminger <stephen@...workplumber.org>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"Michael S. Tsirkin" <mst@...hat.com>,
	John Fastabend <john.r.fastabend@...el.com>
Subject: Re: [PATCH net-next] veth: extend features to support tunneling

On Sat, 2013-11-16 at 23:11 +0200, Or Gerlitz wrote:

> Guys (thanks Eric for the clarification over the other vxlan thread),
> with the latest networking code (e.g. 3.12 or net-next) do you expect a
> notable performance (throughput) difference between these two configs?
> 
> 1. bridge --> vxlan --> NIC
> 2. veth --> bridge --> vxlan --> NIC
> 
> BTW #2 doesn't work when packets start to be large unless I manually
> decrease the veth device pair MTU. E.g. if the NIC MTU is 1500, vxlan
> advertises an MTU of 1450 (= 1500 - (14 + 20 + 8 + 8)) and the bridge
> inherits that, but not the veth device. Should someone/somewhere here
> generate an ICMP packet which will cause the stack to decrease the
> path mtu for the neighbour created on the veth device? What about
> para-virtualized guests which are plugged into this (or any host based
> tunneling) scheme, e.g in this scheme
> 
> 3. guest virtio NIC --> vhost --> tap/macvtap --> bridge --> vxlan --> NIC
> 
> Who should make the guest NIC MTU/path MTU take the tunneling overhead
> into account, and how?

I mentioned this problem on another thread: GSO packets escape the
normal MTU checks in IP forwarding.

vi +91 net/ipv4/ip_forward.c

gso_size contains the payload size of each segment, excluding all headers.

Please try the following:

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index d68633452d9b..489b56935a56 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -388,4 +388,16 @@ static inline int fastopen_init_queue(struct sock *sk, int backlog)
 	return 0;
 }
 
+static inline unsigned int gso_size_with_headers(const struct sk_buff *skb)
+{
+	unsigned int hdrlen = skb_network_header_len(skb);
+
+	if (skb_shinfo(skb)->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6))
+		hdrlen += tcp_hdrlen(skb);
+	else
+		hdrlen += 8; /* sizeof(struct udphdr) */
+
+	return skb_shinfo(skb)->gso_size + hdrlen;
+}
+
 #endif	/* _LINUX_TCP_H */
diff --git a/net/ipv4/ip_forward.c b/net/ipv4/ip_forward.c
index 694de3b7aebf..3949cc1dd1ca 100644
--- a/net/ipv4/ip_forward.c
+++ b/net/ipv4/ip_forward.c
@@ -57,6 +57,7 @@ int ip_forward(struct sk_buff *skb)
 	struct iphdr *iph;	/* Our header */
 	struct rtable *rt;	/* Route we use */
 	struct ip_options *opt	= &(IPCB(skb)->opt);
+	unsigned int len;
 
 	if (skb_warn_if_lro(skb))
 		goto drop;
@@ -88,7 +89,11 @@ int ip_forward(struct sk_buff *skb)
 	if (opt->is_strictroute && rt->rt_uses_gateway)
 		goto sr_failed;
 
-	if (unlikely(skb->len > dst_mtu(&rt->dst) && !skb_is_gso(skb) &&
+	len = skb->len;
+	if (skb_is_gso(skb))
+		len = gso_size_with_headers(skb);
+
+	if (unlikely(len > dst_mtu(&rt->dst) &&
 		     (ip_hdr(skb)->frag_off & htons(IP_DF))) && !skb->local_df) {
 		IP_INC_STATS(dev_net(rt->dst.dev), IPSTATS_MIB_FRAGFAILS);
 		icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,


