netdev - Re: [PATCH net-next] veth: extend features to support tunneling

Message-ID: <CAMEtUuzkwVSetLKb0AK7U=iB5ArhAhs-0vx9cX+WeZ2Fz04kqg@mail.gmail.com>
Date:	Sat, 16 Nov 2013 23:31:08 -0800
From:	Alexei Starovoitov <ast@...mgrid.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Or Gerlitz <or.gerlitz@...il.com>,
	David Miller <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Stephen Hemminger <stephen@...workplumber.org>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"Michael S. Tsirkin" <mst@...hat.com>,
	John Fastabend <john.r.fastabend@...el.com>
Subject: Re: [PATCH net-next] veth: extend features to support tunneling

On Sat, Nov 16, 2013 at 1:40 PM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> On Sat, 2013-11-16 at 23:11 +0200, Or Gerlitz wrote:
>
>> Guys (thanks Eric for the clarification over the other vxlan thread),
>> with the latest networking code (e.g 3.12 or net-next)  do you expect
>> notable performance (throughput) difference between these two configs?
>>
>> 1. bridge --> vxlan --> NIC
>> 2. veth --> bridge --> vxlan --> NIC
>>
>> BTW #2 doesn't work when packets start to be large unless I manually
>> decrease the veth device pair MTU. E.g if the NIC MTU is 1500, vxlan
>> advertises an MTU of 1450 (= 1500 - (14 + 20 + 8 + 8)) and the bridge
>> inherits that, but not the veth device. Should someone/somewhere here
>> generate an ICMP packet which will cause the stack to decrease the
>> path mtu for the neighbour created on the veth device? what about
>> para-virtualized guests which are plugged into this (or any host based
>> tunneling) scheme, e.g in this scheme
>>
>> 3. guest virtio NIC --> vhost  --> tap/macvtap --> bridge --> vxlan --> NIC
>>
>> Who/how do we want the guest NIC mtu/path mtu to take into account the
>> tunneling over-head?
>
> I mentioned this problem on another thread : gso packets escape the
> normal mtu checks in ip forwarding.
>
> vi +91 net/ipv4/ip_forward.c
>
> gso_size contains the size of the segment minus all headers.

When VMs send gso packets over tap and a tunnel in the host,
ip_forward is not in the picture.

When the host mtu doesn't account for the tunnel overhead, a neat trick
we can do is to decrease gso_size while adding the tunnel header.
That way, when skb_gso_segment() kicks in during tx, the packets are
segmented into host-mtu-sized packets.
The receiving vm on the other side will see packets of size
guest_mtu - tunnel_header_size,
but imo that's much better than sending ip fragments over the vxlan fabric.
It will work for guests sending tcp/udp, but there is no good solution
for icmp other than ip frags.
This trick should work for hw-offloaded vxlan, but we have yet to
experiment with it on such a nic.
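
To make the arithmetic concrete, here is a minimal userspace sketch of
the gso_size adjustment, not kernel code. The helper names
(shrink_gso_size, outer_ip_len, gso_segs) are made up for illustration,
and the 50-byte overhead is the 14 + 20 + 8 + 8 from Or's mail, assuming
vxlan over IPv4 and inner IPv4/TCP headers without options (40 bytes).

```c
#include <assert.h>

/* Tunnel overhead as counted earlier in the thread:
 * 14 (Ethernet) + 20 (IPv4) + 8 (UDP) + 8 (VXLAN) = 50 bytes. */
#define VXLAN_OVERHEAD (14 + 20 + 8 + 8)

/* Hypothetical helper: reduce gso_size by the tunnel header size at
 * encapsulation time, so later segmentation yields packets that still
 * fit the host MTU once the tunnel headers are added. */
unsigned int shrink_gso_size(unsigned int gso_size, unsigned int tunnel_hlen)
{
	assert(gso_size > tunnel_hlen);
	return gso_size - tunnel_hlen;
}

/* Size of one encapsulated segment as seen by the host NIC's IP MTU:
 * payload + inner L3/L4 headers + tunnel overhead. */
unsigned int outer_ip_len(unsigned int gso_size, unsigned int inner_hlen,
			  unsigned int tunnel_hlen)
{
	return gso_size + inner_hlen + tunnel_hlen;
}

/* Number of segments produced for a given payload length (rounded up). */
unsigned int gso_segs(unsigned int payload_len, unsigned int gso_size)
{
	assert(gso_size > 0);
	return (payload_len + gso_size - 1) / gso_size;
}
```

With a guest mtu of 1500 the guest's gso_size is 1460. Left alone, each
encapsulated segment would be 1460 + 40 + 50 = 1550 bytes and would not
fit a 1500-byte host mtu. Shrinking gso_size to 1410 brings it to
exactly 1500, at the cost of a few more segments per gso packet.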