netdev - Re: [PATCH net-next 1/3] udp_tunnel: allow to turn off path mtu discovery on encap sockets

Message-ID: <20200713175709.2a547d7c@redhat.com>
Date:   Mon, 13 Jul 2020 17:57:09 +0200
From:   Stefano Brivio <sbrivio@...hat.com>
To:     Florian Westphal <fw@...len.de>
Cc:     David Ahern <dsahern@...il.com>, netdev@...r.kernel.org,
        aconole@...hat.com
Subject: Re: [PATCH net-next 1/3] udp_tunnel: allow to turn off path mtu
 discovery on encap sockets

On Mon, 13 Jul 2020 16:59:11 +0200
Florian Westphal <fw@...len.de> wrote:

> It's configured properly:
> 
> ovs bridge mtu: 1450
> vxlan device mtu: 1450
> physical link: 1500

Okay, so my proposal to reflect the discovered PMTU on the MTU of the
VXLAN device won't help in your case.

In the test case I drafted, configuring bridge and VXLAN with those
MTUs (by means of PMTU discovery) is enough for the sender to adjust
packet size and MTU-sized packets go through. I guess the OVS case is
not equivalent to it, then.

> so, packets coming in on the bridge (local tx or from remote bridge port)
> can have the encap header (50 bytes) prepended without exceeding the
> physical link mtu.
> 
> When the vxlan driver calls the ip output path, this line:
> 
>         mtu = ip_skb_dst_mtu(sk, skb);
> 
> in __ip_finish_output() will fetch the MTU based on the encap socket,
> which will now be 1450 due to that route exception.
> 
> So this will behave as if someone had lowered the physical link mtu to 1450:
> IP stack drops the packet and sends an icmp error (fragmentation needed,
> MTU 1450).  The MTU of the VXLAN port is already at 1450.
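
Spelling out the numbers in that description as a standalone sketch
(not kernel code: the constants are taken from this thread, the 50-byte
breakdown assumes VXLAN over IPv4, and outer_len() and fits() are
made-up helpers for illustration only):

```c
#include <stdbool.h>

/* Illustrative values from this thread, not kernel definitions. */
enum {
	VXLAN_DEV_MTU  = 1450,	/* MTU of the OVS bridge and VXLAN device */
	ENCAP_OVERHEAD = 50,	/* assumed: outer IPv4 20 + UDP 8 + VXLAN 8 + inner Ethernet 14 */
	PHYS_LINK_MTU  = 1500,	/* MTU of the physical link */
	EXCEPTION_PMTU = 1450,	/* PMTU recorded in the route exception */
};

/* Hypothetical helper: size of the outer IP packet after encapsulation. */
static unsigned int outer_len(unsigned int inner_len, unsigned int overhead)
{
	return inner_len + overhead;
}

/* Hypothetical helper: does a packet of this length fit the given MTU? */
static bool fits(unsigned int len, unsigned int mtu)
{
	return len <= mtu;
}
```

With these values, outer_len(VXLAN_DEV_MTU, ENCAP_OVERHEAD) is 1500, so
the encapsulated packet fits PHYS_LINK_MTU but not EXCEPTION_PMTU, which
is the situation where the IP stack drops it and reports MTU 1450.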

It's not clear to me why the behaviour on this path is different from
routed traffic. I understand the impact of bridged traffic on error
reporting, but not here.

Does it have something to do with metadata-based tunnels? Should we omit
the call to skb_tunnel_check_pmtu() in vxlan_xmit_one() in that case
(if (info)) because the dst is not the same dst?

> [...]
>
> I don't think this patch is enough to resolve PMTU in general of course,
> after all the VXLAN peer might be unable to receive packets larger than
> what the ICMP error announces.  But I do not know how to resolve this
> in the general case as everyone has a different opinion on how (and where)
> this needs to be handled.

The sender here is sending packets matching the MTU, and interface MTUs
are correct, so we wouldn't benefit from "extending" PMTU discovery for
this specific problem and we can leave that topic aside for now, correct?

-- 
Stefano