Message-ID: <CALttK1TWBeZWDwHoW9q6qkT6=XT4EmZM1ZbK3KtKSXR-ZcAFeA@mail.gmail.com>
Date: Wed, 14 Aug 2024 14:42:45 +0800
From: Duan Jiong <djduanjiong@...il.com>
To: Toke Høiland-Jørgensen <toke@...nel.org>
Cc: Willem de Bruijn <willemdebruijn.kernel@...il.com>, netdev@...r.kernel.org
Subject: Re: [PATCH v3] veth: Drop MTU check when forwarding packets

On Tue, Aug 13, 2024 at 7:40 PM Toke Høiland-Jørgensen <toke@...nel.org> wrote:
>
> Duan Jiong <djduanjiong@...il.com> writes:
>
> >
> > vm1(mtu 1600)---ovs---ipsec vpn1(mtu 1500)---ipsec vpn2(mtu 1500)---ovs---vm2(mtu 1600)
>
> Where's the veth device in this setup?
>

The veth device is used by the ipsec vpn containers to connect to ovs,
and traffic both before and after esp encapsulation goes through this NIC.
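
For reference, here is a rough userspace model of the check being
discussed, patterned on is_skb_forwardable() in include/linux/netdevice.h.
The names and constants are simplified from my reading of mainline, not
the actual kernel code:

#include <stdbool.h>
#include <stdio.h>

#define VLAN_HLEN 4 /* room for an optional 802.1Q tag, as in mainline */

struct net_device { unsigned int mtu; unsigned int hard_header_len; };
struct sk_buff    { unsigned int len; bool gso; };

static bool is_skb_forwardable(const struct net_device *dev,
                               const struct sk_buff *skb)
{
        /* GSO packets are resegmented later, so their size is not checked */
        if (skb->gso)
                return true;
        return skb->len <= dev->mtu + dev->hard_header_len + VLAN_HLEN;
}

int main(void)
{
        struct net_device veth = { .mtu = 1500, .hard_header_len = 14 };
        struct sk_buff from_vm = { .len = 1614, .gso = false }; /* 1600 + eth header */

        /* 1614 > 1500 + 14 + 4, so the frame is dropped before the
         * ipsec container ever sees it */
        printf("forwardable: %d\n", is_skb_forwardable(&veth, &from_vm));
        return 0;
}

With the numbers from the diagram above, a 1600-byte packet from the vm
fails this check on the 1500-mtu veth, which is the drop described here.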


> > My scenario is that two vms communicate via an ipsec vpn gateway;
> > the two vpn gateways are interconnected over the public network, and
> > each vpn gateway has only one NIC (single-arm mode). The vpn gateway
> > mtu will generally be 1500, but the packets the vms send to the vpn
> > gateway may be larger than 1500, and with the existing veth driver
> > those packets are discarded. If it is allowed to receive large
> > packets, the vpn gateway can actually accept them, esp-encapsulate
> > them, and then fragment the result, so in the end network
> > connectivity is not affected.
>
> I'm not sure I quite get the setup; it sounds like you want a subset of
> the traffic to adhere to one MTU, and another subset to adhere to a
> different MTU, on the same interface? Could you not divide the traffic
> over two different interfaces (with different MTUs) instead?
>

This is indeed a viable option, but it's not easy for us to change our
own implementation right now, so we're exploring whether it's feasible
to skip the veth mtu check.
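
As a sketch of what skipping the check could look like, the commit quoted
below (5f7d57280c19) made the MTU check a caller-controlled flag rather
than removing it outright. This is a simplified userspace model of that
pattern, paraphrased rather than the actual kernel source; a veth opt-out
could hypothetically take the same shape:

#include <stdbool.h>
#include <stdio.h>

struct net_device { unsigned int mtu; unsigned int hard_header_len; };
struct sk_buff    { unsigned int len; };

static bool is_skb_forwardable(const struct net_device *dev,
                               const struct sk_buff *skb)
{
        return skb->len <= dev->mtu + dev->hard_header_len + 4 /* VLAN_HLEN */;
}

/* -1 stands in for NET_RX_DROP */
static int forward_skb(struct net_device *dev, struct sk_buff *skb,
                       bool check_mtu)
{
        if (check_mtu && !is_skb_forwardable(dev, skb))
                return -1;
        /* the kernel scrubs the skb and queues it to the peer here */
        return 0;
}

int main(void)
{
        struct net_device veth = { .mtu = 1500, .hard_header_len = 14 };
        struct sk_buff big = { .len = 1614 };

        /* with check_mtu = false the oversized frame is forwarded */
        printf("%d %d\n", forward_skb(&veth, &big, true),
               forward_skb(&veth, &big, false));
        return 0;
}

In that commit the BPF redirect-to-ingress path passes false for the
flag; whether a similar escape hatch is acceptable for veth is exactly
the UAPI question raised below.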


> >> > Agreed that it has a risk, so some justification is in order. Similar
> >> > to how commit 5f7d57280c19 ("bpf: Drop MTU check when doing TC-BPF
> >> > redirect to ingress") addressed a specific need.
> >>
> >> Exactly :)
> >>
> >> And cf the above, using netkit may be an alternative that doesn't carry
> >> this risk (assuming that's compatible with the use case).
> >>
> >> -Toke
> >
> >
> > I can see how there could be a potential risk here; could we consider
> > adding a switchable option to control this behavior?
>
> Hmm, a toggle has its own cost in terms of complexity and overhead. Plus
> it's adding new UAPI. It may be that this is the least bad option in the
> end, but before going that route we should be very sure that there's not
> another way to solve your problem (cf the above).
>
> This has been discussed before, BTW, most recently five-and-some
> years ago:
>
> https://patchwork.ozlabs.org/project/netdev/patch/CAMJ5cBHZ4DqjE6Md-0apA8aaLLk9Hpiypfooo7ud-p9XyFyeng@mail.gmail.com/
>
> -Toke
