Message-ID: <20141201135225.GA16814@casper.infradead.org>
Date: Mon, 1 Dec 2014 13:52:25 +0000
From: Thomas Graf <tgraf@...g.ch>
To: "Du, Fan" <fan.du@...el.com>
Cc: 'Jason Wang' <jasowang@...hat.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"fw@...len.de" <fw@...len.de>, dev@...nvswitch.org, mst@...hat.com,
jesse@...ira.com, pshelar@...ira.com
Subject: Re: [PATCH net] gso: do GSO for local skb with size bigger than MTU
On 11/30/14 at 10:08am, Du, Fan wrote:
> >-----Original Message-----
> >From: Jason Wang [mailto:jasowang@...hat.com]
> >Sent: Friday, November 28, 2014 3:02 PM
> >To: Du, Fan
> >Cc: netdev@...r.kernel.org; davem@...emloft.net; fw@...len.de; Du, Fan
> >Subject: Re: [PATCH net] gso: do GSO for local skb with size bigger than MTU
> >On Fri, Nov 28, 2014 at 2:33 PM, Fan Du <fan.du@...el.com> wrote:
> >> Test scenario: two KVM guests sitting in different hosts communicate
> >> to each other with a vxlan tunnel.
> >>
> >> All interfaces have the default 1500-byte MTU. From the guest's point
> >> of view, its skb gso_size can be as big as 1448 bytes. However, after
> >> the guest skb goes through vxlan encapsulation, individual segments of
> >> a GSO packet can exceed the physical NIC MTU of 1500 and will be
> >> dropped at the receiver side.
> >>
> >> So in a virtualized environment it's possible that the length of a
> >> locally created skb after encapsulation is bigger than the underlay
> >> MTU. In such a case, it's reasonable to do GSO first, then fragment
> >> any segment bigger than the MTU as needed.
> >>
> >> +---------------+ TX RX +---------------+
> >> | KVM Guest | -> ... -> | KVM Guest |
> >> +-+-----------+-+ +-+-----------+-+
> >> |Qemu/VirtIO| |Qemu/VirtIO|
> >> +-----------+ +-----------+
> >> | |
> >> v tap0 tap0 v
> >> +-----------+ +-----------+
> >> | ovs bridge| | ovs bridge|
> >> +-----------+ +-----------+
> >> | vxlan vxlan |
> >> v v
> >> +-----------+ +-----------+
> >> | NIC | <------> | NIC |
> >> +-----------+ +-----------+
> >>
> >> Steps to reproduce:
> >> 1. Use the kernel's built-in openvswitch module to set up the ovs bridge.
> >> 2. Run iperf without -M; communication gets stuck.
> >
> >Is this issue specific to ovs or ipv4? Path MTU discovery should help in this case I
> >believe.
>
> The problem here is that the host stack pushes the local over-sized GSO
> skb down to the NIC and performs GSO there, without any further IP
> fragmentation.
>
> The reasonable behavior is to do GSO first at the IP level; if a GSO-ed
> segment is bigger than the MTU and DF is set, then send an
> ICMP_DEST_UNREACH/ICMP_FRAG_NEEDED message back to the sender so it can
> adjust its MTU.
Aside from this, I think virtio should provide an MTU hint to the guest
so it can adjust the MTU of the vNIC, both to account for encapsulation
overhead and to support jumbo frames in the underlay transparently,
without relying on PMTU or MSS hints. I remember we talked about this a
while ago with at least Michael, but no actual code work has been done
on it yet.
> Making PMTU work is another issue I will try to address later on.
PMTU discovery was explicitly removed from the OVS datapath. Maybe
Pravin or Jesse can provide some background on that.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html