Message-ID: <5A90DA2E42F8AE43BC4A093BF0678848DED92B@SHSMSX104.ccr.corp.intel.com>
Date: Sun, 30 Nov 2014 10:08:32 +0000
From: "Du, Fan" <fan.du@...el.com>
To: 'Jason Wang' <jasowang@...hat.com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"fw@...len.de" <fw@...len.de>, "Du, Fan" <fan.du@...el.com>
Subject: RE: [PATCH net] gso: do GSO for local skb with size bigger than MTU
>-----Original Message-----
>From: Jason Wang [mailto:jasowang@...hat.com]
>Sent: Friday, November 28, 2014 3:02 PM
>To: Du, Fan
>Cc: netdev@...r.kernel.org; davem@...emloft.net; fw@...len.de; Du, Fan
>Subject: Re: [PATCH net] gso: do GSO for local skb with size bigger than MTU
>
>
>
>On Fri, Nov 28, 2014 at 2:33 PM, Fan Du <fan.du@...el.com> wrote:
>> Test scenario: two KVM guests sitting on different hosts communicate
>> with each other over a vxlan tunnel.
>>
>> All interface MTUs are the default 1500 bytes. From the guest's point of
>> view, its skb gso_size can be as big as 1448 bytes; however, after the
>> guest skb goes through vxlan encapsulation, the individual segments of a
>> GSO packet can exceed the physical NIC MTU of 1500 and will be lost at
>> the receiver side.
>>
>> So in a virtualized environment it is possible for the length of a
>> locally created skb, after encapsulation, to be bigger than the
>> underlying MTU. In such a case, it's reasonable to do GSO first, then
>> fragment any packet bigger than the MTU where possible.
>>
>> +---------------+    TX    RX    +---------------+
>> |   KVM Guest   |   -> ... ->    |   KVM Guest   |
>> +-+-----------+-+                +-+-----------+-+
>>   |Qemu/VirtIO|                    |Qemu/VirtIO|
>>   +-----------+                    +-----------+
>>         |                                |
>>         v tap0                      tap0 v
>>   +-----------+                    +-----------+
>>   | ovs bridge|                    | ovs bridge|
>>   +-----------+                    +-----------+
>>         | vxlan                    vxlan |
>>         v                                v
>>   +-----------+                    +-----------+
>>   |    NIC    |      <------>      |    NIC    |
>>   +-----------+                    +-----------+
>>
>> Steps to reproduce:
>> 1. Use the kernel's built-in openvswitch module to set up the ovs bridge.
>> 2. Run iperf without -M; communication gets stuck.
>
>Is this issue specific to ovs or ipv4? Path MTU discovery should help in this case I
>believe.
The problem here is that the host stack pushes a locally created, over-sized GSO skb
down to the NIC and performs GSO there, without any further IP segmentation.
The reasonable behavior is to do GSO first at the IP level; if a GSO-ed skb is bigger
than the MTU and DF is set, then push an ICMP_DEST_UNREACH/ICMP_FRAG_NEEDED message
back to the sender so that it can adjust its MTU.
Getting PMTU to work is another issue, which I will try to address later on.
>>
>>
>> Signed-off-by: Fan Du <fan.du@...el.com>
>> ---
>> net/ipv4/ip_output.c | 7 ++++---
>> 1 files changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
>> index bc6471d..558b5f8 100644
>> --- a/net/ipv4/ip_output.c
>> +++ b/net/ipv4/ip_output.c
>> @@ -217,9 +217,10 @@ static int ip_finish_output_gso(struct sk_buff *skb)
>> struct sk_buff *segs;
>> int ret = 0;
>>
>> - /* common case: locally created skb or seglen is <= mtu */
>> - if (((IPCB(skb)->flags & IPSKB_FORWARDED) == 0) ||
>> - skb_gso_network_seglen(skb) <= ip_skb_dst_mtu(skb))
>> + /* Both locally created skb and forwarded skb could exceed
>> + * MTU size, so make a unified rule for them all.
>> + */
>> + if (skb_gso_network_seglen(skb) <= ip_skb_dst_mtu(skb))
>> return ip_finish_output2(skb);
>>
>> /* Slowpath - GSO segment length is exceeding the dst MTU.
>> --
>> 1.7.1