Message-ID: <CAKgT0UcaapGYAwdQ7udr0qN=r+8d_ii6MZw2uQy6Ct+tMtvfpA@mail.gmail.com>
Date: Fri, 20 Apr 2018 10:38:45 -0700
From: Alexander Duyck <alexander.duyck@...il.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: David Miller <davem@...emloft.net>,
Sowmini Varadhan <sowmini.varadhan@...cle.com>,
"Samudrala, Sridhar" <sridhar.samudrala@...el.com>,
Netdev <netdev@...r.kernel.org>,
Willem de Bruijn <willemb@...gle.com>
Subject: Re: [PATCH RFC net-next 00/11] udp gso
On Wed, Apr 18, 2018 at 11:22 AM, Willem de Bruijn
<willemdebruijn.kernel@...il.com> wrote:
> On Wed, Apr 18, 2018 at 2:12 PM, Alexander Duyck
> <alexander.duyck@...il.com> wrote:
>> On Wed, Apr 18, 2018 at 10:28 AM, David Miller <davem@...emloft.net> wrote:
>>> From: Sowmini Varadhan <sowmini.varadhan@...cle.com>
>>> Date: Wed, 18 Apr 2018 08:31:03 -0400
>>>
>>>> However, I share Sridhar's concerns about the very fundamental change
>>>> to UDP message boundary semantics here. There is actually no such thing
>>>> as a "segment" in udp, so in general this feature makes me a little
>>>> uneasy. Well behaved udp applications should already be sending mtu
>>>> sized datagrams. And the not-so-well-behaved ones are probably relying
>>>> on IP fragmentation/reassembly to take care of datagram boundary semantics
>>>> for them?
>>>>
>>>> As Sridhar points out, the feature is not really "negotiated" - one side
>>>> unilaterally sets the option. If the receiver is a classic/POSIX UDP
>>>> implementation, it will have no way of knowing that message boundaries
>>>> have been re-adjusted at the sender.
>>>
>>> There are no "semantics".
>>>
>>> What ends up on the wire is the same before the kernel/app changes as
>>> afterwards.
>>>
>>> The only difference is that instead of the application doing N - 1
>>> sendmsg() calls with mtu sized writes, it's giving everything all at
>>> once and asking the kernel to segment.
>>>
>>> It even gives the application control over the size of the packets,
>>> which I think is completely prudent in this situation.
>>
>> My only concern with the patch set is verifying what mitigations are
>> in place so that we aren't trying to set an MSS size that results in a
>> frame larger than the MTU. I'm still digging through the code and
>> trying to grok it, but I figured I might just put the question out
>> there to make my reviewing easier.
>
> This is checked in udp_send_skb in
>
>     const int hlen = skb_network_header_len(skb) +
>                      sizeof(struct udphdr);
>
>     if (hlen + cork->gso_size > cork->fragsize)
>             return -EINVAL;
>
> At this point cork->fragsize will have been set in ip_setup_cork
> based on device or path mtu:
>
>     cork->fragsize = ip_sk_use_pmtu(sk) ?
>             dst_mtu(&rt->dst) : rt->dst.dev->mtu;
>
> The feature bypasses the MTU sanity checks in __ip_append_data
> by setting local variable mtu to a network layer max
>
>     mtu = cork->gso_size ? IP_MAX_MTU : cork->fragsize;
>
> but the above should perform the same check, only later. The
> check in udp_send_skb is simpler as it does not have to deal
> with the case of fragmentation.
>
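For illustration, here is a minimal userspace sketch (not taken from the
patches) of how a sender would stay under that check, assuming the
UDP_SEGMENT socket option from this series; the helper name is mine and
the option value is guarded in case libc already defines it:

#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/udp.h>

#ifndef UDP_SEGMENT
#define UDP_SEGMENT 103         /* per the series' uapi header */
#endif

/* Ask the kernel to segment each sendmsg() payload into gso_size
 * chunks.  Leaving room for a 20 byte IPv4 header (no options) and an
 * 8 byte UDP header keeps hlen + gso_size <= the device/path MTU, so
 * udp_send_skb() does not bail out with -EINVAL.
 */
static int enable_udp_gso(int fd, int mtu)
{
        int gso_size = mtu - 20 - 8;

        return setsockopt(fd, IPPROTO_UDP, UDP_SEGMENT,
                          &gso_size, sizeof(gso_size));
}

A single large write can then be handed to the kernel in one sendmsg()
call and cut into MTU-sized datagrams on the way down the stack.
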
>> Also any plans for HW offload support for this? I vaguely recall that
>> the igb and ixgbe parts had support for something like this in
>> hardware. I would have to double check to see what exactly is
>> supported.
>
> I hadn't given that much thought until the request yesterday to
> expose the NETIF_F_GSO_UDP_L4 flag through ethtool. By
> virtue of having only a single fixed segmentation length, it
> appears reasonably straightforward to offload.
Actually, I just got a chance to start reviewing things. Do we really
need to use both GSO_UDP and GSO_UDP_L4? It might be better if we could
split these up and specifically call out GSO_UDP as UFO and GSO_UDP_L4
as UDP segmentation.
My only concern is that I don't think devices would want to have to
advertise both GSO_UDP and GSO_UDP_L4 if they only want to support UDP
segmentation. Ideally we want to keep UFO separate from UDP
segmentation since they are two different things.
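Roughly what I am picturing, as a hypothetical driver snippet (not from
any in-tree driver), is that a NIC with hardware UDP segmentation would
only need to set the new bit:

#include <linux/netdevice.h>

/* Hypothetical probe-time helper: advertise hardware UDP segmentation
 * (NETIF_F_GSO_UDP_L4) without also having to claim the old UFO-style
 * NETIF_F_GSO_UDP bit.
 */
static void example_advertise_udp_gso(struct net_device *netdev)
{
        netdev->hw_features |= NETIF_F_GSO_UDP_L4;
        netdev->features    |= NETIF_F_GSO_UDP_L4;
}
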
Thanks.
- Alex