[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAF=yD-+V7B67NbL8aELa9QR0Hx8bAvKNQ=JfjkwGGFtey-2FOw@mail.gmail.com>
Date: Wed, 18 Apr 2018 14:22:35 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Alexander Duyck <alexander.duyck@...il.com>
Cc: David Miller <davem@...emloft.net>,
Sowmini Varadhan <sowmini.varadhan@...cle.com>,
"Samudrala, Sridhar" <sridhar.samudrala@...el.com>,
Netdev <netdev@...r.kernel.org>,
Willem de Bruijn <willemb@...gle.com>
Subject: Re: [PATCH RFC net-next 00/11] udp gso
On Wed, Apr 18, 2018 at 2:12 PM, Alexander Duyck
<alexander.duyck@...il.com> wrote:
> On Wed, Apr 18, 2018 at 10:28 AM, David Miller <davem@...emloft.net> wrote:
>> From: Sowmini Varadhan <sowmini.varadhan@...cle.com>
>> Date: Wed, 18 Apr 2018 08:31:03 -0400
>>
>>> However, I share Sridhar's concerns about the very fundamental change
>>> to UDP message boundary semantics here. There is actually no such thing
>>> as a "segment" in udp, so in general this feature makes me a little
>>> uneasy. Well behaved udp applications should already be sending mtu
>>> sized datagrams. And the not-so-well-behaved ones are probably relying
>>> on IP fragmentation/reassembly to take care of datagram boundary semantics
>>> for them?
>>>
>>> As Sridhar points out, the feature is not really "negotiated" - one side
>>> unilaterally sets the option. If the receiver is a classic/POSIX UDP
>>> implementation, it will have no way of knowing that message boundaries
>>> have been re-adjusted at the sender.
>>
>> There are no "semantics".
>>
>> What ends up on the wire is the same before the kernel/app changes as
>> afterwards.
>>
>> The only difference is that instead of the application doing N - 1
>> sendmsg() calls with mtu sized writes, it's giving everything all at
>> once and asking the kernel to segment.
>>
>> It even gives the application control over the size of the packets,
>> which I think is completely prudent in this situation.
>
> My only concern with the patch set is verifying what mitigations are
> in case so that we aren't trying to set an MSS size that results in a
> frame larger than MTU. I'm still digging through the code and trying
> to grok it, but I figured I might just put the question out there to
> may my reviewing easier.
This is checked in udp_send_skb in
const int hlen = skb_network_header_len(skb) +
sizeof(struct udphdr);
if (hlen + cork->gso_size > cork->fragsize)
return -EINVAL;
At this point cork->fragsize will have been set in ip_setup_cork
based on device or path mtu:
cork->fragsize = ip_sk_use_pmtu(sk) ?
dst_mtu(&rt->dst) : rt->dst.dev->mtu;
The feature bypasses the MTU sanity checks in __ip_append_data
by setting local variable mtu to a network layer max
mtu = cork->gso_size ? IP_MAX_MTU : cork->fragsize;
but the above should perform the same check, only later. The
check in udp_send_skb is simpler as it does not have to deal
with the case of fragmentation.
> Also any plans for HW offload support for this? I vaguely recall that
> the igb and ixgbe parts had support for something like this in
> hardware. I would have to double check to see what exactly is
> supported.
I hadn't given that much thought until the request yesterday to
expose the NETIF_F_GSO_UDP_L4 flag through ethtool. By
virtue of having only a single fixed segmentation length, it
appears reasonably straightforward to offload.
Powered by blists - more mailing lists