netdev - Re: [PATCH] net: Add ndo_gso

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEP_g=_sPBCeSO3nQfy-HJ68HqCgRU+CcA_09bKDL9fgehtveA@mail.gmail.com>
Date:	Wed, 8 Oct 2014 17:30:21 -0700
From:	Jesse Gross <jesse@...ira.com>
To:	Tom Herbert <therbert@...gle.com>
Cc:	Or Gerlitz <gerlitz.or@...il.com>,
	Alexander Duyck <alexander.h.duyck@...el.com>,
	John Fastabend <john.r.fastabend@...el.com>,
	Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
	David Miller <davem@...emloft.net>,
	Linux Netdev List <netdev@...r.kernel.org>,
	Thomas Graf <tgraf@...g.ch>,
	Pravin Shelar <pshelar@...ira.com>,
	Andy Zhou <azhou@...ira.com>
Subject: Re: [PATCH] net: Add ndo_gso_check

On Mon, Oct 6, 2014 at 5:17 PM, Tom Herbert <therbert@...gle.com> wrote:
> On Mon, Oct 6, 2014 at 3:33 PM, Jesse Gross <jesse@...ira.com> wrote:
>> On Mon, Oct 6, 2014 at 10:59 AM, Tom Herbert <therbert@...gle.com> wrote:
>>> On Sun, Oct 5, 2014 at 12:13 PM, Or Gerlitz <gerlitz.or@...il.com> wrote:
>>>> On Sun, Oct 5, 2014 at 9:49 PM, Tom Herbert <therbert@...gle.com> wrote:
>>>>> On Sun, Oct 5, 2014 at 7:04 AM, Or Gerlitz <gerlitz.or@...il.com> wrote:
>>>>>> On Thu, Oct 2, 2014 at 2:06 AM, Tom Herbert <therbert@...gle.com> wrote:
>>>>>>> On Wed, Oct 1, 2014 at 1:58 PM, Or Gerlitz <gerlitz.or@...il.com> wrote:
>>>>>>>> On Tue, Sep 30, 2014 at 6:34 PM, Tom Herbert <therbert@...gle.com> wrote:
>>>>>>>>> On Tue, Sep 30, 2014 at 7:30 AM, Or Gerlitz <gerlitz.or@...il.com> wrote:
>>>>>> [...]
>>>>>>> Solution #4: apply this patch and implement the check functions as
>>>>>>> needed in those 4 or 5 drivers. If a device can only do VXLAN/NVGRE
>>>>>>> then I believe the check function is something like:
>>>>>>>
>>>>>>> bool mydev_gso_check(struct sk_buff *skb, struct net_device *dev)
>>>>>>> {
>>>>>>>         if ((skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL) &&
>>>>>>>             ((skb->inner_protocol_type != ENCAP_TYPE_ETHER ||
>>>>>>>               skb->protocol != htons(ETH_P_TEB) ||
>>>>>>>               skb_inner_mac_header(skb) - skb_transport_header(skb) != 12)
>>>>>>>                 return false;
>>>>>>>
>>>>>>>         return true;
>>>>>>> }
>>>>>>
>>>>>> Yep, such helper can can be basically made to work and let the 4-5
>>>>>> drivers that can
>>>>>> do GSO offloading for vxlan but not for any FOU/GUE packets signal
>>>>>> that to the stack.
>>>>>>
>>>>>> Re the 12 constant, you were referring to the udp+vxlan headers? it's 8+8
>>>>>>
>>>>>> Also, we need a way for drivers that can support VXLAN or NVGRE but
>>>>>> not concurrently
>>>>>> on the same port @ the same time to only let vxlan packet to pass
>>>>>> successfully through the helper.
>>>>
>>>>> Or, there should be no difference in GSO processing between VXLAN and
>>>>> NVGRE. Can you explain why you feel you need to differentiate them for GSO?
>>>>
>>>>
>>>> RX wise, Linux tells the driver that UDP port X would be used for
>>>> VXLAN, right? and indeed, it's possible for some HW implementations
>>>> not to support RX offloading (checksum) for both VXLAN and NVGRE @ the
>>>> same time over the same port. But TX/GRO wise, you're probably
>>>> correct. The thing is that from the user POV they need solution that
>>>> works for both RX and TX offloading.
>>>
>>> I think from a user POV we want a solution that supports RX and TX
>>> offloading across the widest range of protocols. This is accomplished
>>> by implementing protocol agnostic mechanisms like CHECKSUM_COMPLETE
>>> and protocol agnostic UDP tunnel TSO like we've described. IMO, the
>>> fact that we have devices that implement protocol specific mechanisms
>>> for NVGRE and VXLAN should be considered legacy support in the stack,
>>> for new UDP encapsulation protocols we should not expose specifics in
>>> the stack in either by adding a GSO type for each protocol, nor
>>> ndo_add_foo_port for each protocol-- these things will not scale and
>>> unnecessarily complicate the core stack.
>>
>> It's not clear to me that allowing devices to know what protocols are
>> running on what ports actually complicates the stack. The part that is
>> complicated is usually the types of operations that are being
>> offloaded (checksum, TSO, etc.). In all of these tunnel cases, the
>> operations are same and if you have a clean registration mechanism
>> then nothing in the core has to see this - only the protocol doing the
>> registering and the driver that is supporting it.
>>
>
> We already have an ntuple filtering interface that allows configuring
> a device for special processing of RX packets. I don't see why that
> shouldn't apply to the use case protocol processing for specific ports
> in the encapsulation use case.

You mentioned this before but I guess I don't really understand it. I
suppose it is possible to express the port number and encapsulation as
a filter but it doesn't really seem all that natural and at the end of
the day it won't be mapped to a filter in the NIC. Can you explain it
some more?

>> I have no disagreement with trying to be generic across protocols. I'm
>> just not convinced that it is a realistic plan. It's obvious that it
>> is not doable today nor will be it be in the next generation of NICs
>> (which are guaranteed to add support for new protocols). Furthermore,
>> there will be more advanced stuff coming in the future that I think
>> will be difficult or impossible to make protocol agnostic. Rather than
>> pretending that this doesn't exist or will never happen, it's better
>> focus on how to integrating it cleanly.
>
> Sorry, but I don't understand how supporting a new protocols in a
> device for the purposes of returning CHECKSUM_UNNECESSARY is better or
> easier to implement than just returning CHECKSUM_COMPLETE. Same thing
> for trying to use NETIF_F_IP_CSUM with encapsulation rather than
> NETIF_F_HW_CSUM. I'm not a hardware guy, so it's possible I'm missing
> something obvious...
>
> Can you be more specific about this "advanced stuff"?

I think checksums are really the exception, not the rule. It's great
that they have this nice property of being additive and we should use
that where we can but that doesn't apply to other types of operation
(or even other types of checksums). Encryption or CRC32 carried inside
the tunnel header can't be accelerated without some additional
knowledge of the protocol. I think there were also a few other things
that came up along these lines when we talked about this in an earlier
thread - that's what I mean by "advanced stuff".

For the basic one's complement checksums, I have no objection to
CHECKSUM_COMPLETE. However, the reality is that this is not generally
implemented today and that is unlikely to change for a few years even
in the best case.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html