lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-ID: <CA+mtBx-Vz7=2zrVXNA=oZ1FaTuOzP10hbyxzzAgJWEr1GkCGjg@mail.gmail.com> Date: Wed, 8 Oct 2014 18:46:53 -0700 From: Tom Herbert <therbert@...gle.com> To: Jesse Gross <jesse@...ira.com> Cc: Or Gerlitz <gerlitz.or@...il.com>, Alexander Duyck <alexander.h.duyck@...el.com>, John Fastabend <john.r.fastabend@...el.com>, Jeff Kirsher <jeffrey.t.kirsher@...el.com>, David Miller <davem@...emloft.net>, Linux Netdev List <netdev@...r.kernel.org>, Thomas Graf <tgraf@...g.ch>, Pravin Shelar <pshelar@...ira.com>, Andy Zhou <azhou@...ira.com> Subject: Re: [PATCH] net: Add ndo_gso_check On Wed, Oct 8, 2014 at 5:30 PM, Jesse Gross <jesse@...ira.com> wrote: > On Mon, Oct 6, 2014 at 5:17 PM, Tom Herbert <therbert@...gle.com> wrote: >> On Mon, Oct 6, 2014 at 3:33 PM, Jesse Gross <jesse@...ira.com> wrote: >>> On Mon, Oct 6, 2014 at 10:59 AM, Tom Herbert <therbert@...gle.com> wrote: >>>> On Sun, Oct 5, 2014 at 12:13 PM, Or Gerlitz <gerlitz.or@...il.com> wrote: >>>>> On Sun, Oct 5, 2014 at 9:49 PM, Tom Herbert <therbert@...gle.com> wrote: >>>>>> On Sun, Oct 5, 2014 at 7:04 AM, Or Gerlitz <gerlitz.or@...il.com> wrote: >>>>>>> On Thu, Oct 2, 2014 at 2:06 AM, Tom Herbert <therbert@...gle.com> wrote: >>>>>>>> On Wed, Oct 1, 2014 at 1:58 PM, Or Gerlitz <gerlitz.or@...il.com> wrote: >>>>>>>>> On Tue, Sep 30, 2014 at 6:34 PM, Tom Herbert <therbert@...gle.com> wrote: >>>>>>>>>> On Tue, Sep 30, 2014 at 7:30 AM, Or Gerlitz <gerlitz.or@...il.com> wrote: >>>>>>> [...] >>>>>>>> Solution #4: apply this patch and implement the check functions as >>>>>>>> needed in those 4 or 5 drivers. If a device can only do VXLAN/NVGRE >>>>>>>> then I believe the check function is something like: >>>>>>>> >>>>>>>> bool mydev_gso_check(struct sk_buff *skb, struct net_device *dev) >>>>>>>> { >>>>>>>> if ((skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL) && >>>>>>>> ((skb->inner_protocol_type != ENCAP_TYPE_ETHER || >>>>>>>> skb->protocol != htons(ETH_P_TEB) || >>>>>>>> skb_inner_mac_header(skb) - skb_transport_header(skb) != 12) >>>>>>>> return false; >>>>>>>> >>>>>>>> return true; >>>>>>>> } >>>>>>> >>>>>>> Yep, such helper can can be basically made to work and let the 4-5 >>>>>>> drivers that can >>>>>>> do GSO offloading for vxlan but not for any FOU/GUE packets signal >>>>>>> that to the stack. >>>>>>> >>>>>>> Re the 12 constant, you were referring to the udp+vxlan headers? it's 8+8 >>>>>>> >>>>>>> Also, we need a way for drivers that can support VXLAN or NVGRE but >>>>>>> not concurrently >>>>>>> on the same port @ the same time to only let vxlan packet to pass >>>>>>> successfully through the helper. >>>>> >>>>>> Or, there should be no difference in GSO processing between VXLAN and >>>>>> NVGRE. Can you explain why you feel you need to differentiate them for GSO? >>>>> >>>>> >>>>> RX wise, Linux tells the driver that UDP port X would be used for >>>>> VXLAN, right? and indeed, it's possible for some HW implementations >>>>> not to support RX offloading (checksum) for both VXLAN and NVGRE @ the >>>>> same time over the same port. But TX/GRO wise, you're probably >>>>> correct. The thing is that from the user POV they need solution that >>>>> works for both RX and TX offloading. >>>> >>>> I think from a user POV we want a solution that supports RX and TX >>>> offloading across the widest range of protocols. This is accomplished >>>> by implementing protocol agnostic mechanisms like CHECKSUM_COMPLETE >>>> and protocol agnostic UDP tunnel TSO like we've described. IMO, the >>>> fact that we have devices that implement protocol specific mechanisms >>>> for NVGRE and VXLAN should be considered legacy support in the stack, >>>> for new UDP encapsulation protocols we should not expose specifics in >>>> the stack in either by adding a GSO type for each protocol, nor >>>> ndo_add_foo_port for each protocol-- these things will not scale and >>>> unnecessarily complicate the core stack. >>> >>> It's not clear to me that allowing devices to know what protocols are >>> running on what ports actually complicates the stack. The part that is >>> complicated is usually the types of operations that are being >>> offloaded (checksum, TSO, etc.). In all of these tunnel cases, the >>> operations are same and if you have a clean registration mechanism >>> then nothing in the core has to see this - only the protocol doing the >>> registering and the driver that is supporting it. >>> >> >> We already have an ntuple filtering interface that allows configuring >> a device for special processing of RX packets. I don't see why that >> shouldn't apply to the use case protocol processing for specific ports >> in the encapsulation use case. > > You mentioned this before but I guess I don't really understand it. I > suppose it is possible to express the port number and encapsulation as > a filter but it doesn't really seem all that natural and at the end of > the day it won't be mapped to a filter in the NIC. Can you explain it > some more? > With n-tuple filters you should be able to configure a rule to match packets (say by port) and assign an "action" which is understood by the driver (say assume packets are VXLAN and return CHECKSUM_UNNECESSARY). The interface is to the driver, so how this is actually instantiated on the device is a private matter. This is far more flexible and extensible model than trying to have the stack do port->protocol registration. We can filter on much than just destination port (this is actually a problem in UDP offloads which only works with binding socket to INADDR_ANY). Also, this model can be applied to many different scenarios not just encapsulation or those protocols that are implemented by the kernel. I imagine someone will want to do QUIC acceleration/steering in a device at some point, this work even if the kernel doesn't implement any part of the protocol (a design point of QUIC ;-) ). It would be interesting to see how the super charged programmable protocol parsers Alexei described might be integrated with RX filtering. >>> I have no disagreement with trying to be generic across protocols. I'm >>> just not convinced that it is a realistic plan. It's obvious that it >>> is not doable today nor will be it be in the next generation of NICs >>> (which are guaranteed to add support for new protocols). Furthermore, >>> there will be more advanced stuff coming in the future that I think >>> will be difficult or impossible to make protocol agnostic. Rather than >>> pretending that this doesn't exist or will never happen, it's better >>> focus on how to integrating it cleanly. >> >> Sorry, but I don't understand how supporting a new protocols in a >> device for the purposes of returning CHECKSUM_UNNECESSARY is better or >> easier to implement than just returning CHECKSUM_COMPLETE. Same thing >> for trying to use NETIF_F_IP_CSUM with encapsulation rather than >> NETIF_F_HW_CSUM. I'm not a hardware guy, so it's possible I'm missing >> something obvious... >> >> Can you be more specific about this "advanced stuff"? > > I think checksums are really the exception, not the rule. It's great > that they have this nice property of being additive and we should use > that where we can but that doesn't apply to other types of operation > (or even other types of checksums). Encryption or CRC32 carried inside > the tunnel header can't be accelerated without some additional > knowledge of the protocol. I think there were also a few other things > that came up along these lines when we talked about this in an earlier > thread - that's what I mean by "advanced stuff". > > For the basic one's complement checksums, I have no objection to > CHECKSUM_COMPLETE. However, the reality is that this is not generally > implemented today and that is unlikely to change for a few years even > in the best case. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists