lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Mon, 4 Aug 2014 16:13:16 -0700 From: Tom Herbert <therbert@...gle.com> To: Andy Zhou <azhou@...ira.com> Cc: Jesse Gross <jesse@...ira.com>, David Miller <davem@...emloft.net>, Linux Netdev List <netdev@...r.kernel.org> Subject: Re: [net-next 1/5] net: Add rx-vxlan feature bit On Mon, Aug 4, 2014 at 3:00 PM, Andy Zhou <azhou@...ira.com> wrote: > Hi, Tom, > > Do you envision those configuration can be applied dynamically, as the > ports are added and removed under programming API control, for > example, openFlow? If yes, what would be appropriate API/protocol? > I think it's more generic to use an out of band mechanism to this to cover all use cases. For instance, if I was implementing a simple router there might not even be a kernel socket or even any socket opened for the port, but I might still want the advanced steering capabilities for certain ports. Actually, I'm wondering if this can just be part of rx-filtering, that already allows specifying rules by l4proto and dst_port and action could be an indication to device of which special processing to apply. > Given that we have not reached consensus on the value and approach of > receive side offloading, I plan to withdraw this patch from this patch > series so that we > can concentrate on common UDP tunnel functions, with vxlan and L2tp > re-factoring so that we minimize code duplication when introducing > Geneve support. > Thanks for considering both vxlan and l2tp in these patches! > I will send out the next rev this series in a few days. > > Thanks, > > Andy > > On Sat, Aug 2, 2014 at 12:21 PM, Tom Herbert <therbert@...gle.com> wrote: >> On Sat, Aug 2, 2014 at 11:16 AM, Jesse Gross <jesse@...ira.com> wrote: >>> On Sat, Aug 2, 2014 at 8:53 AM, Tom Herbert <therbert@...gle.com> wrote: >>>> On Fri, Aug 1, 2014 at 2:54 PM, Andy Zhou <azhou@...ira.com> wrote: >>>>> Add a ndo feature bit that controls turning on and off NIC VXLAN >>>>> receive offload >>>>> >>>>> Signed-off-by: Andy Zhou <azhou@...ira.com> >>>>> --- >>>>> Documentation/networking/netdev-features.txt | 5 +++++ >>>>> drivers/net/ethernet/emulex/benet/be_main.c | 6 ++++-- >>>>> drivers/net/ethernet/intel/i40e/i40e_main.c | 1 + >>>>> drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 11 ++++++++--- >>>>> drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c | 7 +++++-- >>>>> include/linux/netdev_features.h | 4 ++++ >>>>> net/core/ethtool.c | 1 + >>>>> 7 files changed, 28 insertions(+), 7 deletions(-) >>>>> >>>>> diff --git a/Documentation/networking/netdev-features.txt b/Documentation/networking/netdev-features.txt >>>>> index f310ede..6436768 100644 >>>>> --- a/Documentation/networking/netdev-features.txt >>>>> +++ b/Documentation/networking/netdev-features.txt >>>>> @@ -165,3 +165,8 @@ This requests that the NIC receive all possible frames, including errored >>>>> frames (such as bad FCS, etc). This can be helpful when sniffing a link with >>>>> bad packets on it. Some NICs may receive more packets if also put into normal >>>>> PROMISC mode. >>>>> + >>>>> +* rx-vxlan >>>>> +NIC can parse udp tunneled frames. The NIC may make use of the inner frame >>>>> +information, in addition to the outer UDP frame to verify, offload(RX) >>>>> +and steer the packet. >>>> >>>> Sorry Andy, I still don't like this feature! I am worried that we are >>>> opening a can of worms in encouraging HW vendors to focus on narrow >>>> protocol specific features as opposed to general features that can be >>>> applied to a wider range of protocols. >>>> >>>> For UDP tunneling protocols where source port is set to a hash this >>>> should sufficient for the most common purposes of RSS and ECMP. This >>>> is precisely why we are interested in UDP encapsulation in the first >>>> place, existing NICs and switches support it. Any extra value in HW >>>> parsing VXLAN for steering is marginal (at least not a compelling >>>> reason for me run out and buy new NICs). >>>> >>>> This functionality might have merit for UDP flows where the source >>>> port cannot be arbitrary, like what would be needed to get the UDP >>>> flow to work over a stateful NAT. An example of this might be in QUIC. >>>> In the case the 5-tuple might not be sufficient the inner flow, but a >>>> device could conceivably parse the packet and use the QUIC connection >>>> identifier for flow identification and steering. However, if this >>>> becomes a real HW feature we would need a different interface anyway. >>>> QUIC is implemented in user space and to the kernel is just another >>>> UDP socket. The correct interface would probably be to configure the >>>> capability in ethtool, not through the UDP stack. >>> >>> This actually is a patch against ethtool :) >>> >> What I mean by using ethtool is that it should provide the generic >> interface to enable per UDP port HW-specific functionality (or more >> generally match to some protocol fields filter). This entails sending >> commands to the device specifying port and key meaningful to the >> device on how to process packets on that port. This is similar to how >> we can program the RSS table or hashing algorithms without having the >> stack needing to do anything or track its tunnels. >> >>> I think this is probably getting somewhat ideological or at least use >>> case-specific. I understand your goal is to be generic where possible >>> but enough examples have come up in this discussion that it seems like >>> there shouldn't be much of an argument about whether this is useful at >>> this point. Things like the cost/benefit of buying a new NIC or how >>> QUIC would use it in the future seem a little beside the point, >>> honestly. >>> >>> In any case, this specific patch is just about giving people the >>> ability to control VXLAN offloads, which I think should make you happy >>> given that it is already present in the stack and on by default. I >>> think Andy has done his best to take your comments into account to >>> make this as clean as possible and on the larger point we probably >>> just need to move on. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists