Message-ID: <CA+mtBx-ZvkZ2ALGLsLEN7Cgn9gN_rY36ZhuqYC3mZ9WVxjFDaQ@mail.gmail.com>
Date: Mon, 22 Sep 2014 08:10:08 -0700
From: Tom Herbert <therbert@...gle.com>
To: Thomas Graf <tgraf@...g.ch>
Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>,
Jiri Pirko <jiri@...nulli.us>,
John Fastabend <john.r.fastabend@...el.com>,
Jamal Hadi Salim <jhs@...atatu.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Neil Horman <nhorman@...driver.com>,
Andy Gospodarek <andy@...yhouse.net>,
Daniel Borkmann <dborkman@...hat.com>,
Or Gerlitz <ogerlitz@...lanox.com>,
Jesse Gross <jesse@...ira.com>,
Pravin Shelar <pshelar@...ira.com>,
Andy Zhou <azhou@...ira.com>,
Ben Hutchings <ben@...adent.org.uk>,
Stephen Hemminger <stephen@...workplumber.org>,
Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
Vladislav Yasevich <vyasevic@...hat.com>,
Cong Wang <xiyou.wangcong@...il.com>,
Eric Dumazet <edumazet@...gle.com>,
Scott Feldman <sfeldma@...ulusnetworks.com>,
Florian Fainelli <f.fainelli@...il.com>,
Roopa Prabhu <roopa@...ulusnetworks.com>,
John Linville <linville@...driver.com>,
"dev@...nvswitch.org" <dev@...nvswitch.org>,
Jason Wang <jasowang@...hat.com>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Nicolas Dichtel <nicolas.dichtel@...nd.com>,
ryazanov.s.a@...il.com, Lennert Buytenhek <buytenh@...tstofly.org>,
aviadr@...lanox.com, Felix Fietkau <nbd@...nwrt.org>,
Neil Jerram <Neil.Jerram@...aswitch.com>, ronye@...lanox.com,
simon.horman@...ronome.com,
Alexander Duyck <alexander.h.duyck@...el.com>
Subject: Re: [patch net-next v2 8/9] switchdev: introduce Netlink API
On Mon, Sep 22, 2014 at 1:13 AM, Thomas Graf <tgraf@...g.ch> wrote:
> On 09/20/14 at 03:50pm, Alexei Starovoitov wrote:
>> I think HW should not be limited by SW abstractions, whether
>> these abstractions are called flows, n-tuples, bridges or something else.
>> Really looking forward to seeing "device reporting the headers as
>> header fields (len, offset) and the associated parse graph"
>> as the first step.
>>
>> Another topic that this discussion didn't cover yet is how this
>> all connects to tunnels and what is 'tunnel offloading'.
>> imo flow offloading by itself serves only academic interest.
>
> We haven't touched encryption yet either ;-)
>
> Certainly true for the host case. The Linux on TOR case is less
> dependent on this and L2/L3 offload w/o encap already has value.
>
Thomas, can you (or someone else) quantify what the host case is? I
suppose there may be merit in using a switch on the NIC for kernel bypass
scenarios, but I'm still having a hard time understanding how this
could be integrated into the host stack with benefits that outweigh
the complexity. The history of stateful offloads in NICs is not great, and
encapsulation (stuffing a few bytes of header into a packet) is in
itself not nearly an expensive enough operation to warrant offloading
to the NIC. Personally, if NIC vendors are going to focus on
stateful offload, I would rather see it be for encryption, which I believe
currently does warrant offload at 40G and higher speeds.
Tom
> I'm with you though, all of this has little value on the host in
> the DC if stateful encap offload is not incorporated. I expect the
> HW to provide filters on the outer header plus metadata in the
> encap. Actually, this was a follow-up question I had for John as
> this is not easily describable with offset/len filters. How would
> we represent such capabilities?
>
> The TX side of this was one of the reasons why I initially thought
> it would be beneficial to implement a cache-like offload as we could
> serve an initial encap in SW, do the FIB lookup and offload it
> transparently to avoid replicating the FIB in user space.
>
> What seems most feasible to me right now is to separate the offload
> of the encap action from the IP -> dev mapping decision. The eSwitch
> would send the first encap for an unknown dest IP to the CPU due
> to a miss in the IP mapping table, the CPU would do the FIB lookup,
> update the table and send it back.
>
> What do you have in mind?
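
The miss-driven flow quoted above (offload the encap action, keep the
IP -> dev mapping decision in the stack) can be sketched roughly as
follows. This is only a toy model under stated assumptions, not a driver
or kernel API: ESwitch, make_fib, encap_and_send and the 4-byte "ENCP"
placeholder header are all hypothetical names invented for illustration,
and the FIB is reduced to a longest-prefix match over a static route list.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


def make_fib(routes: Dict[str, str]) -> Callable[[str], Optional[str]]:
    """Build a toy FIB: longest-prefix match over {prefix: device}."""
    nets = sorted(
        ((ipaddress.ip_network(prefix), dev) for prefix, dev in routes.items()),
        key=lambda entry: entry[0].prefixlen,
        reverse=True,  # most specific prefix is checked first
    )

    def lookup(dst_ip: str) -> Optional[str]:
        addr = ipaddress.ip_address(dst_ip)
        for net, dev in nets:
            if addr in net:
                return dev
        return None  # no route

    return lookup


@dataclass
class ESwitch:
    """Hypothetical embedded switch holding an IP -> dev mapping table."""

    fib_lookup: Callable[[str], Optional[str]]  # stands in for the CPU-side FIB
    ip_to_dev: Dict[str, str] = field(default_factory=dict)
    misses: int = 0  # number of packets punted to the CPU

    def encap_and_send(self, dst_ip: str, payload: bytes) -> Optional[Tuple[str, bytes]]:
        dev = self.ip_to_dev.get(dst_ip)
        if dev is None:
            # Miss in the mapping table: punt to the CPU, which does the
            # FIB lookup and installs the result for later packets.
            self.misses += 1
            dev = self.fib_lookup(dst_ip)
            if dev is None:
                return None  # no route, drop
            self.ip_to_dev[dst_ip] = dev
        # Encap action: prepend a placeholder outer header (assumption).
        return dev, b"ENCP" + payload
```

In this model only the first packet to a given destination takes the slow
path; later packets hit the mapping table, so the FIB itself never has to
be replicated outside the stack.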
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html