lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+mtBx-mJE6SW2bqf3_t6iu=o5FY1WAF1ByeZpDJnc+6fpK6nA@mail.gmail.com>
Date:	Mon, 22 Sep 2014 19:16:47 -0700
From:	Tom Herbert <therbert@...gle.com>
To:	Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:	Thomas Graf <tgraf@...g.ch>, Jiri Pirko <jiri@...nulli.us>,
	John Fastabend <john.r.fastabend@...el.com>,
	Jamal Hadi Salim <jhs@...atatu.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"David S. Miller" <davem@...emloft.net>,
	Neil Horman <nhorman@...driver.com>,
	Andy Gospodarek <andy@...yhouse.net>,
	Daniel Borkmann <dborkman@...hat.com>,
	Or Gerlitz <ogerlitz@...lanox.com>,
	Jesse Gross <jesse@...ira.com>,
	Pravin Shelar <pshelar@...ira.com>,
	Andy Zhou <azhou@...ira.com>,
	Ben Hutchings <ben@...adent.org.uk>,
	Stephen Hemminger <stephen@...workplumber.org>,
	Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
	Vladislav Yasevich <vyasevic@...hat.com>,
	Cong Wang <xiyou.wangcong@...il.com>,
	Eric Dumazet <edumazet@...gle.com>,
	Scott Feldman <sfeldma@...ulusnetworks.com>,
	Florian Fainelli <f.fainelli@...il.com>,
	Roopa Prabhu <roopa@...ulusnetworks.com>,
	John Linville <linville@...driver.com>,
	"dev@...nvswitch.org" <dev@...nvswitch.org>,
	Jason Wang <jasowang@...hat.com>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Nicolas Dichtel <nicolas.dichtel@...nd.com>,
	ryazanov.s.a@...il.com, Lennert Buytenhek <buytenh@...tstofly.org>,
	aviadr@...lanox.com, Felix Fietkau <nbd@...nwrt.org>,
	Neil Jerram <Neil.Jerram@...aswitch.com>, ronye@...lanox.com,
	simon.horman@...ronome.com,
	Alexander Duyck <alexander.h.duyck@...el.com>
Subject: Re: [patch net-next v2 8/9] switchdev: introduce Netlink API

On Mon, Sep 22, 2014 at 6:54 PM, Alexei Starovoitov
<alexei.starovoitov@...il.com> wrote:
> On Mon, Sep 22, 2014 at 8:10 AM, Tom Herbert <therbert@...gle.com> wrote:
>> On Mon, Sep 22, 2014 at 1:13 AM, Thomas Graf <tgraf@...g.ch> wrote:
>>> On 09/20/14 at 03:50pm, Alexei Starovoitov wrote:
>>>> I think HW should not be limited by SW abstractions whether
>>>> these abstractions are called flows, n-tuples, bridge or else.
>>>> Really looking forward to see "device reporting the headers as
>>>> header fields (len, offset) and the associated parse graph"
>>>> as the first step.
>>>>
>>>> Another topic that this discussion didn't cover yet is how this
>>>> all connects to tunnels and what is 'tunnel offloading'.
>
>> encapsulation (stuffing a few bytes of header into a packet) is in
>> itself not nearly an expensive enough operation to warrant offloading
>> to the NIC. Personally, I wish if NIC vendors are going to focus on
>
> On contrary, generic tunneling is most important one to get right
> when we're talking offloads.
> Adding encap header is easy to do in hw, but it breaks all other
> offloads if hw is not generic. Consider gso packet coming from vm.
> Generic tunnel allows sw to add inner headers, outer headers and
> setup offload offsets, so that HW does segmentation, checksuming
> of inner packet, adjusts inner headers and adds final outer encap.

As I pointed out on a previous thread, we already have a sufficiently
generic interface to allow HW to do encapsulated TSO
(SKB_GSO_UDP_TUNNEL and SKB_GSO_UDP_TUNNEL_CSUM with the inner
headers). If properly implemented, HW can implement a whole bunch of
UDP encap protocols without knowing how to parse them. I don't see how
a switch on the NIC helps this...

> And this is just tx offload. On rx smart tunnel offload in HW parses
> encap and goes all the way to inner headers to verify checksums,
> it also steers based on inner headers.
> Try mellanox nics with and without vxlan offload to see
> the difference.

Turn on UDP RSS on the device and I bet you'll see those differences
go away! Once we moved to UDP encapsulation, there's really little
value in looking at inner headers for RSS or ECMP, this should be
sufficient. Sure someone might want to parse the inner headers for
some sort of advanced RX steering, but again this implies rx-filtering
and not switch functionality.

Alexei, I believe you said previously said that SW should not dictate
HW models. I agree with this, but also believe the converse is true--
HW shouldn't dictate SW model. This is really why I'm raising the
question of what it means to integrate a switch into the host stack.
If this is something that doesn't require any model change to the
stack and is just a clever backend for rx-filters or tc, then I'm fine
with that!

Thanks,
Tom

> It looks like fm10k will be just as good, but existing encaps are
> not going to last forever, so RX should be improved they way John
> is saying. There gotta to be a 'parse graph' for HW to see past
> variable length encap and into inner headers.
> checksum_complete style of offloading checksum verification
> is not efficient. The cost of adjusting it over and over while
> parsing encaps is too high. Plus cpu steering based on outer
> headers is just too slow when speeds are in 40G range.
>
>> stateful offload I rather see it be for encryption which I believe
>> currently does warrant offload at 40G and higher speeds.
>
> encryption offload is badly needed as well. Unfortunately it's
> not seen as nic feature yet.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ