netdev - Re: [PATCH net-next 0/5] qed/qede: Tunnel hardware GRO support


Message-ID: <9834a2dd-33c0-33b1-697e-49a39b1b7554@solarflare.com>
Date:	Fri, 24 Jun 2016 18:21:53 +0100
From:	Edward Cree <ecree@...arflare.com>
To:	Tom Herbert <tom@...bertland.com>
CC:	Alexander Duyck <alexander.duyck@...il.com>,
	Yuval Mintz <Yuval.Mintz@...gic.com>,
	Eric Dumazet <eric.dumazet@...il.com>,
	Rick Jones <rick.jones2@....com>,
	Manish Chopra <manish.chopra@...gic.com>,
	David Miller <davem@...emloft.net>,
	netdev <netdev@...r.kernel.org>,
	Ariel Elior <Ariel.Elior@...gic.com>,
	Hannes Frederic Sowa <hannes@...hat.com>,
	Bert Kenward <bkenward@...arflare.com>
Subject: Re: [PATCH net-next 0/5] qed/qede: Tunnel hardware GRO support

On 24/06/16 17:31, Tom Herbert wrote:
> Ed,
> Because you took this OT... ;-)
>
> LRO/GRO is the one of the five common offloads that has no generic
> analogue and requires protocol specific logic. For instance each
> IP-over-foo encapsulation needs kernel code for GRO, device/driver
> code for LRO. I think the answer here is to make both GRO and LRO to
> be user programmable via BPF.
I agree that the only way to make LRO generic is to go for hardware
BPF.  However, I think that's likely to cause a _lot_ of headaches to
implement and my hope is that we can instead get acceptable receive
performance from GRO, RSS, and maybe things like the skb bundling I
posted a while back.
For instance, if your 'source port hack' were to mix in the TNI as
well as the inner flow fields it already uses, I think that could
improve hash spreading and thus GRO would perform better.
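
To make the suggestion above concrete, here is a minimal sketch of deriving an encapsulation source port from a hash that covers the tenant network ID as well as the inner flow fields. Everything in it is an assumption for illustration: the struct layout, the function names, the 24-bit width of the ID, and the mixer (the MurmurHash3 finalizer, standing in for whatever hash a real stack uses). It is not the existing 'source port hack' code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical inner flow key; field names are illustrative only. */
struct inner_flow {
	uint32_t saddr;
	uint32_t daddr;
	uint16_t sport;
	uint16_t dport;
	uint8_t  proto;
};

/* Toy 32-bit mixer (MurmurHash3 finalizer), not the kernel's flow hash. */
static uint32_t mix32(uint32_t h)
{
	h ^= h >> 16;
	h *= 0x85ebca6bu;
	h ^= h >> 13;
	h *= 0xc2b2ae35u;
	h ^= h >> 16;
	return h;
}

/* Hash the inner flow fields, then fold in the tenant network ID. */
static uint32_t flow_hash(const struct inner_flow *f, uint32_t tni)
{
	uint32_t h = mix32(f->saddr);

	h = mix32(h ^ f->daddr);
	h = mix32(h ^ (((uint32_t)f->sport << 16) | f->dport));
	h = mix32(h ^ f->proto);
	h = mix32(h ^ (tni & 0xffffffu));	/* assumed 24-bit ID */
	return h;
}

/* Scale the hash into the inclusive port range [min, max]. */
static uint16_t encap_src_port(const struct inner_flow *f, uint32_t tni,
			       uint16_t min, uint16_t max)
{
	uint32_t range;

	assert(min <= max);
	range = (uint32_t)max - min + 1;
	return (uint16_t)(min + (((uint64_t)flow_hash(f, tni) * range) >> 32));
}
```

With this shape, two tenants carrying identical inner 5-tuples get different hashes, so a receiver that spreads on the outer UDP source port has a chance of steering them to different queues. Whether that helps GRO in practice would need measuring.
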
Fundamentally I believe that robust, responsive hardware LRO is not
workable as the hardware would have to decide to hold onto packets in
the hope of merge candidates arriving soon after.  Whereas in the
software layer (GRO, bundling...), the packets are already coming in
bursts thanks to the way NAPI polling behaves.
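
As a rough illustration of that point, the sketch below coalesces in-order segments of the same flow within a single poll batch. Because the whole batch is already in hand, nothing has to be held back waiting for a merge candidate. The packet descriptor and function name are hypothetical, and this is a toy model only: real GRO matches on protocol headers and has flush rules that are not modelled here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal packet descriptor. */
struct pkt {
	uint32_t flow;	/* flow identifier, e.g. an rxhash */
	uint32_t seq;	/* sequence number of the first payload byte */
	uint32_t len;	/* payload length in bytes */
};

/*
 * Coalesce one poll batch in place: a segment is appended to an earlier
 * packet of the same flow when it starts exactly where that packet ends.
 * Returns the number of packets left at the front of the array.
 */
static size_t coalesce_batch(struct pkt *batch, size_t n)
{
	size_t out = 0;

	for (size_t i = 0; i < n; i++) {
		size_t j;

		for (j = 0; j < out; j++) {
			if (batch[j].flow == batch[i].flow &&
			    batch[j].seq + batch[j].len == batch[i].seq) {
				batch[j].len += batch[i].len;	/* merge */
				break;
			}
		}
		if (j == out)
			batch[out++] = batch[i];	/* no candidate: keep as is */
	}
	return out;
}
```

A batch of five interleaved segments from two flows would collapse to two packets here, which is the sort of merging that hardware could only match by delaying delivery.
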
But I'd love to be proved wrong :)  The 'hybrid' approach of using
BPF in hardware to identify flows for software GRO does seem
plausible; maybe having BPF compute the rxhash is the answer?

-Ed

(disclaimer: definitely not speaking for my employer here, these are
my personal views only.)