Date:	Tue, 19 Apr 2016 17:36:03 +0100
From:	Edward Cree <ecree@...arflare.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
CC:	<netdev@...r.kernel.org>, David Miller <davem@...emloft.net>,
	"Jesper Dangaard Brouer" <brouer@...hat.com>,
	<linux-net-drivers@...arflare.com>
Subject: Re: [RFC PATCH net-next 2/8] sfc: batch up RX delivery on EF10

On 19/04/16 15:47, Eric Dumazet wrote:
> On Tue, 2016-04-19 at 14:35 +0100, Edward Cree wrote:
>> Improves packet rate of 1-byte UDP receives by 10%.
> Sure, by adding yet another queue and extra latencies.
>
> If the switch delivered a high prio packet to your host right before a
> train of 60 low prio packets, that is no excuse for us to wait for the
> end of the train.
The length of the list is bounded by the NAPI budget, and the first packet
in the list is delayed only by the time it takes to read the RX descriptors
and turn them into SKBs.  This patch never causes us to wait in the hope
that more things will arrive to batch; that's entirely driven by interrupt
moderation.
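
To spell the mechanics out, the flow is roughly the following.  This is a
minimal untested sketch, not the actual patch; all the example_* names are
stand-ins rather than real sfc symbols.

#include <linux/netdevice.h>
#include <linux/skbuff.h>

struct example_channel {
	struct napi_struct napi;
	/* ... RX ring state ... */
};

/* Hypothetical helper: pop the next completed RX descriptor and build
 * an skb for it, or return NULL when the ring is empty.
 */
static struct sk_buff *example_build_skb(struct example_channel *ch);

static int example_napi_poll(struct napi_struct *napi, int budget)
{
	struct example_channel *ch =
		container_of(napi, struct example_channel, napi);
	struct sk_buff_head rx_list;
	struct sk_buff *skb;
	int done = 0;

	__skb_queue_head_init(&rx_list);

	/* Pass 1: read descriptors and turn them into skbs.  Nothing
	 * here waits for more packets; the list only ever holds what
	 * was already in the ring when the poll ran.
	 */
	while (done < budget) {
		skb = example_build_skb(ch);
		if (!skb)
			break;
		__skb_queue_tail(&rx_list, skb);
		done++;
	}

	/* Pass 2: deliver the whole batch up the stack.  The first
	 * packet was delayed only by the descriptor-to-skb work above.
	 */
	while ((skb = __skb_dequeue(&rx_list)) != NULL)
		netif_receive_skb(skb);

	if (done < budget)
		napi_complete(napi);
	return done;
}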

And if the high prio packet comes at the _end_ of a train of low prio
packets, we get to it _faster_ this way because we get the train out of the
way quicker.

Are you suggesting we should check for 802.1p priorities, and have those
skip the list?
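
i.e. something like this, inside the build loop above (sketch only;
HIGH_PRIO_PCP is a made-up threshold, and the VLAN helpers are from
<linux/if_vlan.h>):

#define HIGH_PRIO_PCP	5	/* invented cutoff for "high priority" */

	/* Hypothetical: let frames with a high 802.1p Priority Code
	 * Point (the top 3 bits of the VLAN TCI) bypass the batch.
	 */
	if (skb_vlan_tag_present(skb) &&
	    (skb_vlan_tag_get(skb) >> VLAN_PRIO_SHIFT) >= HIGH_PRIO_PCP)
		netif_receive_skb(skb);		/* deliver immediately */
	else
		__skb_queue_tail(&rx_list, skb);	/* join the batch */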

> We really have to invent something better, like a real pipeline, instead
> of hacks like this that add complexity everywhere.
I'm not sure what you mean by 'a real pipeline' in this context, could you
elaborate?

> Have you tested this on cpus with tiny caches, like 32KB ?
I haven't.  Is the concern here that the first packet's headers (we read 128
bytes into the linear area) and/or skb will get pushed out of the dcache as
we process further packets?
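
Back-of-the-envelope, assuming the default budget of 64: 64 * 128 bytes is
8KB of pulled-in headers alone, plus a couple of hundred bytes of struct
sk_buff each, so a full batch could indeed touch a sizeable fraction of a
32KB dcache before the stack sees the first packet.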

At least for sfc, it's highly unlikely that these cards will be used in low-
powered systems.  For the more general case, I suppose the answer would be a
tunable to set the maximum length of the RX list to less than the NAPI budget.
Fundamentally this kind of batching is trading dcache usage for icache usage.
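
Something like the following, again only a sketch; rx_list_max is an
invented module parameter, not an existing knob:

#include <linux/moduleparam.h>

/* Hypothetical tunable: flush the batch once it reaches rx_list_max
 * skbs, rather than letting it grow to the full NAPI budget.
 */
static unsigned int rx_list_max = 64;
module_param(rx_list_max, uint, 0644);
MODULE_PARM_DESC(rx_list_max, "Max RX batch length before flushing");

/* In the build loop of example_napi_poll() above: */
	__skb_queue_tail(&rx_list, skb);
	if (skb_queue_len(&rx_list) >= rx_list_max) {
		struct sk_buff *s;

		while ((s = __skb_dequeue(&rx_list)) != NULL)
			netif_receive_skb(s);
	}

A smaller cap gives back some of the batching win in exchange for a smaller
dcache footprint per flush.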


Incidentally, this patch is very similar to what Jesper proposed for mlx5 in
an RFC back in February: http://article.gmane.org/gmane.linux.network/397379
So I'm a little surprised this bit is controversial, though I'm not surprised
the rest of the series is ;)

-Ed
