Date: Tue, 12 Jan 2021 18:26:22 +0000
From: Alexander Lobakin <alobakin@...me>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Alexander Lobakin <alobakin@...me>,
	"David S. Miller" <davem@...emloft.net>,
	Jakub Kicinski <kuba@...nel.org>,
	Edward Cree <ecree@...arflare.com>,
	Jonathan Lemon <jonathan.lemon@...il.com>,
	Willem de Bruijn <willemb@...gle.com>,
	Miaohe Lin <linmiaohe@...wei.com>,
	Steffen Klassert <steffen.klassert@...unet.com>,
	Guillaume Nault <gnault@...hat.com>,
	Yadu Kishore <kyk.segfault@...il.com>,
	Al Viro <viro@...iv.linux.org.uk>,
	netdev <netdev@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net-next 0/5] skbuff: introduce skbuff_heads bulking and reusing

From: Eric Dumazet <edumazet@...gle.com>
Date: Tue, 12 Jan 2021 13:32:56 +0100

> On Tue, Jan 12, 2021 at 11:56 AM Alexander Lobakin <alobakin@...me> wrote:
>>
>
>>
>> Ah, I should've mentioned that I use UDP GRO Fraglists, so these
>> numbers are for GRO.
>>
>
> Right, this suggests UDP GRO fraglist is a pathological case of GRO,
> not saving memory.
>
> Real GRO (TCP in most cases) will consume one skb, and have page
> fragments for each segment.
>
> Having skbs linked together is not cache friendly.

OK, so I reworked the test setup a bit to clear things up.

I disabled fraglists and the advertisement of GRO/GSO fraglists support
in the driver to exclude any "pathological" cases, and switched the
driver from napi_get_frags() + napi_gro_frags() to napi_alloc_skb() +
napi_gro_receive() to disable local skb reusing (napi_reuse_skb()).
I also enabled GSO UDP L4 (the "classic" one: one skbuff_head + frags)
for forwarding, not only for local traffic, and disabled NF flow offload
to increase the CPU load and drop performance below link speed, so that
I could see the changes.

So, the traffic flows looked like:
 - TCP GRO (one head + frags) -> NAT -> hardware TSO;
 - UDP GRO (one head + frags) -> NAT -> driver-side GSO.

Baseline 5.11-rc3:
 - 865 Mbps TCP, 866 Mbps UDP.

This patch (both separate caches and Edward's unified cache):
 - 899 Mbps TCP, 893 Mbps UDP.

So that's clearly *not* only a "pathological" UDP GRO Fraglists
"problem", as TCP also gained ~35 Mbps from this, as did
non-fraglisted UDP.

Regarding latencies: I remember there were talks about latencies when
Edward introduced batched GRO (using linked lists to pass skbs from the
GRO layer to the core stack instead of passing them one by one), so
I think it's a perennial question when it comes to batching/caching.

Thanks for the feedback, will post v2 soon.
The question of whether this caching is reasonable isn't closed anyway,
but I don't see any significant "cons" for now.

> So I would try first to make this case better, instead of trying to
> work around the real issue.

Al
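
For reference, the absolute and relative gains implied by the throughput figures quoted in the message can be worked out with a short script. This is only an arithmetic sketch: the dictionary names and the gain() helper are made up for illustration, and the only inputs are the four Mbps values from the message.

```python
# Throughput figures (Mbps) as quoted in the message.
BASELINE = {"TCP": 865, "UDP": 866}  # baseline 5.11-rc3
PATCHED = {"TCP": 899, "UDP": 893}   # with the patch applied


def gain(baseline_mbps, patched_mbps):
    """Return (absolute gain in Mbps, relative gain in percent)."""
    delta = patched_mbps - baseline_mbps
    return delta, 100.0 * delta / baseline_mbps


for proto in ("TCP", "UDP"):
    delta, pct = gain(BASELINE[proto], PATCHED[proto])
    print(f"{proto}: +{delta} Mbps ({pct:.1f}%)")
```

With these inputs the script reports +34 Mbps (3.9%) for TCP and +27 Mbps (3.1%) for UDP, which is consistent with the "~35 Mbps" figure given for TCP.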