Message-ID: <1de97f3a-ae56-8cdf-4677-ceb36bdc336d@intel.com>
Date: Thu, 16 Feb 2023 13:04:37 +0100
From: Alexander Lobakin <aleksander.lobakin@...el.com>
To: Jakub Kicinski <kuba@...nel.org>
CC: Edward Cree <ecree.xilinx@...il.com>, <davem@...emloft.net>,
	<netdev@...r.kernel.org>, <edumazet@...gle.com>, <pabeni@...hat.com>,
	<willemb@...gle.com>, <fw@...len.de>
Subject: Re: [PATCH net-next 2/3] net: skbuff: cache one skb_ext for use by GRO

From: Jakub Kicinski <kuba@...nel.org>
Date: Wed, 15 Feb 2023 10:20:15 -0800

> On Wed, 15 Feb 2023 19:01:19 +0100 Alexander Lobakin wrote:
>>> I was hoping to leave sizing of the cache until we have some data from
>>> a production network (or at least representative packet traces).
>>>
>>> NAPI_SKB_CACHE_SIZE kinda assumes we're not doing much GRO, right?
>>
>> It assumes we GRO a lot :D
>>
>> Imagine that you have 64 frames during one poll and the GRO layer
>> decides to coalesce them in batches of 16. Then only 4 skbs will be
>> used; the rest will go as frags (with "stolen heads") -> 60 of the 64
>> skbs will return to that skb cache and will then be reused by
>> napi_build_skb().
>
> Let's say 5 - for 4 resulting skbs GRO will need the 4 resulting and
> one extra to shuttle between the driver and GRO (worst case).
> With a cache of 1 I'm guaranteed to save 59 alloc calls, 92%, right?
>
> That's why I'm saying - the larger cache would help workloads which
> don't GRO as much. Am I missing the point or how GRO works?

Maybe I'm missing something now :D

The driver receives 5 frames, so it allocates 5 skbs. GRO coalesces
them into one big skb: the first one remains an skb, the following 4
get their data added as frags and are then moved to the NAPI cache
(%NAPI_GRO_FREE_STOLEN_HEAD).
After GRO decides it's done with this skb, the skb is moved to the
pending list to be flushed soon. @gro_normal_batch is usually 8, so
there can be up to 8... Oh wait, Eric changed this to count segments,
not skbs :D ...there can be up to 2* such skbs waiting for a flush (the
first one sets the counter to 5, the second adds 5 more => a flush
happens).
So you'd need at least 2* skb extensions cached anyway, otherwise there
will be new allocations.
This is not counting fraglists: when GRO decides to fraglist an skb, it
requires at least 1 more skb. UDP fraglisted GRO (I know almost nobody
uses it, but it does exist) doesn't use frags at all and requires 1 skb
per segment.

You're right that a cache size of %NAPI_POLL_WEIGHT is needed only for
corner cases like a big @gro_normal_batch, fraglists, UDP fraglisted
GRO and so on; I still think we shouldn't ignore them :)
Also, this cache can later be reused to bulk-free extensions on Tx
completion, just like it's done for skbs.

* or less/more if customized by the user; for example, I set 16 on
MIPS, while x86_64 works better with 8.

Thanks,
Olek
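[Editor's note: purely to illustrate the arithmetic above, here is a minimal
userspace C sketch of the segment-counting flush, assuming the default
@gro_normal_batch of 8. It is a toy model of the behaviour described in the
mail, not the kernel's own code (cf. gro_normal_one()); queue_coalesced_skb()
and struct pending are hypothetical names invented for the example.]

/*
 * Toy model of the @gro_normal_batch accounting described above.
 * NOT kernel code - just a standalone sketch of the
 * "count segments, not skbs" flush logic.
 */
#include <stdio.h>

#define GRO_NORMAL_BATCH 8	/* assumed default @gro_normal_batch */

struct pending {
	int skbs;	/* coalesced skbs sitting on the flush list */
	int segs;	/* total segments they carry */
};

/* Queue one coalesced skb carrying @segs segments; flush once the
 * segment count reaches the batch limit.
 */
static void queue_coalesced_skb(struct pending *p, int segs)
{
	p->skbs++;
	p->segs += segs;
	printf("queued skb #%d (%d segs), pending segs = %d\n",
	       p->skbs, segs, p->segs);
	if (p->segs >= GRO_NORMAL_BATCH) {
		printf("  -> flush %d skb(s)\n", p->skbs);
		p->skbs = 0;
		p->segs = 0;
	}
}

int main(void)
{
	struct pending p = { 0, 0 };

	/* Two skbs, each coalescing 5 frames: the first leaves the
	 * counter at 5, the second pushes it to 10 >= 8 and triggers
	 * a flush - so up to 2 such skbs can wait on the list at once.
	 */
	queue_coalesced_skb(&p, 5);
	queue_coalesced_skb(&p, 5);
	return 0;
}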