Message-ID: <efa64c44b1e53510f972954e4670b2d8ecde97eb.camel@redhat.com>
Date: Fri, 23 Sep 2022 09:41:29 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Alexander Duyck <alexander.duyck@...il.com>
Cc: netdev@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH net-next v2] net: skb: introduce and use a single page frag cache

Hello,

On Thu, 2022-09-22 at 14:17 -0700, Alexander Duyck wrote:
[...]
> My suggestion earlier was to just make the 1k cache a page_frag_cache.
> It will allow you to reuse the same structure members and a single
> pointer to track them. Waste would be minimal since the only real
> difference between the structures is about 8B for the structure, and
> odds are the napi_alloc_cache allocation is being rounded up anyway.
>
> >         unsigned int skb_count;
> >         void *skb_cache[NAPI_SKB_CACHE_SIZE];
> >  };
> > @@ -143,6 +202,23 @@ struct napi_alloc_cache {
> >  static DEFINE_PER_CPU(struct page_frag_cache, netdev_alloc_cache);
> >  static DEFINE_PER_CPU(struct napi_alloc_cache, napi_alloc_cache);
> >
> > +/* Double check that napi_get_frags() allocates skbs with
> > + * skb->head being backed by slab, not a page fragment.
> > + * This is to make sure the bug fixed in 3226b158e67c
> > + * ("net: avoid 32 x truesize under-estimation for tiny skbs")
> > + * does not accidentally come back.
> > + */
> > +void napi_get_frags_check(struct napi_struct *napi)
> > +{
> > +        struct sk_buff *skb;
> > +
> > +        local_bh_disable();
> > +        skb = napi_get_frags(napi);
> > +        WARN_ON_ONCE(!NAPI_HAS_SMALL_PAGE_FRAG && skb && skb->head_frag);
> > +        napi_free_frags(napi);
> > +        local_bh_enable();
> > +}
> > +
> >  void *__napi_alloc_frag_align(unsigned int fragsz, unsigned int align_mask)
> >  {
> >         struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache);
> > @@ -561,6 +637,7 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
> >  {
> >         struct napi_alloc_cache *nc;
> >         struct sk_buff *skb;
> > +        bool pfmemalloc;
>
> Rather than adding this I think you would be better off adding a
> struct page_frag_cache pointer. I will reference it here as "pfc".
>
> >         void *data;
> >
> >         DEBUG_NET_WARN_ON_ONCE(!in_softirq());
> > @@ -568,8 +645,10 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
> >
> >         /* If requested length is either too small or too big,
> >          * we use kmalloc() for skb->head allocation.
> > +         * When the small frag allocator is available, prefer it over kmalloc
> > +         * for small fragments
> >          */
> > -        if (len <= SKB_WITH_OVERHEAD(1024) ||
> > +        if ((!NAPI_HAS_SMALL_PAGE_FRAG && len <= SKB_WITH_OVERHEAD(1024)) ||
> >             len > SKB_WITH_OVERHEAD(PAGE_SIZE) ||
> >             (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) {
> >                 skb = __alloc_skb(len, gfp_mask, SKB_ALLOC_RX | SKB_ALLOC_NAPI,
> > @@ -580,13 +659,30 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
> >         }
> >
> >         nc = this_cpu_ptr(&napi_alloc_cache);
> > -        len += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> > -        len = SKB_DATA_ALIGN(len);
> >
> >         if (sk_memalloc_socks())
> >                 gfp_mask |= __GFP_MEMALLOC;
> >
> > -        data = page_frag_alloc(&nc->page, len, gfp_mask);
> > +        if (NAPI_HAS_SMALL_PAGE_FRAG && len <= SKB_WITH_OVERHEAD(1024)) {
>
> Then here you would add a line that would be:
>         pfc = &nc->page_small;
>
> > +                /* we are artificially inflating the allocation size, but
> > +                 * that is not as bad as it may look, as:
> > +                 * - 'len' less than GRO_MAX_HEAD makes little sense
> > +                 * - larger 'len' values lead to fragment size above 512 bytes
> > +                 *   as per NAPI_HAS_SMALL_PAGE_FRAG definition
> > +                 * - kmalloc would use the kmalloc-1k slab for such values
> > +                 */
> > +                len = SZ_1K;
> > +
> > +                data = page_frag_alloc_1k(&nc->page_small, gfp_mask);
> > +                pfmemalloc = nc->page_small.pfmemalloc;
>
> Instead of setting pfmemalloc you could just update the line below. In
> addition you would just be passing pfc as the parameter.
>
> > +        } else {
>
> Likewise here you would have the line:
>         pfc = &nc->page;

Probably I was not clear in my previous email, sorry: before posting
this version I tried locally exactly all of the above, and the
generated code with gcc 11.3.1 is a little bigger (a couple of
instructions) than what this version produces (with gcc 11.3.1-2). It
has the same number of conditionals and a slightly larger
napi_alloc_cache. Additionally, the suggested alternative needs more
pre-processor conditionals to handle the !NAPI_HAS_SMALL_PAGE_FRAG
case, to avoid adding a second page_frag_cache that would be unused
there.
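For reference, the shape I tried and benchmarked was roughly the
sketch below. I am reconstructing it from memory here, so take the
details with a grain of salt; 'page_small' is a plain struct
page_frag_cache and 'pfc' is the pointer you suggested:

        struct napi_alloc_cache {
                struct page_frag_cache page;
                struct page_frag_cache page_small;  /* reused struct, 1k frags */
                unsigned int skb_count;
                void *skb_cache[NAPI_SKB_CACHE_SIZE];
        };

        /* in __napi_alloc_skb(): select the cache once, then share the
         * allocation and pfmemalloc handling between both sizes
         */
        struct page_frag_cache *pfc;

        if (NAPI_HAS_SMALL_PAGE_FRAG && len <= SKB_WITH_OVERHEAD(1024)) {
                pfc = &nc->page_small;
                len = SZ_1K;
        } else {
                pfc = &nc->page;
                len += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
                len = SKB_DATA_ALIGN(len);
        }

        data = page_frag_alloc(pfc, len, gfp_mask);
        /* ... error handling and skb build as in the patch ... */
        if (pfc->pfmemalloc)
                skb->pfmemalloc = 1;

Note the second page_frag_cache member: on !NAPI_HAS_SMALL_PAGE_FRAG
builds it is dead weight unless wrapped in additional #ifdefs, which
is the extra pre-processor noise mentioned above.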
[...]

> > @@ -596,7 +692,7 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
> >                 return NULL;
> >         }
> >
> > -        if (nc->page.pfmemalloc)
> > +        if (pfmemalloc)
>
> Instead of passing pfmemalloc you could just check pfc->pfmemalloc.
> Alternatively I wonder if it wouldn't be faster to just set the value
> directly based on frag_cache->pfmemalloc and avoid the conditional
> check entirely.

Note that:

        skb->pfmemalloc = pfmemalloc;

avoids a branch but requires more instructions than the current code
(verified with gcc), so the gain looks doubtful.
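Spelling out the two variants, with 'pfmemalloc' being the local bool
from the patch:

        /* current code: test the flag, set the skb bit only when needed */
        if (pfmemalloc)
                skb->pfmemalloc = 1;

        /* branchless variant: unconditional store into the
         * skb->pfmemalloc bitfield, which gcc expands into a
         * read-modify-write of a few more instructions
         */
        skb->pfmemalloc = pfmemalloc;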
Additionally, we have statements like:

        if (<pfmemalloc>)
                skb->pfmemalloc = 1;

in other functions in skbuff.c, still fast path, and it would be
better to update all those places together for consistency, if that is
really considered an improvement. IMHO it should at least land in a
different patch.

I'll post a v3 with your updated email address, but I think the
current code is the better option.

Cheers,

Paolo