Message-ID: <20210211152620.3339-1-alobakin@pm.me>
Date: Thu, 11 Feb 2021 15:26:46 +0000
From: Alexander Lobakin <alobakin@...me>
To: Paolo Abeni <pabeni@...hat.com>
Cc: Alexander Lobakin <alobakin@...me>,
	"David S. Miller" <davem@...emloft.net>,
	Jakub Kicinski <kuba@...nel.org>,
	Jonathan Lemon <jonathan.lemon@...il.com>,
	Eric Dumazet <edumazet@...gle.com>,
	Dmitry Vyukov <dvyukov@...gle.com>,
	Willem de Bruijn <willemb@...gle.com>,
	Randy Dunlap <rdunlap@...radead.org>,
	Kevin Hao <haokexin@...il.com>,
	Pablo Neira Ayuso <pablo@...filter.org>,
	Jakub Sitnicki <jakub@...udflare.com>,
	Marco Elver <elver@...gle.com>,
	Dexuan Cui <decui@...rosoft.com>,
	Jesper Dangaard Brouer <brouer@...hat.com>,
	Alexei Starovoitov <ast@...nel.org>,
	Daniel Borkmann <daniel@...earbox.net>,
	Andrii Nakryiko <andriin@...com>,
	Taehee Yoo <ap420073@...il.com>,
	Cong Wang <xiyou.wangcong@...il.com>,
	Björn Töpel <bjorn@...nel.org>,
	Miaohe Lin <linmiaohe@...wei.com>,
	Guillaume Nault <gnault@...hat.com>,
	Yonghong Song <yhs@...com>,
	zhudi <zhudi21@...wei.com>,
	Michal Kubecek <mkubecek@...e.cz>,
	Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
	Dmitry Safonov <0x7f454c46@...il.com>,
	Yang Yingliang <yangyingliang@...wei.com>,
	Florian Westphal <fw@...len.de>,
	Edward Cree <ecree.xilinx@...il.com>,
	linux-kernel@...r.kernel.org,
	netdev@...r.kernel.org
Subject: Re: [PATCH v4 net-next 09/11] skbuff: allow to optionally use NAPI cache from __alloc_skb()

From: Paolo Abeni <pabeni@...hat.com>
Date: Thu, 11 Feb 2021 15:55:04 +0100

> On Thu, 2021-02-11 at 14:28 +0000, Alexander Lobakin wrote:
> > From: Paolo Abeni <pabeni@...hat.com> on Thu, 11 Feb 2021 11:16:40 +0100 wrote:
> > > What about changing __napi_alloc_skb() to always use
> > > __napi_build_skb(), for both kmalloc and page backed skbs? That is,
> > > always doing the 'data' allocation in __napi_alloc_skb() - either via
> > > page_frag or via kmalloc() - and then calling __napi_build_skb().
> > >
> > > I think that should avoid adding more checks in __alloc_skb() and
> > > should probably reduce the number of conditionals used
> > > by __napi_alloc_skb().
> >
> > I thought of this too. But this will introduce a conditional branch
> > to set or not set skb->head_frag. So one branch less in __alloc_skb(),
> > one branch more here, and we also lose the ability to __alloc_skb()
> > with a decached head.
>
> Just to try to be clear, I mean something like the following (not even
> build tested). In the fast path it has fewer branches than the current
> code - for both kmalloc and page_frag allocation.
>
> ---
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 785daff48030..a242fbe4730e 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -506,23 +506,12 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
>  				 gfp_t gfp_mask)
>  {
>  	struct napi_alloc_cache *nc;
> +	bool head_frag, pfmemalloc;
>  	struct sk_buff *skb;
>  	void *data;
>
>  	len += NET_SKB_PAD + NET_IP_ALIGN;
>
> -	/* If requested length is either too small or too big,
> -	 * we use kmalloc() for skb->head allocation.
> -	 */
> -	if (len <= SKB_WITH_OVERHEAD(1024) ||
> -	    len > SKB_WITH_OVERHEAD(PAGE_SIZE) ||
> -	    (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) {
> -		skb = __alloc_skb(len, gfp_mask, SKB_ALLOC_RX, NUMA_NO_NODE);
> -		if (!skb)
> -			goto skb_fail;
> -		goto skb_success;
> -	}
> -
>  	nc = this_cpu_ptr(&napi_alloc_cache);
>  	len += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>  	len = SKB_DATA_ALIGN(len);
> @@ -530,25 +519,34 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
>  	if (sk_memalloc_socks())
>  		gfp_mask |= __GFP_MEMALLOC;
>
> -	data = page_frag_alloc(&nc->page, len, gfp_mask);
> +	if (len <= SKB_WITH_OVERHEAD(1024) ||
> +	    len > SKB_WITH_OVERHEAD(PAGE_SIZE) ||
> +	    (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) {
> +		data = kmalloc_reserve(len, gfp_mask, NUMA_NO_NODE, &pfmemalloc);
> +		head_frag = 0;
> +		len = 0;
> +	} else {
> +		data = page_frag_alloc(&nc->page, len, gfp_mask);
> +		pfmemalloc = nc->page.pfmemalloc;
> +		head_frag = 1;
> +	}
>  	if (unlikely(!data))
>  		return NULL;

Sure. I have a separate WIP series that reworks all three *alloc_skb()
functions, as there's plenty of room for optimization, especially now
that tiny skbs fall back to __alloc_skb(). It will likely hit the
mailing lists after the merge window and the next net-next season, not
now. And it's not really connected with NAPI cache reusing.

>  	skb = __build_skb(data, len);
>  	if (unlikely(!skb)) {
> -		skb_free_frag(data);
> +		if (head_frag)
> +			skb_free_frag(data);
> +		else
> +			kfree(data);
>  		return NULL;
>  	}
>
> -	if (nc->page.pfmemalloc)
> -		skb->pfmemalloc = 1;
> -	skb->head_frag = 1;
> +	skb->pfmemalloc = pfmemalloc;
> +	skb->head_frag = head_frag;
>
> -skb_success:
>  	skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN);
>  	skb->dev = napi->dev;
> -
> -skb_fail:
>  	return skb;
> }
> EXPORT_SYMBOL(__napi_alloc_skb);

Al
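The pattern the diff above is built around is: pick a backing allocator
once, record the choice in a flag (skb->head_frag), and let a single
error path free with the matching deallocator. The following minimal
userspace C sketch models only that pattern; buf_alloc(), frag_alloc(),
frag_free(), and the size limits are hypothetical stand-ins for
__napi_alloc_skb(), page_frag_alloc(), skb_free_frag(), and
SKB_WITH_OVERHEAD(), not kernel APIs.

/*
 * Userspace model of the single-flag allocation pattern discussed
 * above. NOT kernel code: every name below is a hypothetical stand-in.
 */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define SMALL_LIMIT 1024	/* stand-in for SKB_WITH_OVERHEAD(1024) */
#define LARGE_LIMIT 4096	/* stand-in for SKB_WITH_OVERHEAD(PAGE_SIZE) */

struct buf {
	void *data;
	bool head_frag;		/* mirrors skb->head_frag: which allocator backed data */
};

/* stand-ins for page_frag_alloc()/skb_free_frag() */
static void *frag_alloc(size_t len) { return malloc(len); }
static void frag_free(void *p) { free(p); }

static struct buf *buf_alloc(size_t len)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return NULL;

	/* One branch picks the backend and records the choice. */
	if (len <= SMALL_LIMIT || len > LARGE_LIMIT) {
		b->data = malloc(len);		/* kmalloc_reserve() analogue */
		b->head_frag = false;
	} else {
		b->data = frag_alloc(len);	/* page_frag_alloc() analogue */
		b->head_frag = true;
	}

	if (!b->data) {
		free(b);
		return NULL;
	}
	return b;
}

static void buf_free(struct buf *b)
{
	/* A single exit path frees with the matching deallocator. */
	if (b->head_frag)
		frag_free(b->data);
	else
		free(b->data);
	free(b);
}

int main(void)
{
	struct buf *b = buf_alloc(2048);	/* falls in the frag-backed range */

	if (!b)
		return 1;
	printf("head_frag = %d\n", b->head_frag);
	buf_free(b);
	return 0;
}

The branch-count trade-off Alexander raises maps onto the b->head_frag
assignments: the flag write is the extra conditional the unified path
pays on every allocation, in exchange for dropping the early kmalloc
bail-out branch at the top of the function.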