Message-ID: <YJvopUsZHcGb7q24@apalos.home>
Date: Wed, 12 May 2021 17:39:33 +0300
From: Ilias Apalodimas <ilias.apalodimas@...aro.org>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Eric Dumazet <eric.dumazet@...il.com>,
Matteo Croce <mcroce@...ux.microsoft.com>,
netdev <netdev@...r.kernel.org>, linux-mm <linux-mm@...ck.org>,
Ayush Sawal <ayush.sawal@...lsio.com>,
Vinay Kumar Yadav <vinay.yadav@...lsio.com>,
Rohit Maheshwari <rohitm@...lsio.com>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Thomas Petazzoni <thomas.petazzoni@...tlin.com>,
Marcin Wojtas <mw@...ihalf.com>,
Russell King <linux@...linux.org.uk>,
Mirko Lindner <mlindner@...vell.com>,
Stephen Hemminger <stephen@...workplumber.org>,
Tariq Toukan <tariqt@...dia.com>,
Jesper Dangaard Brouer <hawk@...nel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
John Fastabend <john.fastabend@...il.com>,
Boris Pismenny <borisp@...dia.com>,
Arnd Bergmann <arnd@...db.de>,
Andrew Morton <akpm@...ux-foundation.org>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Vlastimil Babka <vbabka@...e.cz>, Yu Zhao <yuzhao@...gle.com>,
Will Deacon <will@...nel.org>,
Michel Lespinasse <walken@...gle.com>,
Fenghua Yu <fenghua.yu@...el.com>,
Roman Gushchin <guro@...com>, Hugh Dickins <hughd@...gle.com>,
Peter Xu <peterx@...hat.com>, Jason Gunthorpe <jgg@...pe.ca>,
Jonathan Lemon <jonathan.lemon@...il.com>,
Alexander Lobakin <alobakin@...me>,
Cong Wang <cong.wang@...edance.com>, wenxu <wenxu@...oud.cn>,
Kevin Hao <haokexin@...il.com>,
Jakub Sitnicki <jakub@...udflare.com>,
Marco Elver <elver@...gle.com>,
Willem de Bruijn <willemb@...gle.com>,
Miaohe Lin <linmiaohe@...wei.com>,
Yunsheng Lin <linyunsheng@...wei.com>,
Guillaume Nault <gnault@...hat.com>,
LKML <linux-kernel@...r.kernel.org>,
linux-rdma <linux-rdma@...r.kernel.org>,
bpf <bpf@...r.kernel.org>, Matthew Wilcox <willy@...radead.org>,
David Ahern <dsahern@...il.com>,
Lorenzo Bianconi <lorenzo@...nel.org>,
Saeed Mahameed <saeedm@...dia.com>,
Andrew Lunn <andrew@...n.ch>, Paolo Abeni <pabeni@...hat.com>,
Sven Auhagen <sven.auhagen@...eatech.de>
Subject: Re: [PATCH net-next v4 2/4] page_pool: Allow drivers to hint on SKB
recycling
Hi Eric,
[...]
> > > > + if (skb->pp_recycle && page_pool_return_skb_page(head))
> > >
> > > This probably should be attempted only in the (skb->head_frag) case ?
> >
> > I think the extra check makes sense.
>
> What do you mean here ?
>
I thought you wanted an extra check in the if statement above, i.e. move the
recycling block under the existing skb->head_frag check. Something like:
	if (skb->head_frag) {
#ifdef CONFIG_PAGE_POOL
		if (skb->pp_recycle && page_pool_return_skb_page(head))
			return;
#endif
		skb_free_frag(head);
	} else {
	.....
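
For anyone who wants to poke at the ordering of the checks, the control
flow above can be modelled in userspace roughly as follows. This is only a
sketch: struct skb_model, enum free_path, model_return_skb_page() and
model_free_head() are hypothetical names made up for the model, the
head_is_pool_page flag stands in for whatever the signature check would
find, and the else branch stays a placeholder since it is elided above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome labels, used only by this model. */
enum free_path {
	PATH_RECYCLED,	/* head page went back to its page_pool */
	PATH_FREE_FRAG,	/* stands in for skb_free_frag(head) */
	PATH_OTHER	/* stands in for the elided else branch */
};

/* Toy stand-in for the skb state the snippet tests (not the real sk_buff). */
struct skb_model {
	bool head_frag;
	bool pp_recycle;
	bool head_is_pool_page;	/* assumption: result of the signature check */
};

/* Stand-in for page_pool_return_skb_page(): succeeds only for pool pages. */
static bool model_return_skb_page(const struct skb_model *skb)
{
	return skb->head_is_pool_page;
}

/* Mirrors the proposed ordering: recycling is tried only under head_frag. */
static enum free_path model_free_head(const struct skb_model *skb)
{
	if (skb->head_frag) {
		if (skb->pp_recycle && model_return_skb_page(skb))
			return PATH_RECYCLED;
		return PATH_FREE_FRAG;
	}
	return PATH_OTHER;
}
```

The point of the model is that a head which is not a page fragment never
reaches the recycling attempt, whatever pp_recycle says.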
> >
> > >
> > > Also this patch misses pskb_expand_head()
> >
> > I am not sure I am following. Misses what? pskb_expand_head() will either
> > call skb_release_data() or skb_free_head(), which would either recycle or
> > unmap the buffer for us (depending on the page refcnt)
>
> pskb_expand_head() allocates a new skb->head, from slab.
>
> We should clear skb->pp_recycle for consistency of the skb->head_frag
> clearing we perform there.
Ah right, good catch. I was mostly worried we are not freeing/unmapping
buffers and I completely missed that. I think nothing bad will happen even
if we don't, since the signature will eventually protect us, but it's
definitely the right thing to do.
>
> But then, I now realize you use skb->pp_recycle bit for both skb->head
> and fragments,
> and rely on this PP_SIGNATURE thing (I note that patch 1 changelog
> does not describe why a random page will _not_ have this signature by
> bad luck)
Correct. I've tried to explain in the previous posting as well, but that's
the big difference compared to the initial RFC we sent a few years ago (the
ability to recycle frags as well).
>
> Please document/describe which struct page fields are aliased with
> page->signature ?
>
Sure, any preference on this? Right above page_pool_return_skb_page() ?
Keep in mind the current [1/4] patch is wrong, since it will overlap
pp_signature with mapping. So we'll have interesting results if a page
gets mapped to userspace :).
What Matthew proposed makes sense, we can add something along the lines of:
+ unsigned long pp_magic;
+ struct page_pool *pp;
+ unsigned long _pp_mapping_pad;
+ unsigned long dma_addr[2];
in struct page. In this case page->mapping aliases page->_pp_mapping_pad.
The first word (that we'll now be using) is used for a pointer or a
compound_head. So as long as pp_magic doesn't resemble a pointer and has
bits 0/1 set to 0 we should be safe.
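
To make the aliasing concrete, here is a rough userspace sketch of the
layout described above. It is not the kernel's struct page: struct
page_model, the "generic" view and its member names, and
pp_magic_low_bits_clear() are all hypothetical and exist only for this
illustration. It also assumes unsigned long is pointer-sized, as it is on
Linux. The helper checks only the bits 0/1 condition; whether a value
"resembles a pointer" is not something it tries to decide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct page_pool;	/* opaque, only used as a pointer type here */

/* Simplified, hypothetical model of the first words of struct page. */
struct page_model {
	unsigned long flags;
	union {
		struct {	/* generic view of the same words */
			unsigned long first_word;	/* pointer or compound_head */
			unsigned long second_word;
			void *mapping;
			unsigned long index;
			unsigned long private_data;
		} generic;
		struct {	/* page_pool view, fields as proposed above */
			unsigned long pp_magic;
			struct page_pool *pp;
			unsigned long _pp_mapping_pad;
			unsigned long dma_addr[2];
		} pool;
	} u;
};

/* The pad, not pp_magic, must be the word that overlaps mapping. */
static_assert(offsetof(struct page_model, u.pool._pp_mapping_pad) ==
	      offsetof(struct page_model, u.generic.mapping),
	      "_pp_mapping_pad must alias mapping");

/* pp_magic sits on the first word (pointer or compound_head). */
static_assert(offsetof(struct page_model, u.pool.pp_magic) ==
	      offsetof(struct page_model, u.generic.first_word),
	      "pp_magic must alias the first word");

/* True when bits 0 and 1 of a candidate magic value are both zero. */
static bool pp_magic_low_bits_clear(unsigned long magic)
{
	return (magic & 0x3UL) == 0;
}
```

With this layout a page mapped to userspace would only ever write over the
pad word, never over the signature.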
Thanks!
/Ilias