lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YJTm4uhvqCy2lJH8@apalos.home>
Date:   Fri, 7 May 2021 10:06:10 +0300
From:   Ilias Apalodimas <ilias.apalodimas@...aro.org>
To:     Yunsheng Lin <linyunsheng@...wei.com>
Cc:     Matteo Croce <mcroce@...ux.microsoft.com>, netdev@...r.kernel.org,
        linux-mm@...ck.org, Ayush Sawal <ayush.sawal@...lsio.com>,
        Vinay Kumar Yadav <vinay.yadav@...lsio.com>,
        Rohit Maheshwari <rohitm@...lsio.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Thomas Petazzoni <thomas.petazzoni@...tlin.com>,
        Marcin Wojtas <mw@...ihalf.com>,
        Russell King <linux@...linux.org.uk>,
        Mirko Lindner <mlindner@...vell.com>,
        Stephen Hemminger <stephen@...workplumber.org>,
        Tariq Toukan <tariqt@...dia.com>,
        Jesper Dangaard Brouer <hawk@...nel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        John Fastabend <john.fastabend@...il.com>,
        Boris Pismenny <borisp@...dia.com>,
        Arnd Bergmann <arnd@...db.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        Vlastimil Babka <vbabka@...e.cz>, Yu Zhao <yuzhao@...gle.com>,
        Will Deacon <will@...nel.org>,
        Michel Lespinasse <walken@...gle.com>,
        Fenghua Yu <fenghua.yu@...el.com>,
        Roman Gushchin <guro@...com>, Hugh Dickins <hughd@...gle.com>,
        Peter Xu <peterx@...hat.com>, Jason Gunthorpe <jgg@...pe.ca>,
        Guoqing Jiang <guoqing.jiang@...ud.ionos.com>,
        Jonathan Lemon <jonathan.lemon@...il.com>,
        Alexander Lobakin <alobakin@...me>,
        Cong Wang <cong.wang@...edance.com>, wenxu <wenxu@...oud.cn>,
        Kevin Hao <haokexin@...il.com>,
        Aleksandr Nogikh <nogikh@...gle.com>,
        Jakub Sitnicki <jakub@...udflare.com>,
        Marco Elver <elver@...gle.com>,
        Willem de Bruijn <willemb@...gle.com>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Guillaume Nault <gnault@...hat.com>,
        linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org,
        bpf@...r.kernel.org, Matthew Wilcox <willy@...radead.org>,
        Eric Dumazet <edumazet@...gle.com>,
        David Ahern <dsahern@...il.com>,
        Lorenzo Bianconi <lorenzo@...nel.org>,
        Saeed Mahameed <saeedm@...dia.com>,
        Andrew Lunn <andrew@...n.ch>, Paolo Abeni <pabeni@...hat.com>
Subject: Re: [PATCH net-next v3 0/5] page_pool: recycle buffers

On Fri, May 07, 2021 at 11:23:28AM +0800, Yunsheng Lin wrote:
> On 2021/5/6 20:58, Ilias Apalodimas wrote:
> >>>>
> >>>
> >>> Not really, the opposite is happening here. If the pp_recycle bit is set we
> >>> will always call page_pool_return_skb_page().  If the page signature matches
> >>> the 'magic' set by page pool we will always call xdp_return_skb_frame() will
> >>> end up calling __page_pool_put_page(). If the refcnt is 1 we'll try
> >>> to recycle the page.  If it's not we'll release it from page_pool (releasing
> >>> some internal references we keep) unmap the buffer and decrement the refcnt.
> >>
> >> Yes, I understood the above is what the page pool do now.
> >>
> >> But the question is who is still holding an extral reference to the page when
> >> kfree_skb()? Perhaps a cloned and pskb_expand_head()'ed skb is holding an extral
> >> reference to the same page? So why not just do a page_ref_dec() if the orginal skb
> >> is freed first, and call __page_pool_put_page() when the cloned skb is freed later?
> >> So that we can always reuse the recyclable page from a recyclable skb. This may
> >> make the page_pool_destroy() process delays longer than before, I am supposed the
> >> page_pool_destroy() delaying for cloned skb case does not really matters here.
> >>
> >> If the above works, I think the samiliar handling can be added to RX zerocopy if
> >> the RX zerocopy also hold extral references to the recyclable page from a recyclable
> >> skb too?
> >>
> > 
> > Right, this sounds doable, but I'll have to go back code it and see if it
> > really makes sense.  However I'd still prefer the support to go in as-is
> > (including the struct xdp_mem_info in struct page, instead of a page_pool
> > pointer).
> > 
> > There's a couple of reasons for that.  If we keep the struct xdp_mem_info we
> > can in the future recycle different kind of buffers using __xdp_return().
> > And this is a non intrusive change if we choose to store the page pool address
> > directly in the future.  It just affects the internal contract between the
> > page_pool code and struct page.  So it won't affect any drivers that already
> > use the feature.
> 
> This patchset has embeded a signature field in "struct page", and xdp_mem_info
> is stored in page_private(), which seems not considering the case for associating
> the page pool with "struct page" directly yet? 

Correct

> Is the page pool also stored in
> page_private() and a different signature is used to indicate that?

No only struct xdp_mem_info as you mentioned before

> 
> I am not saying we have to do it in this patchset, but we have to consider it
> while we are adding new signature field to "struct page", right?

We won't need a new signature.  The signature in both cases is there to 
guarantee the page you are trying to recycle was indeed allocated by page_pool.

Basically we got two design choices here: 
- We store the page_pool ptr address directly in page->private and then,
  we call into page_pool APIs directly to do the recycling.
  That would eliminate the lookup through xdp_mem_info and the
  XDP helpers to locate page pool pointer (through __xdp_return).
- You store the xdp_mem_info on page_private.  In that case you need to go
  through __xdp_return()  to locate the page_pool pointer. Although we might
  loose some performance that would allow us to recycle additional memory types
  and not only MEM_TYPE_PAGE_POOL (in case we ever need it).


I think both choices are sane.  What I am trying to explain here, is
regardless of what we choose now, we can change it in the future without
affecting the API consumers at all.  What will change internally is the way we
lookup the page pool pointer we are trying to recycle.

[...]


Cheers
/Ilias

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ