[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f93a9ceccb0464bb4cc0a2c2be85c0a20e11b11d.camel@mellanox.com>
Date: Fri, 21 Dec 2018 02:26:13 +0000
From: Saeed Mahameed <saeedm@...lanox.com>
To: "jonathan.lemon@...il.com" <jonathan.lemon@...il.com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"brouer@...hat.com" <brouer@...hat.com>
Subject: Re: [PATCH net-next] net: Don't return pfmemalloc pages to the page
pool.
On Fri, 2018-12-21 at 02:13 +0000, Saeed Mahameed wrote:
> On Thu, 2018-12-20 at 16:56 -0800, Jonathan Lemon wrote:
> > On 20 Dec 2018, at 15:41, Saeed Mahameed wrote:
> >
> > > On Thu, 2018-12-20 at 14:11 -0800, Jonathan Lemon wrote:
> > > > (Resending due to mailer issues)
> > > >
> > > > On 20 Dec 2018, at 5:03, Jesper Dangaard Brouer wrote:
> > > >
> > > > > On Wed, 19 Dec 2018 12:06:51 -0800
> > > > > Jonathan Lemon <jonathan.lemon@...il.com> wrote:
> > > > >
> > > > > > Return pfmemalloc pages back to the page allocator, instead
> > > > > > of
> > > > > > holding them in the page pool.
> > > > >
> > > > > Have you experience this issue in practice or is it theory?
> > > >
> > > > We're seeing the mlx5 driver use pfmemalloc pages with 4.11,
> > > > and
> > > > then
> > > > return them
> > > > back to the page allocator. (it's triggering the
> > > > mlx5e_page_is_reserved() test).
> > > > The page pool code isn't in production use, but the code paths
> > > > appear
> > > > identical.
> > > >
> > > >
> > > > > > While here, also use the __page_pool_return_page() API.
> > > > >
> > > > > Don't combine several unrelated changed in one patch.
> > > >
> > > > Okay - will send as 2 separate patches
> > > >
> > > >
> > > > > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > > > > index 43a932cb609b..091007ff14a3 100644
> > > > > > --- a/net/core/page_pool.c
> > > > > > +++ b/net/core/page_pool.c
> > > > > > @@ -233,7 +233,7 @@ void __page_pool_put_page(struct
> > > > > > page_pool
> > > > > > *pool,
> > > > > > *
> > > > > > * refcnt == 1 means page_pool owns page, and can
> > > > > > recycle it.
> > > > > > */
> > > > > > - if (likely(page_ref_count(page) == 1)) {
> > > > > > + if (likely(page_ref_count(page) == 1 &&
> > > > > > !page_is_pfmemalloc(page)))
> > > > > > {
> > >
> > > I think this is wrong, if refcount is 1, then this page belongs
> > > to
> > > pagepool and you must enter this statement's true block, and test
> > > page_is_pfmemalloc inside (mark it unlikely),
> > > to return a pfmemalloc page, you need to call
> > > __page_pool_return_page()
> > > to dma_unmap and other cleanups if required.
> >
> > If the page belongs to the pool, but it isn't recyclable, the
> > fallback
> > path is "__page_pool_return_page()". If it doesn't belong to the
> > pool,
> > it goes to the non-XDP mode, which is also
> > __page_pool_return_page().
> >
>
> non-XDP mode doesn't explicitly call __page_pool_return_page() but it
> does exactly what __page_pool_return_page() is doing now, which is:
> __page_pool_clean_page(pool, page);
> put_page(page);
>
> your code is logically correct for now, but semantically speaking
> __page_pool_return_page might change in the future to have special
> handling of pages that actually belong only to the pagepool
> (refcount==1), so better be safe and handle such pages in the
> pagepool
> path and not in the non-XDP path
>
> all you need is:
> if(unlikely(!page_is_pfmemalloc(page))
> __page_pool_return_page()
>
Maybe something like:
@@ -236,6 +236,12 @@ void __page_pool_put_page(struct page_pool *pool,
if (likely(page_ref_count(page) == 1)) {
/* Read barrier done in page_ref_count / READ_ONCE */
+ /* Don't recycle emergency pages */
+ if(unlikely(page_is_pfmemalloc(page)) {
+ __page_pool_return_page(pool, page)
+ return;
+ }
+
** Not tested :) **
> early inside if (likely(page_ref_count(page) == 1)) pagepool
> recycling
> block; I don't think this will affect performance.
>
> > Does your dislike stem from the fact that the "non-XDP" mode is
> > taken
> > for the "refcount=1, pfmemalloc=T" case?
Powered by blists - more mailing lists