Message-ID: <20191110085939.23013f83@carbon>
Date: Sun, 10 Nov 2019 08:59:39 +0100
From: Jesper Dangaard Brouer <brouer@...hat.com>
To: "Jonathan Lemon" <jonathan.lemon@...il.com>
Cc: "Toke Høiland-Jørgensen" <toke@...hat.com>,
netdev@...r.kernel.org,
"Ilias Apalodimas" <ilias.apalodimas@...aro.org>,
"Saeed Mahameed" <saeedm@...lanox.com>,
"Matteo Croce" <mcroce@...hat.com>,
"Lorenzo Bianconi" <lorenzo@...nel.org>,
"Tariq Toukan" <tariqt@...lanox.com>, brouer@...hat.com
Subject: Re: [net-next v1 PATCH 1/2] xdp: revert forced mem allocator
removal for page_pool
On Sat, 09 Nov 2019 09:34:50 -0800
"Jonathan Lemon" <jonathan.lemon@...il.com> wrote:
> On 9 Nov 2019, at 8:11, Jesper Dangaard Brouer wrote:
>
> > On Fri, 08 Nov 2019 11:16:43 -0800
> > "Jonathan Lemon" <jonathan.lemon@...il.com> wrote:
> >
> >>> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> >>> index 5bc65587f1c4..226f2eb30418 100644
> >>> --- a/net/core/page_pool.c
> >>> +++ b/net/core/page_pool.c
> >>> @@ -346,7 +346,7 @@ static void __warn_in_flight(struct page_pool *pool)
> >>>
> >>> distance = _distance(hold_cnt, release_cnt);
> >>>
> >>> - /* Drivers should fix this, but only problematic when DMA is used */
> >>> + /* BUG but warn as kernel should crash later */
> >>> WARN(1, "Still in-flight pages:%d hold:%u released:%u",
> >>> distance, hold_cnt, release_cnt);
> >
> > Because this is kept as a WARN, I set pool->ring.queue = NULL later.
>
> ... which is also an API violation, reaching into the ring internals.
> I strongly dislike this.
I understand your dislike of reaching into ptr_ring "internals".
My plan was to add this here first and then, in a follow-up patch, move
the pool->ring.queue = NULL assignment into ptr_ring itself.
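To make the intended follow-up concrete, here is a minimal userspace sketch. It is not the kernel's ptr_ring code: `struct mock_ring`, `mock_ring_init()` and `mock_ring_cleanup()` are hypothetical stand-ins. The only point is that the cleanup helper clears `queue` itself, so a caller such as page_pool would not have to reach into the ring.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a ptr_ring-like structure (not the kernel's). */
struct mock_ring {
	void **queue;	/* slot array, NULL once the ring is torn down */
	int size;
};

/* Allocate 'size' empty slots; returns 0 on success, -1 on failure. */
static int mock_ring_init(struct mock_ring *r, int size)
{
	r->queue = calloc((size_t)size, sizeof(*r->queue));
	if (!r->queue)
		return -1;
	r->size = size;
	return 0;
}

/*
 * Free the slot array and clear the pointer inside the helper, so a
 * later use-after-free dereferences NULL instead of stale memory.
 */
static void mock_ring_cleanup(struct mock_ring *r)
{
	free(r->queue);
	r->queue = NULL;
	r->size = 0;
}
```

With the assignment inside the helper, the "crash on use-after-free" behaviour would apply to every user of the ring, not only to page_pool.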
> >>> }
> >>> @@ -360,12 +360,16 @@ void __page_pool_free(struct page_pool *pool)
> >>> WARN(pool->alloc.count, "API usage violation");
> >>> WARN(!ptr_ring_empty(&pool->ring), "ptr_ring is not empty");
> >>>
> >>> - /* Can happen due to forced shutdown */
> >>> if (!__page_pool_safe_to_destroy(pool))
> >>> __warn_in_flight(pool);
> >>
> >> If it's not safe to destroy, we shouldn't be getting here.
> >
> > Don't make such assumptions. The API is going to be used by driver
> > developers, and they are always a little too creative...
>
> If the driver hits this case, the driver has a bug, and it isn't
> safe to continue in any fashion. The developer needs to fix their
> driver in that case. (see stmmac code)
The stmmac driver is NOT broken; it simply uses page_pool as its
driver-level page cache. That is exactly what page_pool was designed
for: a generic page cache for drivers to use. stmmac uses it to
simplify the driver and does not use the advanced features, which
require hooking into the memory model registration.
>
> > The page_pool is a separate facility; it is not tied to the
> > xdp_rxq_info memory model. Some drivers use page_pool directly, e.g.
> > drivers/net/ethernet/stmicro/stmmac. This case can easily be
> > triggered when someone extends that driver.
>
> Yes, and I pointed out that the mem_info should likely be completely
> detached from xdp.c since it really has nothing to do with XDP.
> The stmmac driver is actually broken at the moment, as it tries to
> free the pool immediately without a timeout.
>
> What should be happening is that drivers just call page_pool_destroy(),
> which kicks off the shutdown process if this was the last user ref,
> and delays destruction if packets are in flight.
Sorry, but I'm getting frustrated with you. I've already explained to
you (off-list) that the memory model reg/unreg system was created to
support multiple memory models (even per RX-queue). We already have
AF_XDP zero copy, and I want to keep that flexibility so we can add
more in the future.
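As an illustration of what "multiple memory models, even per RX-queue" means, here is a small userspace sketch. All names (`mock_mem_type`, `mock_rxq`, `mock_rxq_reg_mem_model()` and so on) are hypothetical and only loosely modeled on the xdp_rxq_info registration idea; this is not the kernel API. Each queue registers its own model, and returning a frame dispatches on the model of the queue it came from.

```c
#include <stddef.h>

/* Hypothetical memory model IDs (assumption, not the kernel's enum). */
enum mock_mem_type {
	MOCK_MEM_NONE = 0,
	MOCK_MEM_PAGE_POOL,
	MOCK_MEM_ZERO_COPY,
};

/* Hypothetical per-RX-queue info: each queue carries its own model. */
struct mock_rxq {
	enum mock_mem_type type;
	void *allocator;	/* opaque allocator owned by the driver */
};

/* Register a memory model for one queue; -1 if one is already set. */
static int mock_rxq_reg_mem_model(struct mock_rxq *q,
				  enum mock_mem_type type, void *allocator)
{
	if (q->type != MOCK_MEM_NONE)
		return -1;
	q->type = type;
	q->allocator = allocator;
	return 0;
}

/* Unregister, leaving the queue free to take a different model. */
static void mock_rxq_unreg_mem_model(struct mock_rxq *q)
{
	q->type = MOCK_MEM_NONE;
	q->allocator = NULL;
}

/* Frame return dispatches on the model of the originating queue. */
static const char *mock_return_path(const struct mock_rxq *q)
{
	switch (q->type) {
	case MOCK_MEM_PAGE_POOL:
		return "recycle into page_pool";
	case MOCK_MEM_ZERO_COPY:
		return "hand back to zero-copy allocator";
	default:
		return "no model registered";
	}
}
```

In this sketch, two queues on the same device can use different models at the same time, which is the flexibility a single hard-wired allocator would remove.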
> >>> ptr_ring_cleanup(&pool->ring, NULL);
> >>>
> >>> + /* Make sure kernel will crash on use-after-free */
> >>> + pool->ring.queue = NULL;
> >>> + pool->alloc.cache[PP_ALLOC_CACHE_SIZE - 1] = NULL;
> >>> + pool->alloc.count = PP_ALLOC_CACHE_SIZE;
> >>
> >> The pool is going to be freed. This is useless code; if we're
> >> really concerned about use-after-free, the correct place for catching
> >> this is with the memory-allocator tools, not scattering things like
> >> this ad-hoc over the codebase.
> >
> > No, I need this code here, because we kept the above WARN() and didn't
> > change that into a BUG(). It is obviously not a full solution for
> > use-after-free detection. The memory subsystem has kmemleak to catch
> > this kind of stuff, but nobody runs this in production. I need this
> > here to catch some obvious runtime cases.
>
> The WARN() indicates something went off the rails already. I really
> don't like half-assed solutions like the above; it may or may not work
> properly. If it doesn't work properly, then what's the point?
So, you are suggesting to use BUG_ON() instead and crash the kernel
immediately... you do know Linus hates when we do that, right?
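The trade-off can be sketched in userspace: warn and keep running, but poison the allocation cache so that a later stale access fails at the first touch. Everything here is a hypothetical mock (`MOCK_CACHE_SIZE` stands in for PP_ALLOC_CACHE_SIZE, and `mock_warn()` only imitates WARN() by printing); it mirrors the two poisoning assignments in the quoted hunk as I read them, not the actual page_pool implementation.

```c
#include <stddef.h>
#include <stdio.h>

#define MOCK_CACHE_SIZE 4	/* hypothetical; stands in for PP_ALLOC_CACHE_SIZE */

/* Hypothetical stand-in for the pool's small allocation cache. */
struct mock_cache {
	int count;
	void *cache[MOCK_CACHE_SIZE];
};

/* WARN-like: report the problem and keep running (unlike BUG). */
static int mock_warn(int cond, const char *msg)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", msg);
	return cond;
}

/* Pop from the top of the cache; NULL when the cache is empty. */
static void *mock_cache_get(struct mock_cache *c)
{
	if (c->count == 0)
		return NULL;
	return c->cache[--c->count];
}

/* Push onto the cache; -1 when the cache is full. */
static int mock_cache_put(struct mock_cache *c, void *page)
{
	if (c->count == MOCK_CACHE_SIZE)
		return -1;
	c->cache[c->count++] = page;
	return 0;
}

/*
 * Teardown: warn if pages are still cached, then poison. Marking the
 * cache "full" with a NULL top slot makes a stale put fail and a stale
 * get return NULL, so misuse shows up at the first access.
 */
static void mock_cache_destroy(struct mock_cache *c)
{
	mock_warn(c->count != 0, "cache not empty at destroy");
	c->cache[MOCK_CACHE_SIZE - 1] = NULL;
	c->count = MOCK_CACHE_SIZE;
}
```

In the kernel, the caller would dereference that NULL and oops at the point of misuse, while a system that never touches the pool again keeps running after the warning.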
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer