Message-ID: <20250815075929.6a19662d@kernel.org>
Date: Fri, 15 Aug 2025 07:59:29 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Jesper Dangaard Brouer <hawk@...nel.org>
Cc: Dragos Tatulea <dtatulea@...dia.com>, Chris Arges
<carges@...udflare.com>, Jesse Brandeburg <jbrandeburg@...udflare.com>,
netdev@...r.kernel.org, bpf@...r.kernel.org, kernel-team
<kernel-team@...udflare.com>, tariqt@...dia.com, saeedm@...dia.com, Leon
Romanovsky <leon@...nel.org>, Andrew Lunn <andrew+netdev@...n.ch>, "David
S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Paolo
Abeni <pabeni@...hat.com>, Alexei Starovoitov <ast@...nel.org>, Daniel
Borkmann <daniel@...earbox.net>, John Fastabend <john.fastabend@...il.com>,
Simon Horman <horms@...nel.org>, Andrew Rzeznik <arzeznik@...udflare.com>,
Yan Zhai <yan@...udflare.com>, Kumar Kartikeya Dwivedi <memxor@...il.com>
Subject: Re: [BUG] mlx5_core memory management issue
On Thu, 14 Aug 2025 17:58:21 +0200 Jesper Dangaard Brouer wrote:
> Found-by: Dragos Tatulea <dtatulea@...dia.com>
ENOSUCHTAG?
> Reported-by: Chris Arges <carges@...udflare.com>
> >> The XDP code has evolved since the xdp_set_return_frame_no_direct()
> >> calls were added. Now page_pool keeps track of pp->napi and
> >> pool->cpuid. Maybe the __xdp_return [1] checks should be updated?
> >> (and maybe it allows us to remove the no_direct helpers).
> >>
> > So you mean to drop the napi_direct flag in __xdp_return and let
> > page_pool_put_unrefed_netmem() decide if direct should be used by
> > page_pool_napi_local()?
>
> Yes, something like that, but I would like Kuba/Jakub's input, as IIRC
> he introduced the page_pool->cpuid and page_pool->napi.
>
> There are some corner-cases we need to consider if they are valid. If
> cpumap get redirected to the *same* CPU as "previous" NAPI instance,
> which then makes page_pool->cpuid match, is it then still valid to do
> "direct" return(?).
I think/hope so, but it depends on xdp_return only being called from
softirq context. Since softirqs can't nest, if the producer and consumer
of the page pool pages are on the same CPU they can't race.
I'm slightly worried that drivers which don't have dedicated Tx XDP
rings will clean it up from hard IRQ when netpoll calls. But that'd
be a bug, right? We don't allow XDP processing from IRQ context.