Message-ID: <bd1a2f1f-20bb-4cc0-82ed-150c5e36b1da@kernel.org>
Date: Fri, 15 Aug 2025 18:02:10 +0200
From: Jesper Dangaard Brouer <hawk@...nel.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Dragos Tatulea <dtatulea@...dia.com>, Chris Arges
<carges@...udflare.com>, Jesse Brandeburg <jbrandeburg@...udflare.com>,
netdev@...r.kernel.org, bpf@...r.kernel.org,
kernel-team <kernel-team@...udflare.com>, tariqt@...dia.com,
saeedm@...dia.com, Leon Romanovsky <leon@...nel.org>,
Andrew Lunn <andrew+netdev@...n.ch>, "David S. Miller"
<davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Paolo Abeni <pabeni@...hat.com>, Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
John Fastabend <john.fastabend@...il.com>, Simon Horman <horms@...nel.org>,
Andrew Rzeznik <arzeznik@...udflare.com>, Yan Zhai <yan@...udflare.com>,
Kumar Kartikeya Dwivedi <memxor@...il.com>,
Mina Almasry <almasrymina@...gle.com>
Subject: Re: [BUG] mlx5_core memory management issue
On 15/08/2025 16.59, Jakub Kicinski wrote:
> On Thu, 14 Aug 2025 17:58:21 +0200 Jesper Dangaard Brouer wrote:
>> Found-by: Dragos Tatulea <dtatulea@...dia.com>
>
> ENOSUCHTAG?
>
I pre-checked that "Found-by:" has already been used 32 times in git
history. But don't worry, Martin applied it such that it isn't in the
tags section, by removing the ":" and placing it in the description part.
>> Reported-by: Chris Arges <carges@...udflare.com>
>
>>>> The XDP code has evolved since the xdp_set_return_frame_no_direct()
>>>> calls were added. Now page_pool keeps track of pp->napi and
>>>> pool->cpuid. Maybe the __xdp_return [1] checks should be updated?
>>>> (and maybe it allows us to remove the no_direct helpers).
>>>>
>>> So you mean to drop the napi_direct flag in __xdp_return and let
>>> page_pool_put_unrefed_netmem() decide, via page_pool_napi_local(),
>>> whether the direct return path should be used?
>>
>> Yes, something like that, but I would like Kuba/Jakub's input, as IIRC
>> he introduced page_pool->cpuid and page_pool->napi.
>>
>> There are some corner cases where we need to consider whether this is
>> still valid. If cpumap gets redirected to the *same* CPU as the
>> "previous" NAPI instance, which then makes page_pool->cpuid match, is
>> it then still valid to do a "direct" return?
>
> I think/hope so, but it depends on xdp_return only being called from
> softirq context. Since softirqs can't nest, if producer and consumer
> of the page pool pages are on the same CPU they can't race.
That is true, softirqs can't nest.
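My reading is that page_pool_napi_local() already encodes this same-CPU
reasoning. A rough sketch of that check, written from memory, so details
may differ from current upstream:

static bool page_pool_napi_local(const struct page_pool *pool)
{
	const struct napi_struct *napi;
	u32 cpuid;

	/* Direct recycling is only safe from softirq context */
	if (unlikely(!in_softirq()))
		return false;

	/* Safe when producer and consumer run on the same CPU */
	cpuid = smp_processor_id();
	if (READ_ONCE(pool->cpuid) == cpuid)
		return true;

	napi = READ_ONCE(pool->p.napi);
	return napi && READ_ONCE(napi->list_owner) == cpuid;
}

So pool->cpuid and napi->list_owner are the two things gating direct
recycling today, if I read it correctly.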
Jesse pointed me at the tun device driver, where we are, in principle,
missing an xdp_set_return_frame_no_direct() section. Except that I
believe the memory type cannot be page_pool in this driver. (Code hint:
tun_xdp_act() calls xdp_do_redirect().)
The tun driver made me realize that we do have users that don't run
under a softirq, although they do remember to disable BH. (IIRC,
BH-disable can nest.) Are we also race-safe in this case?
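For illustration, this is roughly the kind of section such a BH-disabled
(non-NAPI) caller would need, modeled on what cpumap does. The placement
in tun is hypothetical, not actual driver code:

	u32 act;

	local_bh_disable();
	xdp_set_return_frame_no_direct();

	act = bpf_prog_run_xdp(xdp_prog, &xdp);
	/* XDP_TX/XDP_REDIRECT handling here may end up in __xdp_return() */

	xdp_clear_return_frame_no_direct();
	local_bh_enable();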
Is the code change as simple as below or did I miss something?
void __xdp_return
[...]
	case MEM_TYPE_PAGE_POOL:
	[...]
		if (napi_direct && READ_ONCE(pool->cpuid) != smp_processor_id())
			napi_direct = false;
It is true that pool->cpuid becomes -1 when we exit NAPI, right?
Or was that only during shutdown?
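Spelled out a bit more, my rough reconstruction of how that would sit
next to the existing no_direct escape hatch (not a tested patch):

	case MEM_TYPE_PAGE_POOL:
		/* Existing check, set via xdp_set_return_frame_no_direct() */
		if (napi_direct && xdp_return_frame_no_direct())
			napi_direct = false;
		/* Proposed: only trust "direct" when running on the pool's CPU */
		if (napi_direct && READ_ONCE(pool->cpuid) != smp_processor_id())
			napi_direct = false;
		/* ... hand the page back to the page_pool, honouring napi_direct ... */
		break;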
> I'm slightly worried that drivers which don't have dedicated Tx XDP
> rings will clean it up from hard IRQ when netpoll calls. But that'd
> be a bug, right? We don't allow XDP processing from IRQ context.
I didn't consider this code path. But, yes, that would be considered a
netpoll bug IMHO.
--Jesper