Message-ID:
<DM8PR12MB5447837576EA58F490D6D4BFAD052@DM8PR12MB5447.namprd12.prod.outlook.com>
Date: Wed, 18 Dec 2024 11:22:33 +0000
From: Alex Lazar <alazar@...dia.com>
To: "jdamato@...tly.com" <jdamato@...tly.com>
CC: "aleksander.lobakin@...el.com" <aleksander.lobakin@...el.com>,
"almasrymina@...gle.com" <almasrymina@...gle.com>,
"amritha.nambiar@...el.com" <amritha.nambiar@...el.com>,
"bigeasy@...utronix.de" <bigeasy@...utronix.de>, "bjorn@...osinc.com"
<bjorn@...osinc.com>, "corbet@....net" <corbet@....net>, Dan Jurgens
<danielj@...dia.com>, "davem@...emloft.net" <davem@...emloft.net>,
"donald.hunter@...il.com" <donald.hunter@...il.com>, "dsahern@...nel.org"
<dsahern@...nel.org>, "edumazet@...gle.com" <edumazet@...gle.com>,
"hawk@...nel.org" <hawk@...nel.org>, "jiri@...nulli.us" <jiri@...nulli.us>,
"johannes.berg@...el.com" <johannes.berg@...el.com>, "kuba@...nel.org"
<kuba@...nel.org>, "leitao@...ian.org" <leitao@...ian.org>, "leon@...nel.org"
<leon@...nel.org>, "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"lorenzo@...nel.org" <lorenzo@...nel.org>, "michael.chan@...adcom.com"
<michael.chan@...adcom.com>, "mkarsten@...terloo.ca" <mkarsten@...terloo.ca>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>, "pabeni@...hat.com"
<pabeni@...hat.com>, Saeed Mahameed <saeedm@...dia.com>, "sdf@...ichev.me"
<sdf@...ichev.me>, "skhawaja@...gle.com" <skhawaja@...gle.com>,
"sridhar.samudrala@...el.com" <sridhar.samudrala@...el.com>, Tariq Toukan
<tariqt@...dia.com>, "willemdebruijn.kernel@...il.com"
<willemdebruijn.kernel@...il.com>, "xuanzhuo@...ux.alibaba.com"
<xuanzhuo@...ux.alibaba.com>, Gal Pressman <gal@...dia.com>, Nimrod Oren
<noren@...dia.com>, Dror Tennenbaum <drort@...dia.com>, Dragos Tatulea
<dtatulea@...dia.com>
Subject: Re: [net-next v6 0/9] Add support for per-NAPI config via netlink
Hi Joe and all,
I am part of the NVIDIA Eth drivers team, and we are experiencing a problem
that we bisected to this change: commit 86e25f40aa1e ("net: napi: Add napi_config")
The issue occurs when sending packets from one machine to another.
On the receiver side, we have an XSK (XDPsock) that receives each packet and
sends it back to the sender.
At some point, one packet (packet A) gets "stuck" and is not processed until we
send a new packet (packet B), which "pushes" the previous one. Packet A is then
processed by the NAPI poll, packet B gets stuck in its place, and so on.
Your change moves napi_hash_add() from netif_napi_add_weight() to napi_enable(),
and napi_hash_del() from netif_napi_del() to napi_disable().
If I move them back to netif_napi_add_weight() and netif_napi_del(),
the issue is resolved (I moved the entire if/else block, not just the
napi_hash_add/del calls).
The issue occurs with both the old and new APIs (netif_napi_add() and
netif_napi_add_config()), and moving napi_hash_add() and napi_hash_del() back
resolves it for both.
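To make the placement difference concrete, here is a small userspace toy model.
It is not kernel code: the toy_* names and the struct are made up for
illustration, and it tracks only whether a NAPI instance is present in the NAPI
hash at each point of its lifecycle under the two placements described above.

```c
#include <stdbool.h>

/* Toy userspace model (hypothetical names, not kernel code). */
enum toy_placement {
	/* hash add/del done in netif_napi_add_weight()/netif_napi_del() */
	TOY_HASH_IN_ADD_DEL,
	/* hash add/del done in napi_enable()/napi_disable() */
	TOY_HASH_IN_ENABLE_DISABLE,
};

struct toy_napi {
	enum toy_placement placement;
	bool enabled;
	bool hashed;	/* is this instance currently in the NAPI hash? */
};

/* Models netif_napi_add_weight(): hashed here only in the old placement. */
static void toy_napi_add(struct toy_napi *n, enum toy_placement p)
{
	n->placement = p;
	n->enabled = false;
	n->hashed = (p == TOY_HASH_IN_ADD_DEL);
}

/* Models napi_enable(): hashed here only in the new placement. */
static void toy_napi_enable(struct toy_napi *n)
{
	n->enabled = true;
	if (n->placement == TOY_HASH_IN_ENABLE_DISABLE)
		n->hashed = true;
}

/* Models napi_disable(): unhashed here only in the new placement. */
static void toy_napi_disable(struct toy_napi *n)
{
	n->enabled = false;
	if (n->placement == TOY_HASH_IN_ENABLE_DISABLE)
		n->hashed = false;
}

/* Models netif_napi_del(): after deletion the instance is never hashed. */
static void toy_napi_del(struct toy_napi *n)
{
	n->enabled = false;
	n->hashed = false;
}
```

In this model the only difference between the two placements is that a NAPI
which has been added but is currently disabled stays in the hash with the old
placement and is absent from it with the new one. Whether that window is what
causes the stuck packet is not something I have established.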
I am still debugging this, with no breakthrough so far.
I would appreciate it if you could look into this.
We can provide more details on request.
Regards,
Alexei Lazar