[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1809a34d-dcf4-4b54-089a-a7be3f4c23e1@intel.com>
Date: Wed, 12 Apr 2023 15:30:38 -0700
From: Jacob Keller <jacob.e.keller@...el.com>
To: Kurt Kanzenbach <kurt@...utronix.de>,
Jesse Brandeburg <jesse.brandeburg@...el.com>,
Tony Nguyen <anthony.l.nguyen@...el.com>
CC: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Vinicius Costa Gomes <vinicius.gomes@...el.com>,
"Ong Boon Leong" <boon.leong.ong@...el.com>,
<intel-wired-lan@...ts.osuosl.org>, <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next] igc: Avoid transmit queue timeout for XDP
On 4/12/2023 12:36 AM, Kurt Kanzenbach wrote:
> High XDP load triggers the netdev watchdog:
>
> |NETDEV WATCHDOG: enp3s0 (igc): transmit queue 2 timed out
>
> The reason is the Tx queue transmission start (txq->trans_start) is not updated
> in XDP code path. Therefore, add it for all XDP transmission functions.
>
> Signed-off-by: Kurt Kanzenbach <kurt@...utronix.de>
For Intel, I only see this being done in igb, as 5337824f4dc4 ("net:
annotate accesses to queue->trans_start"). I see a few other drivers
also calling this.
Is this a gap that other XDP implementations also need to fix?
grepping for txq_trans_cond_update I see:
> apm/xgene/xgene_enet_main.c
> 874: txq_trans_cond_update(txq);
>
> engleder/tsnep_main.c
> 623: txq_trans_cond_update(tx_nq);
> 1660: txq_trans_cond_update(nq);
>
> freescale/dpaa/dpaa_eth.c
> 2347: txq_trans_cond_update(txq);
> 2553: txq_trans_cond_update(txq);
>
> ibm/ibmvnic.c
> 2485: txq_trans_cond_update(txq);
>
> intel/igb/igb_main.c
> 2980: txq_trans_cond_update(nq);
> 3014: txq_trans_cond_update(nq);
>
> stmicro/stmmac/stmmac_main.c
> 2428: txq_trans_cond_update(nq);
> 4808: txq_trans_cond_update(nq);
> 6436: txq_trans_cond_update(nq);
>
Is most driver's XDP implementation broken? There's also
netif_trans_update but this is called out as a legacy only function. Far
more drivers call this but I don't see either call or a direct update to
trans_start in many XDP implementations...
Am I missing something or are a bunch of other XDP implementations also
wrong?
The patch seems ok to me, assuming this is the correct way to fix things
and not something in the XDP path.
> ---
> drivers/net/ethernet/intel/igc/igc_main.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
> index ba49728be919..e71e85e3bcc2 100644
> --- a/drivers/net/ethernet/intel/igc/igc_main.c
> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
> @@ -2384,6 +2384,8 @@ static int igc_xdp_xmit_back(struct igc_adapter *adapter, struct xdp_buff *xdp)
> nq = txring_txq(ring);
>
> __netif_tx_lock(nq, cpu);
> + /* Avoid transmit queue timeout since we share it with the slow path */
> + txq_trans_cond_update(nq);
> res = igc_xdp_init_tx_descriptor(ring, xdpf);
> __netif_tx_unlock(nq);
> return res;
> @@ -2786,6 +2788,9 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring)
>
> __netif_tx_lock(nq, cpu);
>
> + /* Avoid transmit queue timeout since we share it with the slow path */
> + txq_trans_cond_update(nq);
> +
> budget = igc_desc_unused(ring);
>
> while (xsk_tx_peek_desc(pool, &xdp_desc) && budget--) {
> @@ -6311,6 +6316,9 @@ static int igc_xdp_xmit(struct net_device *dev, int num_frames,
>
> __netif_tx_lock(nq, cpu);
>
> + /* Avoid transmit queue timeout since we share it with the slow path */
> + txq_trans_cond_update(nq);
> +
> drops = 0;
> for (i = 0; i < num_frames; i++) {
> int err;
Powered by blists - more mailing lists