[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <874jpk2qp2.fsf@kurt>
Date: Thu, 13 Apr 2023 09:20:57 +0200
From: Kurt Kanzenbach <kurt@...utronix.de>
To: Jacob Keller <jacob.e.keller@...el.com>,
Jesse Brandeburg <jesse.brandeburg@...el.com>,
Tony Nguyen <anthony.l.nguyen@...el.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Vinicius Costa Gomes <vinicius.gomes@...el.com>,
Ong Boon Leong <boon.leong.ong@...el.com>,
intel-wired-lan@...ts.osuosl.org, netdev@...r.kernel.org
Subject: Re: [PATCH net-next] igc: Avoid transmit queue timeout for XDP
On Wed Apr 12 2023, Jacob Keller wrote:
> On 4/12/2023 12:36 AM, Kurt Kanzenbach wrote:
>> High XDP load triggers the netdev watchdog:
>>
>> |NETDEV WATCHDOG: enp3s0 (igc): transmit queue 2 timed out
>>
>> The reason is the Tx queue transmission start (txq->trans_start) is not updated
>> in XDP code path. Therefore, add it for all XDP transmission functions.
>>
>> Signed-off-by: Kurt Kanzenbach <kurt@...utronix.de>
>
> For Intel, I only see this being done in igb, as 5337824f4dc4 ("net:
> annotate accesses to queue->trans_start"). I see a few other drivers
> also calling this.
>
> Is this a gap that other XDP implementations also need to fix?
>
> grepping for txq_trans_cond_update I see:
>
>> apm/xgene/xgene_enet_main.c
>> 874: txq_trans_cond_update(txq);
>>
>> engleder/tsnep_main.c
>> 623: txq_trans_cond_update(tx_nq);
>> 1660: txq_trans_cond_update(nq);
>>
>> freescale/dpaa/dpaa_eth.c
>> 2347: txq_trans_cond_update(txq);
>> 2553: txq_trans_cond_update(txq);
>>
>> ibm/ibmvnic.c
>> 2485: txq_trans_cond_update(txq);
>>
>> intel/igb/igb_main.c
>> 2980: txq_trans_cond_update(nq);
>> 3014: txq_trans_cond_update(nq);
>>
>> stmicro/stmmac/stmmac_main.c
>> 2428: txq_trans_cond_update(nq);
>> 4808: txq_trans_cond_update(nq);
>> 6436: txq_trans_cond_update(nq);
>>
>
> Is most driver's XDP implementation broken? There's also
> netif_trans_update but this is called out as a legacy only function. Far
> more drivers call this but I don't see either call or a direct update to
> trans_start in many XDP implementations...
>
> Am I missing something or are a bunch of other XDP implementations also
> wrong?
>
> The patch seems ok to me, assuming this is the correct way to fix things
> and not something in the XDP path.
AFAICT the netdev watchdog is only started when the device exposes
ndo_tx_timeout callback (see __netdev_watchdog_up()). For igc this
callback was introduced recently in 9b275176270e ("igc: Add
ndo_tx_timeout support"). My guess, as soon as the net device has
ndo_tx_timeout it needs to maintain trans_start for XDP?
Thanks,
Kurt
Download attachment "signature.asc" of type "application/pgp-signature" (862 bytes)
Powered by blists - more mailing lists