Message-ID: <82bcd959-571e-42ce-b341-cbfa19f9f86d@linux.microsoft.com>
Date: Wed, 5 Nov 2025 22:10:23 +0530
From: Aditya Garg <gargaditya@...ux.microsoft.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: kys@...rosoft.com, haiyangz@...rosoft.com, wei.liu@...nel.org,
decui@...rosoft.com, andrew+netdev@...n.ch, davem@...emloft.net,
edumazet@...gle.com, pabeni@...hat.com, longli@...rosoft.com,
kotaranov@...rosoft.com, horms@...nel.org, shradhagupta@...ux.microsoft.com,
ssengar@...ux.microsoft.com, ernis@...ux.microsoft.com,
dipayanroy@...ux.microsoft.com, shirazsaleem@...rosoft.com,
linux-hyperv@...r.kernel.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org,
gargaditya@...rosoft.com
Subject: Re: [PATCH net-next v2] net: mana: Handle SKB if TX SGEs exceed
hardware limit
On 01-11-2025 04:56, Jakub Kicinski wrote:
> On Wed, 29 Oct 2025 06:12:35 -0700 Aditya Garg wrote:
>> @@ -289,6 +290,21 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev)
>> cq = &apc->tx_qp[txq_idx].tx_cq;
>> tx_stats = &txq->stats;
>>
>> + if (MAX_SKB_FRAGS + 2 > MAX_TX_WQE_SGL_ENTRIES &&
>> + skb_shinfo(skb)->nr_frags + 2 > MAX_TX_WQE_SGL_ENTRIES) {
>> + /* GSO skb with Hardware SGE limit exceeded is not expected here
>> + * as they are handled in mana_features_check() callback
>> + */
>> + if (skb_is_gso(skb))
>> + netdev_warn_once(ndev, "GSO enabled skb exceeds max SGE limit\n");
>
> This could be the same question Simon asked but why do you think you
> need this line? Sure you need to linearize non-GSO but why do you care
> to warn specifically about GSO?! Looks like defensive programming or
> testing leftover..
>
Hi Jakub,
Agreed, the GSO-specific warning is redundant. I'll drop it in the next
revision.
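With that warning removed, the check reduces to a plain SGE-count test
followed by skb_linearize(). A minimal user-space sketch of that condition
follows. It is not the driver code: the two constant values are assumptions
chosen for illustration, and needs_linearize() is a hypothetical helper name.

```c
#include <stdbool.h>

/* Assumed values for illustration only. In the kernel, MAX_SKB_FRAGS comes
 * from the build configuration and MAX_TX_WQE_SGL_ENTRIES from the MANA
 * headers.
 */
#define MAX_SKB_FRAGS 45
#define MAX_TX_WQE_SGL_ENTRIES 30

/* Hypothetical helper: returns true when an skb with nr_frags fragments
 * would need more SGEs than the hardware limit. The "+ 2" mirrors the
 * condition in the quoted patch. The first comparison only involves
 * compile-time constants, so the whole test folds to false on
 * configurations where the limit can never be exceeded.
 */
static bool needs_linearize(unsigned int nr_frags)
{
	return MAX_SKB_FRAGS + 2 > MAX_TX_WQE_SGL_ENTRIES &&
	       nr_frags + 2 > MAX_TX_WQE_SGL_ENTRIES;
}
```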
>> + if (skb_linearize(skb)) {
>> + netdev_warn_once(ndev, "Failed to linearize skb with nr_frags=%d and is_gso=%d\n",
>> + skb_shinfo(skb)->nr_frags,
>> + skb_is_gso(skb));
>
> .. in practice including is_gso() here as you do is probably enough for
> debug
>
Ok
>> + goto tx_drop_count;
>> + }
>> + }
>> +
>> pkg.tx_oob.s_oob.vcq_num = cq->gdma_id;
>> pkg.tx_oob.s_oob.vsq_frame = txq->vsq_frame;
>>
>> @@ -402,8 +418,6 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev)
>> }
>> }
>>
>> - WARN_ON_ONCE(pkg.wqe_req.num_sge > MAX_TX_WQE_SGL_ENTRIES);
>> -
>> if (pkg.wqe_req.num_sge <= ARRAY_SIZE(pkg.sgl_array)) {
>> pkg.wqe_req.sgl = pkg.sgl_array;
>> } else {
>> @@ -438,9 +452,13 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev)
>>
>> if (err) {
>> (void)skb_dequeue_tail(&txq->pending_skbs);
>> + mana_unmap_skb(skb, apc);
>> netdev_warn(ndev, "Failed to post TX OOB: %d\n", err);
>
> You have a print right here and in the callee. This condition must
> (almost) never happen in practice. It's likely fine to just drop
> the packet.
>
The logs placed in the callee don't cover all the failure scenarios,
hence I'd prefer to keep this log here with the proper status. Maybe I can
remove the log in the callee instead?
> Either way -- this should be a separate patch.
>
Are you suggesting a separate patch altogether, or two patches in the same
series?
Based on your suggestion, I can work on v3.
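For reference, the post-failure handling in the hunk quoted below can be
summarised as a small decision function. This is a user-space sketch under
assumptions, not the driver code: the enum and classify_post_error() are
hypothetical names introduced only to show the split between the retry
case and the drop case.

```c
#include <errno.h>

/* Hypothetical outcomes after trying to post the TX work request. */
enum tx_post_action {
	TX_POST_OK,	/* request was posted */
	TX_POST_RETRY,	/* -ENOSPC: return NETDEV_TX_BUSY (tx_busy label) */
	TX_POST_DROP,	/* any other error: take the free_sgl_ptr path */
};

/* Map the error code from the post call to the action taken in the
 * quoted hunk. Only -ENOSPC is treated as a transient condition.
 */
static enum tx_post_action classify_post_error(int err)
{
	if (!err)
		return TX_POST_OK;
	if (err == -ENOSPC)
		return TX_POST_RETRY;
	return TX_POST_DROP;
}
```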
Regards,
Aditya
>> - err = NETDEV_TX_BUSY;
>> - goto tx_busy;
>> + if (err == -ENOSPC) {
>> + err = NETDEV_TX_BUSY;
>> + goto tx_busy;
>> + }
>> + goto free_sgl_ptr;
>> }
>>
>> err = NETDEV_TX_OK;
>> @@ -478,6 +496,25 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev)
>> return NETDEV_TX_OK;
>> }