Message-ID: <20210811161658.64551a1c@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date: Wed, 11 Aug 2021 16:16:58 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Michael Chan <michael.chan@...adcom.com>
Cc: David Miller <davem@...emloft.net>,
Jeffrey Huang <huangjw@...adcom.com>,
Eddie Wai <eddie.wai@...adcom.com>,
Prashant Sreedharan <prashant@...adcom.com>,
Andrew Gospodarek <gospo@...adcom.com>,
Netdev <netdev@...r.kernel.org>,
Edwin Peer <edwin.peer@...adcom.com>
Subject: Re: [PATCH net v2 3/4] bnxt: make sure xmit_more + errors does not miss doorbells

On Wed, 11 Aug 2021 16:00:52 -0700 Michael Chan wrote:
> On Wed, Aug 11, 2021 at 3:44 PM Jakub Kicinski <kuba@...nel.org> wrote:
> > On Wed, 11 Aug 2021 15:36:34 -0700 Michael Chan wrote:
> > > On Wed, Aug 11, 2021 at 2:38 PM Jakub Kicinski <kuba@...nel.org> wrote:
> > > > @@ -367,6 +368,13 @@ static u16 bnxt_xmit_get_cfa_action(struct sk_buff *skb)
> > > >  	return md_dst->u.port_info.port_id;
> > > >  }
> > > >
> > > > +static void bnxt_txr_db_kick(struct bnxt *bp, struct bnxt_tx_ring_info *txr,
> > > > +			     u16 prod)
> > > > +{
> > > > +	bnxt_db_write(bp, &txr->tx_db, prod);
> > > > +	txr->kick_pending = 0;
> > > > +}
> > > > +
> > > >  static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
> > > >  {
> > > >  	struct bnxt *bp = netdev_priv(dev);
> > > > @@ -396,6 +404,8 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
> > > >  	free_size = bnxt_tx_avail(bp, txr);
> > > >  	if (unlikely(free_size < skb_shinfo(skb)->nr_frags + 2)) {
> > > >  		netif_tx_stop_queue(txq);
> > > > +		if (net_ratelimit() && txr->kick_pending)
> > > > +			netif_warn(bp, tx_err, dev, "bnxt: ring busy!\n");
> > >
> > > You forgot to remove this.
> >
> > I changed my mind. I added the && txr->kick_pending to the condition,
> > if there is a race and napi starts the queue unnecessarily the kick
> > can't be pending.
>
> I don't understand. The queue should be stopped if we have <=
> MAX_SKB_FRAGS + 1 descriptors left. If there is a race and the queue
> is awake, the first TX packet may slip through if
> skb_shinfo(skb)->nr_frags is small and we have enough descriptors for
> it. Let's say xmit_more is set for this packet and so kick is
> pending. The next packet may not fit anymore and it will hit this
> check here.

But even if we slip past this check, we can only do it once: the check
at the end of start_xmit() will see we have fewer slots than
MAX_SKB_FRAGS + 2, ring the doorbell and stop.

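To make the argument concrete, here is a rough userspace sketch of the
two checks. It is a toy model, not the driver code: the toy_* names,
the MAX_SKB_FRAGS value and the "nr_frags + 2 descriptors per packet"
accounting are assumptions made for illustration, and the NAPI side is
left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SKB_FRAGS 17	/* assumed value, depends on kernel config */

/* Hypothetical stand-in for the TX ring state discussed above. */
struct toy_ring {
	int avail;		/* free descriptors */
	bool stopped;		/* models a stopped TX queue */
	bool kick_pending;	/* descriptors queued, doorbell not rung yet */
	int doorbells;		/* number of doorbell writes so far */
};

/* Ring the doorbell and clear the pending flag, like bnxt_txr_db_kick(). */
static void toy_kick(struct toy_ring *r)
{
	r->doorbells++;
	r->kick_pending = false;
}

/* Returns true if the packet was queued, false for "ring busy". */
static bool toy_xmit(struct toy_ring *r, int nr_frags, bool xmit_more)
{
	assert(nr_frags >= 0);

	/* Entry check: not enough room for this packet. */
	if (r->avail < nr_frags + 2) {
		r->stopped = true;
		return false;
	}

	r->avail -= nr_frags + 2;

	/* With xmit_more the doorbell is deferred, otherwise ring it now. */
	if (xmit_more)
		r->kick_pending = true;
	else
		toy_kick(r);

	/* Exit check: ring nearly full, flush any deferred kick and stop. */
	if (r->avail <= MAX_SKB_FRAGS + 1) {
		if (r->kick_pending)
			toy_kick(r);
		r->stopped = true;
	}
	return true;
}
```

In this model, starting with avail = 5 (queue woken by a race) and
sending one packet with nr_frags = 1 and xmit_more set, the packet
slips past the entry check, but the exit check rings the doorbell and
stops the queue. A second packet then fails the entry check with
kick_pending already clear.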
> > > > @@ -661,7 +668,12 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
> > > >  			       PCI_DMA_TODEVICE);
> > > >  	}
> > > >
> > > > +tx_free:
> > > >  	dev_kfree_skb_any(skb);
> > > > +tx_kick_pending:
> > > > +	tx_buf->skb = NULL;
> > >
> > > I think we should remove the setting of tx_buf->skb to NULL in the
> > > tx_dma_error path since we are setting it here now.
> >
> > Are you suggesting to do something along the lines of:
> >
> > txr->tx_buf_ring[txr->tx_prod].skb = NULL;
>
> Yeah, I like this the best.

Roger that, I'll send v3 tomorrow; I've run out of day.