Message-ID: <20210531213000.46143fad@kicinski-fedora-PC1C0HJN.hsd1.ca.comcast.net>
Date: Mon, 31 May 2021 21:30:00 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Steen Hegelund <steen.hegelund@...rochip.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Andrew Lunn <andrew@...n.ch>,
Russell King <linux@...linux.org.uk>,
Microchip Linux Driver Support <UNGLinuxDriver@...rochip.com>,
Alexandre Belloni <alexandre.belloni@...tlin.com>,
Madalin Bucur <madalin.bucur@....nxp.com>,
Mark Einon <mark.einon@...il.com>,
Masahiro Yamada <masahiroy@...nel.org>,
Arnd Bergmann <arnd@...db.de>,
Philipp Zabel <p.zabel@...gutronix.de>,
"Simon Horman" <simon.horman@...ronome.com>,
<netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<linux-arm-kernel@...ts.infradead.org>,
Bjarni Jonasson <bjarni.jonasson@...rochip.com>,
Lars Povlsen <lars.povlsen@...rochip.com>
Subject: Re: [PATCH net-next v2 03/10] net: sparx5: add hostmode with phylink support
On Mon, 31 May 2021 16:02:54 +0200 Steen Hegelund wrote:
> > > + val = ether_addr_to_u64(sparx5->base_mac) + portno + 1;
> > > + u64_to_ether_addr(val, ndev->dev_addr);
> > > +
> > > + return ndev;
> > > +}
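A minimal userspace sketch of the address derivation in the quoted hunk, for illustration only and not part of the patch. mac_to_u64() and u64_to_mac() are stand-ins written here for the kernel's ether_addr_to_u64() and u64_to_ether_addr(); port_mac() is a hypothetical helper name. It shows that port N gets base + N + 1 and that the addition carries across byte boundaries.

```c
#include <stdint.h>

#define ETH_ALEN 6

/* Stand-in for ether_addr_to_u64(): pack 6 bytes, most significant first. */
static uint64_t mac_to_u64(const uint8_t *addr)
{
	uint64_t u = 0;
	int i;

	for (i = 0; i < ETH_ALEN; i++)
		u = (u << 8) | addr[i];
	return u;
}

/* Stand-in for u64_to_ether_addr(): unpack the low 48 bits into 6 bytes. */
static void u64_to_mac(uint64_t u, uint8_t *addr)
{
	int i;

	for (i = ETH_ALEN - 1; i >= 0; i--) {
		addr[i] = u & 0xff;
		u >>= 8;
	}
}

/* Hypothetical helper: port N is assigned base + N + 1, as in the hunk. */
static void port_mac(const uint8_t *base, unsigned int portno, uint8_t *out)
{
	u64_to_mac(mac_to_u64(base) + portno + 1, out);
}
```

With a base address ending in 04:fe, port 0 ends in 04:ff and port 1 ends in 05:00, so the per-port addresses are consecutive even when the last byte wraps.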
> >
> > > +static void sparx5_xtr_grp(struct sparx5 *sparx5, u8 grp, bool byte_swap)
> > > +{
> > > + bool eof_flag = false, pruned_flag = false, abort_flag = false;
> > > + struct net_device *netdev;
> > > + struct sparx5_port *port;
> > > + struct frame_info fi;
> > > + int i, byte_cnt = 0;
> > > + struct sk_buff *skb;
> > > + u32 ifh[IFH_LEN];
> > > + u32 *rxbuf;
> > > +
> > > + /* Get IFH */
> > > + for (i = 0; i < IFH_LEN; i++)
> > > + ifh[i] = spx5_rd(sparx5, QS_XTR_RD(grp));
> > > +
> > > + /* Decode IFH (what's needed) */
> > > + sparx5_ifh_parse(ifh, &fi);
> > > +
> > > + /* Map to port netdev */
> > > + port = fi.src_port < SPX5_PORTS ?
> > > + sparx5->ports[fi.src_port] : NULL;
> > > + if (!port || !port->ndev) {
> > > + dev_err(sparx5->dev, "Data on inactive port %d\n", fi.src_port);
> > > + sparx5_xtr_flush(sparx5, grp);
> > > + return;
> >
> > You should probably increment the appropriate counter for each error
> > condition.
>
> At this first check I do not have the netdev, so it will not be
> possible to update any counters, but below I can use rx_dropped.
> Is that what you mean?
Yes, sorry, I just scrolled up to the earliest drop I could find.
Indeed nothing we can increment here.
> > > +
> > > + /* Finish up skb */
> > > + skb_put(skb, byte_cnt - ETH_FCS_LEN);
> > > + eth_skb_pad(skb);
> > > + skb->protocol = eth_type_trans(skb, netdev);
> > > + netif_rx(skb);
> > > + netdev->stats.rx_bytes += skb->len;
> > > + netdev->stats.rx_packets++;
> >
> > Does the Rx really need to happen in an interrupt context?
> > Did you consider using NAPI or a tasklet?
>
> This register-based injection and extraction is just preliminary. I
> have the next series waiting with support for Frame DMA'ing and there
> I use NAPI, so if possible I would like to leave this as it is, since
> it is only a stopgap.
Ah, that's fine.
> > What do you expect to happen at this point? Kernel can retry sending
> > for ever, is there a way for the driver to find out that the fifo is
> > no longer busy to stop/start the software queuing appropriately?
>
> Hmm. I am not too familiar with the netdev queuing, but would this
> be a way forward?
>
> 1) In sparx5_inject: After injecting a frame then test for HW queue
> readiness and watermark levels, and if there is a problem then call
> netif_stop_queue
>
> 2) Add an implementation of ndo_tx_timeout where the HW queue and
> Watermark level is checked and if all is OK, then do a
> netif_wake_queue.
A timeout is not a good mechanism because it will print a stack trace and
an error to the logs. The timeout is used for detecting broken interfaces.
Perhaps use a hrtimer or a normal timer? What kind of time scales are
we talking about here?
> 3) But if the HW queue and/or Watermark level is still not OK - then
> probably something went seriously wrong, or the wait was too short.
> Will the ndo_tx_timeout be called again or is this a one-off?
>
> If the ndo_tx_timeout call is a one-off the driver would need to
> reset the HW queue system or even deeper down...
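The stop/wake flow discussed above can be modelled in plain userspace C. This is only a sketch under stated assumptions: struct txq_model and every txq_* function are hypothetical names, the FIFO and watermark fields do not reflect the sparx5 register layout, and the "timer" is just a function the caller invokes. In a driver, the marked lines would be netif_stop_queue() and netif_wake_queue(), and txq_timer_fire() would be the body of an hrtimer or timer callback instead of ndo_tx_timeout.

```c
#include <stdbool.h>

/* Hypothetical model of an injection FIFO; not the real sparx5 hardware. */
struct txq_model {
	unsigned int fifo_used;	/* frames currently held by the HW FIFO */
	unsigned int fifo_size;	/* FIFO capacity in frames */
	unsigned int watermark;	/* stop queuing at or above this level */
	bool stopped;		/* mirrors the stopped state of the netdev queue */
};

/* HW is ready for more frames while the fill level is below the watermark. */
static bool txq_hw_ready(const struct txq_model *q)
{
	return q->fifo_used < q->watermark;
}

/* Transmit path: inject one frame, then stop the queue if HW is not ready. */
static bool txq_xmit(struct txq_model *q)
{
	if (q->stopped || q->fifo_used >= q->fifo_size)
		return false;		/* a driver would return NETDEV_TX_BUSY */
	q->fifo_used++;
	if (!txq_hw_ready(q))
		q->stopped = true;	/* netif_stop_queue() in a driver */
	return true;
}

/* Timer callback: wake the queue when HW is ready, else ask to be re-armed. */
static bool txq_timer_fire(struct txq_model *q)
{
	if (txq_hw_ready(q)) {
		q->stopped = false;	/* netif_wake_queue() in a driver */
		return false;		/* no re-arm needed */
	}
	return true;			/* still busy: re-arm the timer */
}

/* Model the hardware sending n frames out of the FIFO. */
static void txq_hw_drain(struct txq_model *q, unsigned int n)
{
	q->fifo_used = n > q->fifo_used ? 0 : q->fifo_used - n;
}
```

Because the timer re-arms itself while the FIFO stays above the watermark, the retry is not a one-off, and nothing is logged as an error while the hardware is merely busy.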