netdev - Re: [PATCH net-next v2 03/10] net: sparx5: add hostmode with phylink support

Message-ID: <20210531213000.46143fad@kicinski-fedora-PC1C0HJN.hsd1.ca.comcast.net>
Date:   Mon, 31 May 2021 21:30:00 -0700
From:   Jakub Kicinski <kuba@...nel.org>
To:     Steen Hegelund <steen.hegelund@...rochip.com>
Cc:     "David S. Miller" <davem@...emloft.net>,
        Andrew Lunn <andrew@...n.ch>,
        Russell King <linux@...linux.org.uk>,
        Microchip Linux Driver Support <UNGLinuxDriver@...rochip.com>,
        Alexandre Belloni <alexandre.belloni@...tlin.com>,
        Madalin Bucur <madalin.bucur@....nxp.com>,
        Mark Einon <mark.einon@...il.com>,
        Masahiro Yamada <masahiroy@...nel.org>,
        Arnd Bergmann <arnd@...db.de>,
        Philipp Zabel <p.zabel@...gutronix.de>,
        "Simon Horman" <simon.horman@...ronome.com>,
        <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <linux-arm-kernel@...ts.infradead.org>,
        Bjarni Jonasson <bjarni.jonasson@...rochip.com>,
        Lars Povlsen <lars.povlsen@...rochip.com>
Subject: Re: [PATCH net-next v2 03/10] net: sparx5: add hostmode with
 phylink support

On Mon, 31 May 2021 16:02:54 +0200 Steen Hegelund wrote:
> > > +     val = ether_addr_to_u64(sparx5->base_mac) + portno + 1;
> > > +     u64_to_ether_addr(val, ndev->dev_addr);
> > > +
> > > +     return ndev;
> > > +}  
> >   
> > > +static void sparx5_xtr_grp(struct sparx5 *sparx5, u8 grp, bool byte_swap)
> > > +{
> > > +     bool eof_flag = false, pruned_flag = false, abort_flag = false;
> > > +     struct net_device *netdev;
> > > +     struct sparx5_port *port;
> > > +     struct frame_info fi;
> > > +     int i, byte_cnt = 0;
> > > +     struct sk_buff *skb;
> > > +     u32 ifh[IFH_LEN];
> > > +     u32 *rxbuf;
> > > +
> > > +     /* Get IFH */
> > > +     for (i = 0; i < IFH_LEN; i++)
> > > +             ifh[i] = spx5_rd(sparx5, QS_XTR_RD(grp));
> > > +
> > > +     /* Decode IFH (whats needed) */
> > > +     sparx5_ifh_parse(ifh, &fi);
> > > +
> > > +     /* Map to port netdev */
> > > +     port = fi.src_port < SPX5_PORTS ?
> > > +             sparx5->ports[fi.src_port] : NULL;
> > > +     if (!port || !port->ndev) {
> > > +             dev_err(sparx5->dev, "Data on inactive port %d\n", fi.src_port);
> > > +             sparx5_xtr_flush(sparx5, grp);
> > > +             return;  
> > 
> > You should probably increment appropriate counter for each error
> > condition.  
> 
> At this first check I do not have the netdev, so it will not be
> possible to update any counters, but below I can use rx_dropped.  
> Is that what you mean?

Yes, sorry, I just scrolled up to the earliest drop I could find.
Indeed nothing we can increment here. 

> > > +
> > > +     /* Finish up skb */
> > > +     skb_put(skb, byte_cnt - ETH_FCS_LEN);
> > > +     eth_skb_pad(skb);
> > > +     skb->protocol = eth_type_trans(skb, netdev);
> > > +     netif_rx(skb);
> > > +     netdev->stats.rx_bytes += skb->len;
> > > +     netdev->stats.rx_packets++;  
> > 
> > Does the Rx really need to happen in an interrupt context?
> > Did you consider using NAPI or a tasklet?  
> 
> This register-based injection and extraction is just preliminary.  I
> have the next series waiting with support for Frame DMA'ing and there
> I use NAPI, so if possible I would like to leave this as it is, since
> it is only a stopgap.

Ah, that's fine.

> > What do you expect to happen at this point? Kernel can retry sending
> > forever, is there a way for the driver to find out that the fifo is
> > no longer busy to stop/start the software queuing appropriately?  
> 
> Hmm.  I am not too familiar with the netdev queuing, but would this
> be a way forward?
> 
> 1) In sparx5_inject: After injecting a frame then test for HW queue
> readiness and watermark levels, and if there is a problem then call
> netif_stop_queue
> 
> 2) Add an implementation of ndo_tx_timeout where the HW queue and
> Watermark level is checked and if all is OK, then do a
> netif_wake_queue.

The tx timeout is not a good mechanism here because it prints a stack
trace and an error to the logs; it is meant for detecting broken
interfaces. Perhaps use an hrtimer or a normal timer? What kind of time
scales are we talking about here?
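
To illustrate the timer-based stop/wake idea, here is a minimal userspace
model, not driver code. Everything in it is an assumption made for the
sketch: the names (model_port, model_xmit, model_hw_drain,
model_timer_fire), the FIFO depth and the watermark value are all
hypothetical. In a real driver the queue_stopped flag would correspond to
netif_stop_queue()/netif_wake_queue(), and timer_armed to an hrtimer or
timer_list that re-arms itself while the FIFO is still above the watermark.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; every name and constant here is hypothetical. */
#define FIFO_DEPTH      8  /* assumed number of frame slots in the HW fifo */
#define FIFO_WATERMARK  6  /* assumed level at which software queuing stops */

struct model_port {
	int  fifo_level;    /* frames currently held by the modelled hardware */
	bool queue_stopped; /* stands in for netif_stop_queue()/netif_wake_queue() */
	bool timer_armed;   /* stands in for an armed hrtimer or timer_list */
};

/* Transmit path: returns false (think NETDEV_TX_BUSY) if the queue is stopped. */
static bool model_xmit(struct model_port *p)
{
	if (p->queue_stopped)
		return false;

	/* The watermark sits below the depth, so there is always room here. */
	assert(p->fifo_level < FIFO_DEPTH);
	p->fifo_level++;

	/* After injecting, check the level and stop queuing at the watermark. */
	if (p->fifo_level >= FIFO_WATERMARK) {
		p->queue_stopped = true;
		p->timer_armed = true;
	}
	return true;
}

/* Models the hardware draining up to n frames from the fifo. */
static void model_hw_drain(struct model_port *p, int n)
{
	p->fifo_level = n > p->fifo_level ? 0 : p->fifo_level - n;
}

/*
 * Timer callback: wake the queue once the fifo is below the watermark,
 * otherwise stay armed and check again later.
 * Returns true if the timer remains armed.
 */
static bool model_timer_fire(struct model_port *p)
{
	if (!p->timer_armed)
		return false;

	if (p->fifo_level < FIFO_WATERMARK) {
		p->queue_stopped = false;
		p->timer_armed = false;
		return false;
	}
	return true;
}
```

The point of the model is that the stack is never asked to retry blindly:
the queue is stopped as soon as the watermark is hit, and only the timer
decides when to wake it. The right polling period depends on how fast the
hardware drains the fifo, which is the time-scale question above.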

> 3) But if the HW queue and/or Watermark level is still not OK - then
> probably something went seriously wrong, or the wait was too short.
> Will the ndo_tx_timeout be called again or is this a one-off?
> 
> If the ndo_tx_timeout call is a one-off the driver would need to
> reset the HW queue system or even deeper down...