linux-kernel - Re: [PATCH v4 06/11] net: ethernet: mtk-eth-mac: new driver

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAK8P3a1nhPj6kRhwyXzDK3BGbh66XG6Fmp44QuM1NhFPPBTtPQ@mail.gmail.com>
Date:   Wed, 20 May 2020 23:22:45 +0200
From:   Arnd Bergmann <arnd@...db.de>
To:     Bartosz Golaszewski <brgl@...ev.pl>
Cc:     Stephane Le Provost <stephane.leprovost@...iatek.com>,
        Pedro Tsai <pedro.tsai@...iatek.com>,
        Andrew Perepech <andrew.perepech@...iatek.com>,
        Rob Herring <robh+dt@...nel.org>,
        "David S . Miller" <davem@...emloft.net>,
        Matthias Brugger <matthias.bgg@...il.com>,
        John Crispin <john@...ozen.org>,
        Sean Wang <sean.wang@...iatek.com>,
        Mark Lee <Mark-MC.Lee@...iatek.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Fabien Parent <fparent@...libre.com>,
        Heiner Kallweit <hkallweit1@...il.com>,
        Edwin Peer <edwin.peer@...adcom.com>,
        DTML <devicetree@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Networking <netdev@...r.kernel.org>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        "moderated list:ARM/Mediatek SoC..." 
        <linux-mediatek@...ts.infradead.org>,
        Bartosz Golaszewski <bgolaszewski@...libre.com>
Subject: Re: [PATCH v4 06/11] net: ethernet: mtk-eth-mac: new driver

On Wed, May 20, 2020 at 7:35 PM Bartosz Golaszewski <brgl@...ev.pl> wrote:
> śr., 20 maj 2020 o 16:37 Arnd Bergmann <arnd@...db.de> napisał(a):

> > I just noticed how the naming of NET_MEDIATEK_MAC and NET_MEDIATEK_SOC
> > for two different drivers doing the same thing is really confusing.
> >
> > Maybe someone can come up with a better name, such as one
> > based on the soc it first showed up in.
> >
>
> This has been discussed under one of the previous submissions.
> MediaTek wants to use this IP on future designs as well and it's
> already used on multiple SoCs so they want the name to be generic. I
> also argued that this is a driver strongly tied to a specific
> platform(s) so if someone wants to compile it - they probably know
> what they're doing.
>
> That being said: I verified with MediaTek and the name of the IP I can
> use is "star" so they proposed "mtk-star-eth". I would personally
> maybe go with "mtk-star-mac". How about those two?

Both seem fine to me. If this was previously discussed, I don't want
do further bike-shedding and I'd trust you to pick a sensible name
based on the earlier discussions.

> >  +               /* One of the counters reached 0x8000000 - update stats and
> > > +                * reset all counters.
> > > +                */
> > > +               if (unlikely(status & MTK_MAC_REG_INT_STS_MIB_CNT_TH)) {
> > > +                       mtk_mac_intr_disable_stats(priv);
> > > +                       schedule_work(&priv->stats_work);
> > > +               }
> > > + befor
> > > +               mtk_mac_intr_ack_all(priv);
> >
> > The ack here needs to be dropped, otherwise you can get further
> > interrupts before the bottom half has had a chance to run.
> >
>
> My thinking was this: if I mask the relevant interrupt (TX/RX
> complete) and ack it right away, the status bit will be asserted on
> the next packet received/sent but the process won't get interrupted
> and when I unmask it, it will fire right away and I won't have to
> recheck the status register. I noticed that if I ack it at the end of
> napi poll callback, I end up missing certain TX complete interrupts
> and end up seeing a lot of retransmissions even if I reread the status
> register. I'm not yet sure where this race happens.

Right, I see. If you just ack at the end of the poll function, you need
to check the rings again to ensure you did not miss an interrupt
between checking observing both rings to be empty and the irq-ack.

I suspect it's still cheaper to check the two rings with an uncached
read from memory than to to do the read-modify-write on the mmio,
but you'd have to measure that to be sure.

> > > +static void mtk_mac_tx_complete_all(struct mtk_mac_priv *priv)
> > > +{
> > > +       struct mtk_mac_ring *ring = &priv->tx_ring;
> > > +       struct net_device *ndev = priv->ndev;
> > > +       int ret, pkts_compl, bytes_compl;
> > > +       bool wake = false;
> > > +
> > > +       mtk_mac_lock(priv);
> > > +
> > > +       for (pkts_compl = 0, bytes_compl = 0;;
> > > +            pkts_compl++, bytes_compl += ret, wake = true) {
> > > +               if (!mtk_mac_ring_descs_available(ring))
> > > +                       break;
> > > +
> > > +               ret = mtk_mac_tx_complete_one(priv);
> > > +               if (ret < 0)
> > > +                       break;
> > > +       }
> > > +
> > > +       netdev_completed_queue(ndev, pkts_compl, bytes_compl);
> > > +
> > > +       if (wake && netif_queue_stopped(ndev))
> > > +               netif_wake_queue(ndev);
> > > +
> > > +       mtk_mac_intr_enable_tx(priv);
> >
> > No need to ack the interrupt here if napi is still active. Just
> > ack both rx and tx when calling napi_complete().
> >
> > Some drivers actually use the napi budget for both rx and tx:
> > if you have more than 'budget' completed tx frames, return
> > early from this function and skip the napi_complete even
> > when less than 'budget' rx frames have arrived.
> >
>
> IIRC Jakub said that the most seen approach is to free all TX descs
> and receive up to budget packets, so this is what I did. I think it
> makes the most sense.

Ok, he's probably right then.

My idea was that the dma_unmap operation for the tx cleanup is
rather expensive on chips without cache-coherent DMA, so you
might not want to do too much of it but rather do it in reasonably
sized batches. It would also avoid the case where you renable the
tx-complete interrupt after cleaning the already-sent frames but
then immediately get an irq when the next frame that is already
queued is done.

This probably depends on the specific workload which one works
better here.

         Arnd