linux-kernel - Re: [PATCH net-next v3 3/4] net: lan966x: Add FDMA functionality


Message-ID: <20220407071743.rsipmaq6xnucrlcw@soft-dev3-1.localhost>
Date:   Thu, 7 Apr 2022 09:17:43 +0200
From:   Horatiu Vultur <horatiu.vultur@...rochip.com>
To:     Jakub Kicinski <kuba@...nel.org>
CC:     <linux-kernel@...r.kernel.org>, <netdev@...r.kernel.org>,
        <UNGLinuxDriver@...rochip.com>, <davem@...emloft.net>,
        <pabeni@...hat.com>, <michael@...le.cc>
Subject: Re: [PATCH net-next v3 3/4] net: lan966x: Add FDMA functionality

The 04/06/2022 10:37, Jakub Kicinski wrote:
> 
> On Wed, 6 Apr 2022 13:21:15 +0200 Horatiu Vultur wrote:
> > > > +static int lan966x_fdma_tx_alloc(struct lan966x_tx *tx)
> > > > +{
> > > > +     struct lan966x *lan966x = tx->lan966x;
> > > > +     struct lan966x_tx_dcb *dcb;
> > > > +     struct lan966x_db *db;
> > > > +     int size;
> > > > +     int i, j;
> > > > +
> > > > +     tx->dcbs_buf = kcalloc(FDMA_DCB_MAX, sizeof(struct lan966x_tx_dcb_buf),
> > > > +                            GFP_ATOMIC);
> > > > +     if (!tx->dcbs_buf)
> > > > +             return -ENOMEM;
> > > > +
> > > > +     /* calculate how many pages are needed to allocate the dcbs */
> > > > +     size = sizeof(struct lan966x_tx_dcb) * FDMA_DCB_MAX;
> > > > +     size = ALIGN(size, PAGE_SIZE);
> > > > +     tx->dcbs = dma_alloc_coherent(lan966x->dev, size, &tx->dma, GFP_ATOMIC);
> > >
> > > > This function seems to only be called from probe, so GFP_KERNEL
> > > > is better.
> >
> > But in the next patch of this series it will be called while holding
> > the lan966x->tx_lock. Should I still change it to GFP_KERNEL and then
> > change it back to GFP_ATOMIC in the next one?
> 
> Ah, I missed that. You can keep the GFP_ATOMIC then.
> 
> But I think the reconfig path may be racy. You disable Rx, but don't
> disable napi. NAPI may still be running and doing Rx while you're
> trying to free the rx skbs, no?

Yes, a race is possible there, even though I disable the HW and make
sure the RX FDMA is disabled. A frame could be received, we get an
interrupt, and we just call napi_schedule. If the MTU is changed at this
point, napi_poll can still be called after we have disabled the HW and
the RX FDMA.
So I will make sure to call napi_synchronize and napi_disable.
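
To make the ordering concrete, here is a rough userspace toy model of the
race (not driver code). All names in it are made up for the sketch, and
model_napi_quiesce() only stands in for napi_synchronize() + napi_disable()
by letting an already scheduled poll finish before the ring is freed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the Rx reconfig race. Hypothetical names, not kernel code. */
struct rx_model {
	bool hw_rx_enabled;	/* HW and RX FDMA may raise interrupts */
	bool napi_enabled;	/* the poll is allowed to be scheduled */
	bool poll_pending;	/* poll was scheduled but has not run yet */
	bool ring_valid;	/* rx skbs are currently allocated */
	bool touched_freed;	/* the poll ran on a freed ring: the bug */
};

static void model_init(struct rx_model *m)
{
	m->hw_rx_enabled = true;
	m->napi_enabled = true;
	m->poll_pending = false;
	m->ring_valid = true;
	m->touched_freed = false;
}

/* Interrupt handler: it only schedules the poll. */
static void model_irq(struct rx_model *m)
{
	if (m->hw_rx_enabled && m->napi_enabled)
		m->poll_pending = true;
}

/* Poll: walks the rx ring if a poll was scheduled. */
static void model_poll(struct rx_model *m)
{
	if (!m->poll_pending)
		return;
	m->poll_pending = false;
	if (!m->ring_valid)
		m->touched_freed = true;
}

/* Let a scheduled poll finish, then block new ones. */
static void model_napi_quiesce(struct rx_model *m)
{
	model_poll(m);
	m->napi_enabled = false;
}

/* Only the HW is disabled: the pending poll runs after the free. */
static bool reconfig_hw_only(struct rx_model *m)
{
	m->hw_rx_enabled = false;
	m->ring_valid = false;		/* free the rx skbs */
	model_poll(m);			/* poll scheduled earlier runs now */
	m->ring_valid = true;		/* re-allocate for the new MTU */
	m->hw_rx_enabled = true;
	return !m->touched_freed;
}

/* HW disabled and NAPI quiesced before the ring is touched. */
static bool reconfig_with_napi_quiesce(struct rx_model *m)
{
	m->hw_rx_enabled = false;
	model_napi_quiesce(m);
	assert(!m->poll_pending);
	m->ring_valid = false;		/* free the rx skbs */
	m->ring_valid = true;		/* re-allocate for the new MTU */
	m->napi_enabled = true;
	m->hw_rx_enabled = true;
	return !m->touched_freed;
}
```

If model_irq() is called before each reconfig, reconfig_hw_only() reports
that the poll touched freed buffers while reconfig_with_napi_quiesce() does
not.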

> 
> Once napi is disabled you can disable Tx and then you have full
> ownership of the Tx side, no need to hold the lock during
> lan966x_fdma_tx_alloc(), I'd think.

I can do that. The only thing is that I need to disable Tx for all the
ports, because the FDMA is shared by all of them.
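
Roughly what I have in mind, again as a userspace toy model with made-up
names (MODEL_NUM_PORTS, port_model and switch_model are only for the
sketch). The assumption is that the shared Tx ring may only be re-allocated
once every existing port has its queue stopped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NUM_PORTS 4	/* hypothetical port count */

struct port_model {
	bool tx_stopped;
};

struct switch_model {
	/* Unused ports are NULL; all ports share one Tx ring. */
	struct port_model *ports[MODEL_NUM_PORTS];
};

/* Stop the Tx queue of every existing port. */
static void model_stop_all_tx(struct switch_model *sw)
{
	for (size_t i = 0; i < MODEL_NUM_PORTS; i++) {
		if (!sw->ports[i])
			continue;
		sw->ports[i]->tx_stopped = true;
	}
}

/* Wake the Tx queue of every existing port. */
static void model_wake_all_tx(struct switch_model *sw)
{
	for (size_t i = 0; i < MODEL_NUM_PORTS; i++) {
		if (!sw->ports[i])
			continue;
		sw->ports[i]->tx_stopped = false;
	}
}

/* The shared ring is safe to touch only if no port can still xmit. */
static bool model_tx_ring_quiesced(const struct switch_model *sw)
{
	for (size_t i = 0; i < MODEL_NUM_PORTS; i++) {
		if (sw->ports[i] && !sw->ports[i]->tx_stopped)
			return false;
	}
	return true;
}
```

Stopping only the port whose MTU changes leaves model_tx_ring_quiesced()
false, because another port can still queue frames on the same ring.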

> 
> > > > +int lan966x_fdma_xmit(struct sk_buff *skb, __be32 *ifh, struct net_device *dev)
> > > > +{
> > > > +     struct lan966x_port *port = netdev_priv(dev);
> > > > +     struct lan966x *lan966x = port->lan966x;
> > > > +     struct lan966x_tx_dcb_buf *next_dcb_buf;
> > > > +     struct lan966x_tx_dcb *next_dcb, *dcb;
> > > > +     struct lan966x_tx *tx = &lan966x->tx;
> > > > +     struct lan966x_db *next_db;
> > > > +     int needed_headroom;
> > > > +     int needed_tailroom;
> > > > +     dma_addr_t dma_addr;
> > > > +     int next_to_use;
> > > > +     int err;
> > > > +
> > > > +     /* Get next index */
> > > > +     next_to_use = lan966x_fdma_get_next_dcb(tx);
> > > > +     if (next_to_use < 0) {
> > > > +             netif_stop_queue(dev);
> > > > +             return NETDEV_TX_BUSY;
> > > > +     }
> > > > +
> > > > +     if (skb_put_padto(skb, ETH_ZLEN)) {
> > > > +             dev->stats.tx_dropped++;
> > > > +             return NETDEV_TX_OK;
> > > > +     }
> > > > +
> > > > +     /* skb processing */
> > > > +     needed_headroom = max_t(int, IFH_LEN * sizeof(u32) - skb_headroom(skb), 0);
> > > > +     needed_tailroom = max_t(int, ETH_FCS_LEN - skb_tailroom(skb), 0);
> > > > +     if (needed_headroom || needed_tailroom || skb_header_cloned(skb)) {
> > > > +             err = pskb_expand_head(skb, needed_headroom, needed_tailroom,
> > > > +                                    GFP_ATOMIC);
> > > > +             if (unlikely(err)) {
> > > > +                     dev->stats.tx_dropped++;
> > > > +                     err = NETDEV_TX_OK;
> > > > +                     goto release;
> > > > +             }
> > > > +     }
> > > > +
> > > > +     skb_tx_timestamp(skb);
> > >
> > > This could move down after the dma mapping, so it's closer to when
> > > the device gets ownership.
> >
> > The problem is that, if I move this lower, then the SKB is changed
> > because the IFH is added to the frame. So now if we do timestamping in
> > the PHY, then the classify call inside 'skb_clone_tx_timestamp'
> > will always return PTP_CLASS_NONE, so the PHY will never get the frame.
> > That is the reason why I have moved it back.
> 
> Oh, I see, makes sense!
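
For the archive, a small userspace sketch of why the classification fails
once the IFH is in front of the frame. It is a simplification: the model
classifier only looks at the EtherType at offset 12, the IFH is assumed to
be 28 bytes (IFH_LEN * sizeof(u32) with IFH_LEN taken as 7) and is filled
with zeros, and all MODEL_/model_ names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_IFH_BYTES		28	/* assumed IFH size in bytes */
#define MODEL_ETH_HLEN		14
#define MODEL_ETH_P_1588	0x88F7	/* PTP over Ethernet EtherType */

enum model_class {
	MODEL_CLASS_NONE,
	MODEL_CLASS_L2,
};

/* Simplified classifier: checks only the EtherType at offset 12. */
static enum model_class model_classify(const uint8_t *frame, size_t len)
{
	uint16_t type;

	if (len < MODEL_ETH_HLEN)
		return MODEL_CLASS_NONE;

	type = (uint16_t)((frame[12] << 8) | frame[13]);
	return type == MODEL_ETH_P_1588 ? MODEL_CLASS_L2 : MODEL_CLASS_NONE;
}

/*
 * Copy the frame into out with a zeroed IFH in front of it.
 * Returns the new length, or 0 if out is too small.
 */
static size_t model_push_ifh(uint8_t *out, size_t out_len,
			     const uint8_t *frame, size_t len)
{
	if (out_len < len + MODEL_IFH_BYTES)
		return 0;

	memset(out, 0, MODEL_IFH_BYTES);
	memcpy(out + MODEL_IFH_BYTES, frame, len);
	return len + MODEL_IFH_BYTES;
}
```

With the IFH in front, offset 12 falls inside the IFH and no longer holds
the EtherType, so the same PTP frame that classified fine before is
reported as MODEL_CLASS_NONE.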

-- 
/Horatiu