Message-ID: <CAMRc=MeVyNzTWw_hk=J9kX1NE9reCE_O4P3wrNpMMc9z4xA_DA@mail.gmail.com>
Date: Mon, 18 May 2020 16:07:23 +0200
From: Bartosz Golaszewski <brgl@...ev.pl>
To: Arnd Bergmann <arnd@...db.de>
Cc: Jonathan Corbet <corbet@....net>, Rob Herring <robh+dt@...nel.org>,
"David S . Miller" <davem@...emloft.net>,
Matthias Brugger <matthias.bgg@...il.com>,
John Crispin <john@...ozen.org>,
Sean Wang <sean.wang@...iatek.com>,
Mark Lee <Mark-MC.Lee@...iatek.com>,
Jakub Kicinski <kuba@...nel.org>,
Fabien Parent <fparent@...libre.com>,
Heiner Kallweit <hkallweit1@...il.com>,
Edwin Peer <edwin.peer@...adcom.com>,
DTML <devicetree@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Networking <netdev@...r.kernel.org>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
"moderated list:ARM/Mediatek SoC..."
<linux-mediatek@...ts.infradead.org>,
Stephane Le Provost <stephane.leprovost@...iatek.com>,
Pedro Tsai <pedro.tsai@...iatek.com>,
Andrew Perepech <andrew.perepech@...iatek.com>,
Bartosz Golaszewski <bgolaszewski@...libre.com>
Subject: Re: [PATCH v3 10/15] net: ethernet: mtk-eth-mac: new driver
On Fri, 15 May 2020 at 15:32, Arnd Bergmann <arnd@...db.de> wrote:
>
> On Thu, May 14, 2020 at 10:00 AM Bartosz Golaszewski <brgl@...ev.pl> wrote:
> > +static int mtk_mac_ring_pop_tail(struct mtk_mac_ring *ring,
> > + struct mtk_mac_ring_desc_data *desc_data)
>
> I took another look at this function because of your comment on locking
> the descriptor updates, which seemed suspicious as the device side does not
> actually use the locks to access them.
>
> > +{
> > + struct mtk_mac_ring_desc *desc = &ring->descs[ring->tail];
> > + unsigned int status;
> > +
> > + /* Let the device release the descriptor. */
> > + dma_rmb();
> > + status = desc->status;
> > + if (!(status & MTK_MAC_DESC_BIT_COWN))
> > + return -1;
>
> The dma_rmb() seems odd here, as I don't see which prior read
> is being protected by this.
>
> > + desc_data->len = status & MTK_MAC_DESC_MSK_LEN;
> > + desc_data->flags = status & ~MTK_MAC_DESC_MSK_LEN;
> > + desc_data->dma_addr = ring->dma_addrs[ring->tail];
> > + desc_data->skb = ring->skbs[ring->tail];
> > +
> > + desc->data_ptr = 0;
> > + desc->status = MTK_MAC_DESC_BIT_COWN;
> > + if (status & MTK_MAC_DESC_BIT_EOR)
> > + desc->status |= MTK_MAC_DESC_BIT_EOR;
> > +
> > + /* Flush writes to descriptor memory. */
> > + dma_wmb();
>
> The comment and the barrier here seem odd as well. I would have expected
> a barrier after the update to the data pointer, and only a single store
> but no read of the status flag instead of the read-modify-write,
> something like
>
> desc->data_ptr = 0;
> dma_wmb(); /* make pointer update visible before status update */
> desc->status = MTK_MAC_DESC_BIT_COWN | (status & MTK_MAC_DESC_BIT_EOR);
>
> > + ring->tail = (ring->tail + 1) % MTK_MAC_RING_NUM_DESCS;
> > + ring->count--;
>
> I would get rid of the 'count' here, as it duplicates the information
> that is already known from the difference between head and tail, and you
> can't update it atomically without holding a lock around the access to
> the ring. The way I'd do this is to have the head and tail pointers
> in separate cache lines, and then use READ_ONCE/WRITE_ONCE
> and smp barriers to access them, with each one updated on one
> thread but read by the other.
>
Your previous solution seems much more reliable, though. Take the TX
cleanup path above: after the TX-ready irq we iterate over the
descriptors until either no packets remain scheduled (count == 0) or we
hit one that is still owned by DMA. Meanwhile, a parallel TX path can
queue new packets to be sent. I don't see how we can atomically check
the count (understood as the difference between tail and head) and then
run another iteration (where we'd modify the head or tail) without
risking the other path getting in the way. We'd always have to check
the descriptor as well.

I experimented a bit with this and couldn't come up with anything that
would pass a stress test.
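For reference, the head/tail scheme discussed above can be sketched in
userspace roughly as follows. This is only an analog under stated
assumptions, not driver code: `struct ring`, `ring_push()`, `ring_pop()`
and `ring_count()` are hypothetical names, the ring size is made up, and
C11 acquire/release atomics stand in for the kernel's
READ_ONCE/WRITE_ONCE plus smp barriers. It also assumes exactly one
producer and one consumer. Note that `ring_count()` only returns a
snapshot, which is why a real cleanup path would still have to check
each descriptor.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_NUM_DESCS 8u /* hypothetical size, must be a power of two */
static_assert((RING_NUM_DESCS & (RING_NUM_DESCS - 1)) == 0,
	      "free-running indices need a power-of-two ring size");

struct ring {
	/* Written only by the producer, read by the consumer. */
	_Alignas(64) atomic_uint head;
	/* Written only by the consumer, read by the producer. */
	_Alignas(64) atomic_uint tail;
	int slots[RING_NUM_DESCS];
};

/*
 * Snapshot of the number of queued entries. Tail is read first so the
 * subtraction cannot underflow, but the other side may move its index
 * right after we read it, so the result can already be stale.
 */
static unsigned int ring_count(struct ring *r)
{
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_acquire);
	unsigned int head = atomic_load_explicit(&r->head, memory_order_acquire);

	return head - tail;
}

/* Producer side: returns false if the ring is full. */
static bool ring_push(struct ring *r, int val)
{
	unsigned int head = atomic_load_explicit(&r->head, memory_order_relaxed);
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head - tail == RING_NUM_DESCS)
		return false;

	r->slots[head % RING_NUM_DESCS] = val;
	/* Publish the slot contents before the new head becomes visible. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return true;
}

/* Consumer side: returns false if the ring is empty. */
static bool ring_pop(struct ring *r, int *val)
{
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	unsigned int head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (head == tail)
		return false;

	*val = r->slots[tail % RING_NUM_DESCS];
	/* Finish reading the slot before handing it back to the producer. */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return true;
}
```

Each index has a single writer, so no lock is needed for the indices
themselves. What the sketch does not model is the device completing
descriptors asynchronously.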
On the other hand, spin_lock_bh() works fine, and I like your approach
from the previous e-mail. The one exception is the work for updating
stats: we could potentially lose some stats when we're updating in
process context while the RX/TX paths run in parallel in napi context,
but that would be rare enough to overlook.
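As a rough illustration of the locked variant, here is a userspace
sketch of a cleanup loop that stops on count == 0 or on a descriptor
the device still owns. Everything in it is an assumption made for the
example: the names `ring_push_head()` and `ring_tx_complete()`, the
ring size, the bit position of `DESC_BIT_COWN`, and a C11 `atomic_flag`
spinlock standing in for spin_lock_bh(). Whether the real driver holds
the lock across the whole loop or per descriptor is not stated here.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NUM_DESCS 4u             /* hypothetical ring size */
#define DESC_BIT_COWN (1u << 31) /* hypothetical: set when the CPU owns it */

struct desc {
	uint32_t status;
};

struct ring {
	atomic_flag lock; /* stand-in for the driver's spinlock */
	struct desc descs[NUM_DESCS];
	unsigned int head, tail, count;
};

static void ring_lock(struct ring *r)
{
	while (atomic_flag_test_and_set_explicit(&r->lock, memory_order_acquire))
		; /* spin until the holder clears the flag */
}

static void ring_unlock(struct ring *r)
{
	atomic_flag_clear_explicit(&r->lock, memory_order_release);
}

/* TX path: queue one packet. Returns 0 on success, -1 if the ring is full. */
static int ring_push_head(struct ring *r)
{
	int ret = -1;

	ring_lock(r);
	if (r->count < NUM_DESCS) {
		/* Clearing COWN hands the descriptor to the device. */
		r->descs[r->head].status = 0;
		r->head = (r->head + 1) % NUM_DESCS;
		r->count++;
		ret = 0;
	}
	ring_unlock(r);
	return ret;
}

/*
 * Cleanup path: reclaim descriptors until none are scheduled or the
 * next one is still owned by the device. Returns the number reclaimed.
 */
static unsigned int ring_tx_complete(struct ring *r)
{
	unsigned int done = 0;

	ring_lock(r);
	while (r->count > 0 && (r->descs[r->tail].status & DESC_BIT_COWN)) {
		r->tail = (r->tail + 1) % NUM_DESCS;
		r->count--;
		done++;
	}
	ring_unlock(r);
	return done;
}
```

With the lock held, the count check and the tail update form one
critical section, so a concurrent ring_push_head() cannot slip in
between them.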
I hope v4 will be good enough even with spinlocks. :)
Bart