Message-ID: <CADZJnBaLPu=qhozN7gyof+vGGynVO=7cGfS05fEBbEj9Nmj67Q@mail.gmail.com>
Date: Wed, 22 Sep 2021 15:33:21 -0700
From: John Smith <4eur0pe2006@...il.com>
To: Andrew Lunn <andrew@...n.ch>, netdev@...r.kernel.org
Subject: Re: stmmac: Disappointing or normal DMA performance?
On Wed, Sep 22, 2021 at 12:35 PM Andrew Lunn <andrew@...n.ch> wrote:
>
> On Wed, Sep 22, 2021 at 01:48:36AM -0700, John Smith wrote:
> > I have a one-way 300Mbs traffic RGMII arriving at a stmmac version
> > 3.7, in the form of 30000 1280-byte frames per second, evenly spread.
> >
> > In NAPI poll mode, at each DMA interrupt, I get around 10 frames. More
> > precisely:
> >
> > In stmmac_rx of stmmac_main.c:
> >
> > static int stmmac_rx(struct stmmac_priv *priv, int limit) {
> > ...
> > while (count < limit)
> >
> > count is around 10 when NAPI limit/weight is 64. It means that I get
> > 3000 DMA IRQs per second for my 30000 packets.
>
> I assume it exists the loop here:
>
> /* check if managed by the DMA otherwise go ahead */
> if (unlikely(status & dma_own))
> break;
>
> Calling stmmac_display_ring() every interrupt is too expensive, but
> maybe do it every 1000. Extend the dump so it includes des0. You can
> then check there really are 10 packets ready to be received, not more?
>
> I suppose another interesting thing to try. Get the driver to do
> nothing every other RX interrupt. Do you get the same number of frames
> per second, but now 20 per stmmac_rx()? That will tell you if it is
> some sort of hardware limit or not. I guess then check that interrupt
> disable/enable is actually being performed, is it swapping between
> interrupt driven and polling?
>
> Andrew

Yes, stmmac_rx returns at the dma_own test after about 10 frames on
each interrupt.
I tried overriding that check but, unsurprisingly, it leads to crashes.

The problem is that the hardware watchdog, even at its maximum value of
0xff, expires after 326us and generates the interrupt. I think that it
is triggered when the internal RX FIFO in the hardware block is full.
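
As a sanity check on the figures above, here is a minimal sketch of the arithmetic in C. It only uses numbers quoted in this thread (30000 frames per second, 1280-byte frames, a 326us watchdog period); the function names are hypothetical and this is not driver code.

```c
#include <stdio.h>

/* Figures quoted in this thread. */
#define FRAMES_PER_SEC 30000.0 /* evenly spread frames */
#define FRAME_BYTES    1280.0  /* bytes per frame */
#define WATCHDOG_USEC  326.0   /* watchdog period at its maximum value 0xff */

/* Bit rate in Mbit/s for a given frame rate and frame size. */
double line_rate_mbps(double frames_per_sec, double frame_bytes)
{
	return frames_per_sec * frame_bytes * 8.0 / 1e6;
}

/* Average number of frames arriving during one watchdog period. */
double frames_per_watchdog(double frames_per_sec, double watchdog_usec)
{
	return frames_per_sec * watchdog_usec / 1e6;
}

/* Interrupts per second if the watchdog fires once per period. */
double irqs_per_sec(double watchdog_usec)
{
	return 1e6 / watchdog_usec;
}

/* Print the three estimates for the figures quoted above. */
void print_rx_estimates(void)
{
	printf("line rate:         %.1f Mbit/s\n",
	       line_rate_mbps(FRAMES_PER_SEC, FRAME_BYTES));
	printf("frames per period: %.2f\n",
	       frames_per_watchdog(FRAMES_PER_SEC, WATCHDOG_USEC));
	printf("IRQs per second:   %.0f\n", irqs_per_sec(WATCHDOG_USEC));
}
```

Calling print_rx_estimates() from a main() gives roughly 307 Mbit/s, just under 10 frames per watchdog period, and a little over 3000 interrupts per second, which is consistent with the observed 10 frames per stmmac_rx call and 3000 DMA IRQs per second.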

If I understand correctly, your suggestion is to live with the interrupt
but not do the heavy lifting in every callback, and instead call
stmmac_rx only on every 10th or 100th DMA callback. I tried a simple
hack but it doesn't seem to work. Perhaps I misunderstood the
suggestion. I'm not sure it can work, because the DMA buffer needs to be
emptied and blanked at each interrupt; otherwise data will be lost or
the driver will be unhappy.
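
The concern about deferring the work can be illustrated with a toy model, sketched here in C. It is not driver code: simulate_dropped() and its parameters are hypothetical, the ring size of 512 descriptors used in the note below is an assumption for illustration only, and the model simply assumes that about 10 frames arrive per interrupt and that frames are dropped once no free descriptor is left.

```c
#include <stdio.h>

/*
 * Toy model, not driver code: count the frames dropped when the RX ring
 * is drained only on every drain_every-th interrupt.
 *
 * ring_size       number of RX descriptors (assumed value, see text)
 * frames_per_irq  frames arriving between two interrupts
 * drain_every     drain the ring on every drain_every-th interrupt
 * num_irqs        number of interrupts to simulate
 */
unsigned long simulate_dropped(unsigned int ring_size,
			       unsigned int frames_per_irq,
			       unsigned int drain_every,
			       unsigned int num_irqs)
{
	unsigned int used = 0;      /* descriptors currently holding a frame */
	unsigned long dropped = 0;  /* frames that found no free descriptor */
	unsigned int irq;

	if (drain_every == 0)
		drain_every = 1;    /* treat 0 as "drain on every interrupt" */

	for (irq = 1; irq <= num_irqs; irq++) {
		unsigned int free_slots = ring_size - used;
		unsigned int accepted = frames_per_irq < free_slots ?
					frames_per_irq : free_slots;

		used += accepted;
		dropped += frames_per_irq - accepted;

		/* The deferred "stmmac_rx" pass: hand all descriptors back. */
		if (irq % drain_every == 0)
			used = 0;
	}
	return dropped;
}
```

Under these assumptions, simulate_dropped(512, 10, 10, 100) returns 0, so draining on every 10th interrupt loses nothing, while simulate_dropped(512, 10, 100, 100) returns 488, because about 1000 frames arrive between drains and only 512 fit in the ring. Whether the real hardware behaves this way when interrupts are ignored is exactly the open question.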

Also, if I send frames of 1280x3 bytes, I get one third as many
interrupts, so it really seems to be a FIFO depth limitation.

Thank you for helping! I think only the ST team knows the hardware
limitation, and they might have some ideas for a workaround for this
specific case, but they don't reply...
John