Message-ID: <33ba4e9cde1ccd1c9f561873782478a913eab670.camel@redhat.com>
Date: Tue, 27 Feb 2024 13:53:03 +0100
From: Paolo Abeni <pabeni@...hat.com>
To: Doug Berger <opendmb@...il.com>, Florian Fainelli
<florian.fainelli@...adcom.com>, Maarten Vanraes <maarten@...il.be>
Cc: netdev@...r.kernel.org, Broadcom internal kernel review list
<bcm-kernel-feedback-list@...adcom.com>, Phil Elwell <phil@...pberrypi.com>
Subject: Re: [PATCH] net: bcmgenet: Reset RBUF on first open
On Mon, 2024-02-26 at 15:13 -0800, Doug Berger wrote:
> On 2/26/2024 9:34 AM, Florian Fainelli wrote:
> > On 2/23/24 15:53, Maarten Vanraes wrote:
> > > From: Phil Elwell <phil@...pberrypi.com>
> > >
> > > If the RBUF logic is not reset when the kernel starts then there
> > > may be some data left over from any network boot loader. If the
> > > 64-byte packet headers are enabled then this can be fatal.
> > >
> > > Extend bcmgenet_dma_disable to perform the reset, but not when
> > > called from bcmgenet_resume in order to preserve a wake packet.
> > >
> > > N.B. This different handling of resume is just based on a hunch -
> > > why else wouldn't one reset the RBUF as well as the TBUF? If this
> > > isn't the case then it's easy to change the patch to make the RBUF
> > > reset unconditional.
> >
> > The real question is why is not the boot loader putting the GENET core
> > into a quasi power-on-reset state, since this is what Linux expects, and
> > also it seems the most conservative and prudent approach. Assuming the
> > RDMA and Unimac RX are disabled, otherwise we would happily continue
> > to accept packets in DRAM, then the question is why is not the RBUF
> > flushed too, or is it flushed, but this is insufficient, if so, have we
> > determined why?
> >
> > >
> > > See: https://github.com/raspberrypi/linux/issues/3850
> > >
> > > Signed-off-by: Phil Elwell <phil@...pberrypi.com>
> > > Signed-off-by: Maarten Vanraes <maarten@...il.be>
> > > ---
> > > drivers/net/ethernet/broadcom/genet/bcmgenet.c | 16 ++++++++++++----
> > > 1 file changed, 12 insertions(+), 4 deletions(-)
> > >
> > > This patch fixes a problem on RPI 4B where, in ~2/3 of cases (if you're
> > > using nfsroot), the boot fails, or at least takes longer than
> > > 30 minutes.
> >
> > This makes me wonder whether this also fixes the issues that Maxime
> > reported a long time ago, which I can reproduce too, but have not been
> > able to track down the source of:
> >
> > https://lore.kernel.org/linux-kernel/20210706081651.diwks5meyaighx3e@gilmour/
> >
> > >
> > > Doing a simple ping revealed that when the ping starts working again
> > > (during the boot process), you have ping timings of ~1000ms, 2000ms or
> > > even 3000ms; while in normal cases it would be around 0.2ms.
> >
> > I would prefer that we find a way to better qualify whether a RBUF reset
> > is needed or not, but I suppose there is not any other way, since there
> > is an "RBUF enabled" bit that we can key off.
> >
> > Doug, what do you think?
> I agree that the Linux driver expects the GENET core to be in a "quasi
> power-on-reset state" and it seems likely that in both Maxime's case and
> the one identified here that is not the case. It would appear that the
> Raspberry Pi bootloader and/or "firmware" are likely not disabling the
> GENET receiver after loading the kernel image and before invoking the
> kernel. They may be disabling the DMA, but that is insufficient since
> any received data would likely overflow the RBUF leaving it in a "bad"
> state which this patch apparently improves.
>
> So it seems likely these issues are caused by improper
> bootloader/firmware behavior.
>
> That said, I suppose it would be nice if the driver were more robust.
> However, we both know how finicky the receive path of the GENET core can
> be about its initialization. Therefore, I am unwilling to "bless" this
> change for upstream without more due diligence on our side.
Could you please report back in a reasonable timeframe? The issue
addressed here looks relevant, and the patch is quite self-contained.

We can keep the patch in PW meanwhile.
Thanks,
Paolo