[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6b88b4c5cf0ba7de2a639633ffbd5ceb@rmail.be>
Date: Tue, 19 Mar 2024 22:11:46 +0100
From: Maarten <maarten@...il.be>
To: Florian Fainelli <florian.fainelli@...adcom.com>,
netdev@...r.kernel.org, Broadcom internal kernel review list
<bcm-kernel-feedback-list@...adcom.com>
Cc: Doug Berger <opendmb@...il.com>, Phil Elwell <phil@...pberrypi.com>
Subject: Re: [PATCH] net: bcmgenet: Reset RBUF on first open
Florian Fainelli schreef op 2024-03-19 17:56:
> On 3/16/24 04:53, Maarten wrote:
>> Doug Berger schreef op 2024-02-27 00:13:
>>> On 2/26/2024 9:34 AM, Florian Fainelli wrote:
>>>> On 2/23/24 15:53, Maarten Vanraes wrote:
>>>>> From: Phil Elwell <phil@...pberrypi.com>
>>>>>
>>>>> If the RBUF logic is not reset when the kernel starts then there
>>>>> may be some data left over from any network boot loader. If the
>>>>> 64-byte packet headers are enabled then this can be fatal.
>>>>>
>>>>> Extend bcmgenet_dma_disable to do perform the reset, but not when
>>>>> called from bcmgenet_resume in order to preserve a wake packet.
>>>>>
>>>>> N.B. This different handling of resume is just based on a hunch -
>>>>> why else wouldn't one reset the RBUF as well as the TBUF? If this
>>>>> isn't the case then it's easy to change the patch to make the RBUF
>>>>> reset unconditional.
>>>>
>>>> The real question is why is not the boot loader putting the GENET
>>>> core into a quasi power-on-reset state, since this is what Linux
>>>> expects, and also it seems the most conservative and prudent
>>>> approach. Assuming the RDMA and Unimac RX are disabled, otherwise we
>>>> would happily continuing to accept packets in DRAM, then the
>>>> question is why is not the RBUF flushed too, or is it flushed, but
>>>> this is insufficient, if so, have we determined why?
>>>>
>>>>>
>>>>> See: https://github.com/raspberrypi/linux/issues/3850
>>>>>
>>>>> Signed-off-by: Phil Elwell <phil@...pberrypi.com>
>>>>> Signed-off-by: Maarten Vanraes <maarten@...il.be>
>>>>> ---
>>>>> drivers/net/ethernet/broadcom/genet/bcmgenet.c | 16
>>>>> ++++++++++++----
>>>>> 1 file changed, 12 insertions(+), 4 deletions(-)
>>>>>
>>>>> This patch fixes a problem on RPI 4B where in ~2/3 cases (if you're
>>>>> using
>>>>> nfsroot), you fail to boot; or at least the boot takes longer than
>>>>> 30 minutes.
>>>>
>>>> This makes me wonder whether this also fixes the issues that Maxime
>>>> reported a long time ago, which I can reproduce too, but have not
>>>> been able to track down the source of:
>>>>
>>>> https://lore.kernel.org/linux-kernel/20210706081651.diwks5meyaighx3e@gilmour/
>>>>
>>>>>
>>>>> Doing a simple ping revealed that when the ping starts working
>>>>> again
>>>>> (during the boot process), you have ping timings of ~1000ms, 2000ms
>>>>> or
>>>>> even 3000ms; while in normal cases it would be around 0.2ms.
>>>>
>>>> I would prefer that we find a way to better qualify whether a RBUF
>>>> reset is needed or not, but I suppose there is not any other way,
>>>> since there is an "RBUF enabled" bit that we can key off.
>>>>
>>>> Doug, what do you think?
>>> I agree that the Linux driver expects the GENET core to be in a
>>> "quasi
>>> power-on-reset state" and it seems likely that in both Maxime's case
>>> and the one identified here that is not the case. It would appear
>>> that
>>> the Raspberry Pi bootloader and/or "firmware" are likely not
>>> disabling
>>> the GENET receiver after loading the kernel image and before invoking
>>> the kernel. They may be disabling the DMA, but that is insufficient
>>> since any received data would likely overflow the RBUF leaving it in
>>> a
>>> "bad" state which this patch apparently improves.
>>>
>>> So it seems likely these issues are caused by improper
>>> bootloader/firmware behavior.
>>>
>>> That said, I suppose it would be nice if the driver were more robust.
>>> However, we both know how finicky the receive path of the GENET core
>>> can be about its initialization. Therefore, I am unwilling to "bless"
>>> this change for upstream without more due diligence on our side.
>>
>> Hey, did you guys have any chance to check this stuff out? any
>> thoughts on it?
>
> We are both busy with higher priority work and I cannot see us being
> able to dedicate any time to this issue until April.
>
> While we are sympathetic to your issue and you having upstreamed a fix
> for it, it is entirely self inflicted by having the VPU boot loader
> firmware not properly quiesce the GENET controller, at least based
> upon the description, therefore the natural fix should be... in the
> firmware.
I totally agree that the natural fix should be in the firmware.
> From my perspective: NAK.
Fair enough, though I do think that there are often workarounds for
faulty firmware, and making the driver more robust is not a bad thing
either.
In any case, I try to raise this issue with the raspberry pi people in
the hopes of fixing this in the proper place.
Thanks for the response,
Maarten
Powered by blists - more mailing lists