Date: Tue, 19 Mar 2024 09:56:24 -0700
From: Florian Fainelli <florian.fainelli@...adcom.com>
To: Maarten <maarten@...il.be>, Doug Berger <opendmb@...il.com>,
 netdev@...r.kernel.org,
 Broadcom internal kernel review list <bcm-kernel-feedback-list@...adcom.com>
Cc: Phil Elwell <phil@...pberrypi.com>
Subject: Re: [PATCH] net: bcmgenet: Reset RBUF on first open

On 3/16/24 04:53, Maarten wrote:
> Doug Berger schreef op 2024-02-27 00:13:
>> On 2/26/2024 9:34 AM, Florian Fainelli wrote:
>>> On 2/23/24 15:53, Maarten Vanraes wrote:
>>>> From: Phil Elwell <phil@...pberrypi.com>
>>>>
>>>> If the RBUF logic is not reset when the kernel starts then there
>>>> may be some data left over from any network boot loader. If the
>>>> 64-byte packet headers are enabled then this can be fatal.
>>>>
>>>> Extend bcmgenet_dma_disable to perform the reset, but not when
>>>> called from bcmgenet_resume in order to preserve a wake packet.
>>>>
>>>> N.B. This different handling of resume is just based on a hunch -
>>>> why else wouldn't one reset the RBUF as well as the TBUF? If this
>>>> isn't the case then it's easy to change the patch to make the RBUF
>>>> reset unconditional.
>>>
>>> The real question is why the boot loader is not putting the GENET
>>> core into a quasi power-on-reset state, since this is what Linux
>>> expects, and it also seems the most conservative and prudent
>>> approach. Assuming the RDMA and Unimac RX are disabled (otherwise we
>>> would happily continue to accept packets in DRAM), the question
>>> is why the RBUF is not flushed too, or, if it is flushed but this is
>>> insufficient, whether we have determined why.
>>>
>>>>
>>>> See: https://github.com/raspberrypi/linux/issues/3850
>>>>
>>>> Signed-off-by: Phil Elwell <phil@...pberrypi.com>
>>>> Signed-off-by: Maarten Vanraes <maarten@...il.be>
>>>> ---
>>>>   drivers/net/ethernet/broadcom/genet/bcmgenet.c | 16 ++++++++++++----
>>>>   1 file changed, 12 insertions(+), 4 deletions(-)
>>>>
>>>> This patch fixes a problem on RPI 4B where in ~2/3 cases (if you're
>>>> using nfsroot), you fail to boot; or at least the boot takes longer
>>>> than 30 minutes.
>>>
>>> This makes me wonder whether this also fixes the issues that Maxime 
>>> reported a long time ago, which I can reproduce too, but have not 
>>> been able to track down the source of:
>>>
>>> https://lore.kernel.org/linux-kernel/20210706081651.diwks5meyaighx3e@gilmour/
>>>
>>>>
>>>> Doing a simple ping revealed that when the ping starts working again
>>>> (during the boot process), you have ping timings of ~1000ms, 2000ms or
>>>> even 3000ms; while in normal cases it would be around 0.2ms.
>>>
>>> I would prefer that we find a way to better qualify whether a RBUF 
>>> reset is needed or not, but I suppose there is not any other way, 
>>> since there is an "RBUF enabled" bit that we can key off.
>>>
>>> Doug, what do you think?
>> I agree that the Linux driver expects the GENET core to be in a "quasi
>> power-on-reset state" and it seems likely that in both Maxime's case
>> and the one identified here that is not the case. It would appear that
>> the Raspberry Pi bootloader and/or "firmware" are likely not disabling
>> the GENET receiver after loading the kernel image and before invoking
>> the kernel. They may be disabling the DMA, but that is insufficient
>> since any received data would likely overflow the RBUF leaving it in a
>> "bad" state which this patch apparently improves.
>>
>> So it seems likely these issues are caused by improper
>> bootloader/firmware behavior.
>>
>> That said, I suppose it would be nice if the driver were more robust.
>> However, we both know how finicky the receive path of the GENET core
>> can be about its initialization. Therefore, I am unwilling to "bless"
>> this change for upstream without more due diligence on our side.
> 
> Hey, did you guys have any chance to check this stuff out? Any thoughts
> on it?

We are both busy with higher priority work and I cannot see us being 
able to dedicate any time to this issue until April.

While we are sympathetic to your issue and to you having submitted a fix
for it upstream, it is entirely self-inflicted: the VPU boot loader
firmware does not properly quiesce the GENET controller, at least based
upon the description. Therefore the natural fix should be... in the firmware.

From my perspective: NAK.
-- 
Florian
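
For readers following the thread, the behaviour described in the quoted
patch description (pulse an RBUF reset when DMA is disabled on open, skip
it on resume so a wake packet survives) can be sketched roughly as below.
This is not the patch itself and not the bcmgenet driver's API: the struct,
the function names, and the reset bit position are hypothetical stand-ins,
and the "register" is a plain variable rather than real hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: bit position chosen for illustration only. */
#define MOCK_RBUF_SW_RESET (1u << 0)

/* Hypothetical stand-in for the controller state. */
struct mock_genet {
    uint32_t rbuf_ctrl;   /* mock RBUF control register */
    unsigned rbuf_resets; /* completed reset pulses (set, then cleared) */
};

/* Mock register write: counts a reset when the reset bit is released. */
static void mock_rbuf_ctrl_write(struct mock_genet *g, uint32_t val)
{
    if ((g->rbuf_ctrl & MOCK_RBUF_SW_RESET) && !(val & MOCK_RBUF_SW_RESET))
        g->rbuf_resets++;
    g->rbuf_ctrl = val;
}

/*
 * Sketch of a "disable DMA" step that optionally resets the RBUF.
 * flush_rx == true  : open path, discard anything a boot loader left behind.
 * flush_rx == false : resume path, leave the RBUF alone.
 */
static void mock_dma_disable(struct mock_genet *g, bool flush_rx)
{
    assert(g != NULL);

    if (flush_rx) {
        uint32_t reg = g->rbuf_ctrl;

        /* Assert the reset bit, then restore the previous value. */
        mock_rbuf_ctrl_write(g, reg | MOCK_RBUF_SW_RESET);
        mock_rbuf_ctrl_write(g, reg & ~MOCK_RBUF_SW_RESET);
    }
}
```

Calling mock_dma_disable(&g, false) leaves the mock register untouched,
while mock_dma_disable(&g, true) produces exactly one reset pulse and
restores the other control bits. A real driver would also need whatever
delays the hardware requires between the two writes, which this sketch
does not model.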

