lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 26 Feb 2021 11:08:43 +0100
From:   Heiner Kallweit <hkallweit1@...il.com>
To:     Daniel González Cabanelas <dgcbueu@...il.com>
Cc:     Andrew Lunn <andrew@...n.ch>,
        Florian Fainelli <f.fainelli@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>, gregkh@...uxfoundation.org,
        netdev@...r.kernel.org,
        Álvaro Fernández Rojas <noltari@...il.com>
Subject: Re: [PATCH v2] bcm63xx_enet: fix internal phy IRQ assignment

On 26.02.2021 10:49, Daniel González Cabanelas wrote:
> El vie, 26 feb 2021 a las 10:32, Heiner Kallweit
> (<hkallweit1@...il.com>) escribió:
>>
>> On 26.02.2021 10:10, Daniel González Cabanelas wrote:
>>> El vie, 26 feb 2021 a las 8:13, Heiner Kallweit
>>> (<hkallweit1@...il.com>) escribió:
>>>>
>>>> On 25.02.2021 23:28, Daniel González Cabanelas wrote:
>>>>> El jue, 25 feb 2021 a las 21:05, Heiner Kallweit
>>>>> (<hkallweit1@...il.com>) escribió:
>>>>>>
>>>>>> On 25.02.2021 17:36, Daniel González Cabanelas wrote:
>>>>>>> El jue, 25 feb 2021 a las 8:22, Heiner Kallweit
>>>>>>> (<hkallweit1@...il.com>) escribió:
>>>>>>>>
>>>>>>>> On 25.02.2021 00:54, Daniel González Cabanelas wrote:
>>>>>>>>> El mié, 24 feb 2021 a las 23:01, Florian Fainelli
>>>>>>>>> (<f.fainelli@...il.com>) escribió:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2/24/2021 1:44 PM, Heiner Kallweit wrote:
>>>>>>>>>>> On 24.02.2021 16:44, Daniel González Cabanelas wrote:
>>>>>>>>>>>> The current bcm63xx_enet driver doesn't asign the internal phy IRQ. As a
>>>>>>>>>>>> result of this it works in polling mode.
>>>>>>>>>>>>
>>>>>>>>>>>> Fix it using the phy_device structure to assign the platform IRQ.
>>>>>>>>>>>>
>>>>>>>>>>>> Tested under a BCM6348 board. Kernel dmesg before the patch:
>>>>>>>>>>>>    Broadcom BCM63XX (1) bcm63xx_enet-0:01: attached PHY driver [Broadcom
>>>>>>>>>>>>               BCM63XX (1)] (mii_bus:phy_addr=bcm63xx_enet-0:01, irq=POLL)
>>>>>>>>>>>>
>>>>>>>>>>>> After the patch:
>>>>>>>>>>>>    Broadcom BCM63XX (1) bcm63xx_enet-0:01: attached PHY driver [Broadcom
>>>>>>>>>>>>               BCM63XX (1)] (mii_bus:phy_addr=bcm63xx_enet-0:01, irq=17)
>>>>>>>>>>>>
>>>>>>>>>>>> Pluging and uplugging the ethernet cable now generates interrupts and the
>>>>>>>>>>>> PHY goes up and down as expected.
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Daniel González Cabanelas <dgcbueu@...il.com>
>>>>>>>>>>>> ---
>>>>>>>>>>>> changes in V2:
>>>>>>>>>>>>   - snippet moved after the mdiobus registration
>>>>>>>>>>>>   - added missing brackets
>>>>>>>>>>>>
>>>>>>>>>>>>  drivers/net/ethernet/broadcom/bcm63xx_enet.c | 13 +++++++++++--
>>>>>>>>>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/drivers/net/ethernet/broadcom/bcm63xx_enet.c b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
>>>>>>>>>>>> index fd876721316..dd218722560 100644
>>>>>>>>>>>> --- a/drivers/net/ethernet/broadcom/bcm63xx_enet.c
>>>>>>>>>>>> +++ b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
>>>>>>>>>>>> @@ -1818,10 +1818,19 @@ static int bcm_enet_probe(struct platform_device *pdev)
>>>>>>>>>>>>               * if a slave is not present on hw */
>>>>>>>>>>>>              bus->phy_mask = ~(1 << priv->phy_id);
>>>>>>>>>>>>
>>>>>>>>>>>> -            if (priv->has_phy_interrupt)
>>>>>>>>>>>> +            ret = mdiobus_register(bus);
>>>>>>>>>>>> +
>>>>>>>>>>>> +            if (priv->has_phy_interrupt) {
>>>>>>>>>>>> +                    phydev = mdiobus_get_phy(bus, priv->phy_id);
>>>>>>>>>>>> +                    if (!phydev) {
>>>>>>>>>>>> +                            dev_err(&dev->dev, "no PHY found\n");
>>>>>>>>>>>> +                            goto out_unregister_mdio;
>>>>>>>>>>>> +                    }
>>>>>>>>>>>> +
>>>>>>>>>>>>                      bus->irq[priv->phy_id] = priv->phy_interrupt;
>>>>>>>>>>>> +                    phydev->irq = priv->phy_interrupt;
>>>>>>>>>>>> +            }
>>>>>>>>>>>>
>>>>>>>>>>>> -            ret = mdiobus_register(bus);
>>>>>>>>>>>
>>>>>>>>>>> You shouldn't have to set phydev->irq, this is done by phy_device_create().
>>>>>>>>>>> For this to work bus->irq[] needs to be set before calling mdiobus_register().
>>>>>>>>>>
>>>>>>>>>> Yes good point, and that is what the unchanged code does actually.
>>>>>>>>>> Daniel, any idea why that is not working?
>>>>>>>>>
>>>>>>>>> Hi Florian, I don't know. bus->irq[] has no effect, only assigning the
>>>>>>>>> IRQ through phydev->irq works.
>>>>>>>>>
>>>>>>>>> I can resend the patch  without the bus->irq[] line since it's
>>>>>>>>> pointless in this scenario.
>>>>>>>>>
>>>>>>>>
>>>>>>>> It's still an ugly workaround and a proper root cause analysis should be done
>>>>>>>> first. I can only imagine that phydev->irq is overwritten in phy_probe()
>>>>>>>> because phy_drv_supports_irq() is false. Can you please check whether
>>>>>>>> phydev->irq is properly set in phy_device_create(), and if yes, whether
>>>>>>>> it's reset to PHY_POLL in phy_probe()?.
>>>>>>>>
>>>>>>>
>>>>>>> Hi Heiner, I added some kernel prints:
>>>>>>>
>>>>>>> [    2.712519] libphy: Fixed MDIO Bus: probed
>>>>>>> [    2.721969] =======phy_device_create===========
>>>>>>> [    2.726841] phy_device_create: dev->irq = 17
>>>>>>> [    2.726841]
>>>>>>> [    2.832620] =======phy_probe===========
>>>>>>> [    2.836846] phy_probe: phydev->irq = 17
>>>>>>> [    2.840950] phy_probe: phy_drv_supports_irq = 0, phy_interrupt_is_valid = 1
>>>>>>> [    2.848267] phy_probe: phydev->irq = -1
>>>>>>> [    2.848267]
>>>>>>> [    2.854059] =======phy_probe===========
>>>>>>> [    2.858174] phy_probe: phydev->irq = -1
>>>>>>> [    2.862253] phy_probe: phydev->irq = -1
>>>>>>> [    2.862253]
>>>>>>> [    2.868121] libphy: bcm63xx_enet MII bus: probed
>>>>>>> [    2.873320] Broadcom BCM63XX (1) bcm63xx_enet-0:01: attached PHY
>>>>>>> driver [Broadcom BCM63XX (1)] (mii_bus:phy_addr=bcm63xx_enet-0:01,
>>>>>>> irq=POLL)
>>>>>>>
>>>>>>> Currently using kernel 5.4.99. I still have no idea what's going on.
>>>>>>>
>>>>>> Thanks for debugging. This confirms my assumption that the interrupt
>>>>>> is overwritten in phy_probe(). I'm just scratching my head how
>>>>>> phy_drv_supports_irq() can return 0. In 5.4.99 it's defined as:
>>>>>>
>>>>>> static bool phy_drv_supports_irq(struct phy_driver *phydrv)
>>>>>> {
>>>>>>         return phydrv->config_intr && phydrv->ack_interrupt;
>>>>>> }
>>>>>>
>>>>>> And that's the PHY driver:
>>>>>>
>>>>>> static struct phy_driver bcm63xx_driver[] = {
>>>>>> {
>>>>>>         .phy_id         = 0x00406000,
>>>>>>         .phy_id_mask    = 0xfffffc00,
>>>>>>         .name           = "Broadcom BCM63XX (1)",
>>>>>>         /* PHY_BASIC_FEATURES */
>>>>>>         .flags          = PHY_IS_INTERNAL,
>>>>>>         .config_init    = bcm63xx_config_init,
>>>>>>         .ack_interrupt  = bcm_phy_ack_intr,
>>>>>>         .config_intr    = bcm63xx_config_intr,
>>>>>> }
>>>>>>
>>>>>> So both callbacks are set. Can you extend your debugging and check
>>>>>> in phy_drv_supports_irq() which of the callbacks is missing?
>>>>>>
>>>>>
>>>>> Hi, both callbacks are missing on the first check. However on the next
>>>>> calls they're there.
>>>>>
>>>>> [    2.263909] libphy: Fixed MDIO Bus: probed
>>>>> [    2.273026] =======phy_device_create===========
>>>>> [    2.277908] phy_device_create: dev->irq = 17
>>>>> [    2.277908]
>>>>> [    2.373104] =======phy_probe===========
>>>>> [    2.377336] phy_probe: phydev->irq = 17
>>>>> [    2.381445] phy_drv_supports_irq: phydrv->config_intr = 0,
>>>>> phydrv->ack_interrupt = 0
>>>>> [    2.389554] phydev->irq = PHY_POLL;
>>>>> [    2.393186] phy_probe: phydev->irq = -1
>>>>> [    2.393186]
>>>>> [    2.398987] =======phy_probe===========
>>>>> [    2.403108] phy_probe: phydev->irq = -1
>>>>> [    2.407195] phy_drv_supports_irq: phydrv->config_intr = 1,
>>>>> phydrv->ack_interrupt = 1
>>>>> [    2.415314] phy_probe: phydev->irq = -1
>>>>> [    2.415314]
>>>>> [    2.421189] libphy: bcm63xx_enet MII bus: probed
>>>>> [    2.426129] =======phy_connect===========
>>>>> [    2.430410] phy_drv_supports_irq: phydrv->config_intr = 1,
>>>>> phydrv->ack_interrupt = 1
>>>>> [    2.438537] phy_connect: phy_drv_supports_irq = 1
>>>>> [    2.438537]
>>>>> [    2.445284] Broadcom BCM63XX (1) bcm63xx_enet-0:01: attached PHY
>>>>> driver [Broadcom BCM63XX (1)] (mii_bus:phy_addr=bcm63xx_enet-0:01,
>>>>> irq=POLL)
>>>>>
>>>>
>>>> I'd like to understand why the phy_device is probed twice,
>>>> with which drivers it's probed.
>>>> Could you please add printing phydrv->name to phy_probe() ?
>>>>
>>>
>>> Hi Heiner, indeed there are two different probed devices. The B53
>>> switch driver is causing this issue.
>>>
>>> [    2.269595] libphy: Fixed MDIO Bus: probed
>>> [    2.278706] =======phy_device_create===========
>>> [    2.283594] phy_device_create: dev->irq = 17
>>> [    2.283594]
>>> [    2.379554] =======phy_probe===========
>>> [    2.383780] phy_probe: phydrv->name = Broadcom B53 (3)
>>
>> Is this an out-of-tree driver? I can't find this string in any
>> DSA or PHY driver.
>>
> 
> Yes it is.
> https://github.com/openwrt/openwrt/blob/master/target/linux/generic/files/drivers/net/phy/b53/b53_mdio.c#L421
> 

OK, I see. Then there's no reason to complain upstream.
Either use the mainline B53 DSA driver of fix interrupt mode
downstream.

>>
>>> [    2.389235] phy_probe: phydev->irq = 17
>>> [    2.393332] phy_drv_supports_irq: phydrv->config_intr = 0,
>>> phydrv->ack_interrupt = 0
>>> [    2.401445] phydev->irq = PHY_POLL
>>> [    2.405080] phy_probe: phydev->irq = -1
>>> [    2.405080]
>>> [    2.410878] =======phy_probe===========
>>> [    2.414996] phy_probe: phydrv->name = Broadcom BCM63XX (1)
>>> [    2.420791] phy_probe: phydev->irq = -1
>>> [    2.424876] phy_drv_supports_irq: phydrv->config_intr = 1,
>>> phydrv->ack_interrupt = 1
>>> [    2.432994] phy_probe: phydev->irq = -1
>>> [    2.432994]
>>> [    2.438862] libphy: bcm63xx_enet MII bus: probed
>>> [    2.443809] =======phy_connect===========
>>> [    2.448092] phy_drv_supports_irq: phydrv->config_intr = 1,
>>> phydrv->ack_interrupt = 1
>>> [    2.456215] phy_connect: phy_drv_supports_irq = 1
>>> [    2.456215]
>>> [    2.462961] Broadcom BCM63XX (1) bcm63xx_enet-0:01: attached PHY
>>> driver [Broadcom BCM63XX (1)] (mii_bus:phy_addr=bcm63xx_enet-0:01,
>>> irq=POLL)
>>>
>>> The board has no switch, it's a driver for other boards in OpenWrt. I
>>> forgot it wasn't upstreamed:
>>> https://github.com/openwrt/openwrt/tree/master/target/linux/generic/files/drivers/net/phy/b53
>>>
>>> I tested a kernel compiled without this driver, now the IRQ is
>>> detected as it should be:
>>>
>>> [    2.270707] libphy: Fixed MDIO Bus: probed
>>> [    2.279715] =======phy_device_create===========
>>> [    2.284600] phy_device_create: dev->irq = 17
>>> [    2.284600]
>>> [    2.373763] =======phy_probe===========
>>> [    2.377989] phy_probe: phydrv->name = Broadcom BCM63XX (1)
>>> [    2.383803] phy_probe: phydev->irq = 17
>>> [    2.387888] phy_drv_supports_irq: phydrv->config_intr = 1,
>>> phydrv->ack_interrupt = 1
>>> [    2.396007] phy_probe: phydev->irq = 17
>>> [    2.396007]
>>> [    2.401877] libphy: bcm63xx_enet MII bus: probed
>>> [    2.406820] =======phy_connect===========
>>> [    2.411099] phy_drv_supports_irq: phydrv->config_intr = 1,
>>> phydrv->ack_interrupt = 1
>>> [    2.419226] phy_connect: phy_drv_supports_irq = 1
>>> [    2.419226]
>>> [    2.429857] Broadcom BCM63XX (1) bcm63xx_enet-0:01: attached PHY
>>> driver [Broadcom BCM63XX (1)] (mii_bus:phy_addr=bcm63xx_enet-0:01,
>>> irq=17)
>>>
>>> Then, maybe this is an OpenWrt bug itself?
>>>
>>> Regards
>>> Daniel
>>>
>>>>
>>>>> I also added the prints to phy_connect.
>>>>>
>>>>>> Last but not least: Do you use a mainline kernel, or is it maybe
>>>>>> a modified downstream kernel? In the latter case, please check
>>>>>> in your kernel sources whether both callbacks are set.
>>>>>>
>>>>>
>>>>> It's a modified kernel, and the the callbacks are set. BTW I also
>>>>> tested the kernel with no patches concerning to the ethernet driver.
>>>>>
>>>>> Regards,
>>>>> Daniel
>>>>>
>>>>>>
>>>>>>
>>>>>>>> On which kernel version do you face this problem?
>>>>>>>>
>>>>>>> The kernel version 4.4 works ok. The minimum version where I found the
>>>>>>> problem were the kernel 4.9.111, now using 5.4. And 5.10 also tested.
>>>>>>>
>>>>>>> Regards
>>>>>>> Daniel
>>>>>>>
>>>>>>>>> Regards
>>>>>>>>>> --
>>>>>>>>>> Florian
>>>>>>>>
>>>>>>
>>>>
>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ