lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <78d30a5b-0308-7fd5-1b2c-7e9aeef238d8@systec-electronic.com>
Date: Fri, 19 Jan 2024 07:30:46 +0100 (CET)
From: Andre Werner <andre.werner@...tec-electronic.com>
To: Heiner Kallweit <hkallweit1@...il.com>
cc: Andrew Lunn <andrew@...n.ch>, 
    Andre Werner <andre.werner@...tec-electronic.com>, davem@...emloft.net, 
    edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com, 
    linux@...linux.org.uk, netdev@...r.kernel.org, 
    linux-kernel@...r.kernel.org
Subject: Re: [PATCH] net: phy: adin1100: Fix nullptr exception for phy
 interrupts

On Thu, 18 Jan 2024, Heiner Kallweit wrote:

> On 18.01.2024 21:09, Andrew Lunn wrote:
>> On Thu, Jan 18, 2024 at 06:36:16PM +0100, Heiner Kallweit wrote:
>>> On 18.01.2024 17:53, Andrew Lunn wrote:
>>>> On Thu, Jan 18, 2024 at 11:43:41AM +0100, Andre Werner wrote:
>>>>> If using ADIN1100 as an external phy, e.g. in combination with
>>>>> "smsc95xx", we ran into nullptr exception by creating a link.
>>>>>
>>>>> In our case the "smsc95xx" does not check for an available interrupt handler
>>>>> on external phy driver to use poll instead of interrupts if no handler is
>>>>> available. So we decide to implement a small handler in the phy driver
>>>>> to support other MACs as well.
>>>>>
>>>>> I update the driver to add an interrupt handler because libphy
>>>>> does not check if their is a interrupt handler available either.
>>>>>
>>>>> There are several interrupts maskable at the phy, but only link change interrupts
>>>>> are handled by the driver yet.
>>>>>
>>>>> We tested the combination "smsc95xx" and "adin1100" with Linux Kernel 6.6.9
>>>>> and Linux Kernel 6.1.0, respectively.
>>>>
>>>> Hi Andre
>>>>
>>>> A few different things....
>>>>
>>>> Please could you give more details of the null pointer
>>>> exception. phylib should test if the needed methods have been
>>>> implemented in the PHY driver, and not tried to use interrupts when
>>>> they are missing. It should of polled the PHY. So i would like to
>>>> understand what went wrong. Maybe we have a phylib core bug we should
>>>> be fixing. Or a bug in the smsc95xx driver.
>>>>
>>> Seems to be a bug in smsc95xx. The following is the relevant code piece.
>>>
>>> ret = mdiobus_register(pdata->mdiobus);
>>> 	if (ret) {
>>> 		netdev_err(dev->net, "Could not register MDIO bus\n");
>>> 		goto free_mdio;
>>> 	}
>>>
>>> 	pdata->phydev = phy_find_first(pdata->mdiobus);
>>> 	if (!pdata->phydev) {
>>> 		netdev_err(dev->net, "no PHY found\n");
>>> 		ret = -ENODEV;
>>> 		goto unregister_mdio;
>>> 	}
>>>
>>> 	pdata->phydev->irq = phy_irq;
>>>
>>> The interrupt is set too late, after phy_probe(), where we have this:
>>>
>>> if (!phy_drv_supports_irq(phydrv) && phy_interrupt_is_valid(phydev))
>>> 		phydev->irq = PHY_POLL;
>>>
>>> So we would have two steps:
>>> 1. Fix the smsc95xx bug (as the same issue could occur with another PHY type)
>>
>> Its not so nice to fix.
>>
>> Normally you would do something like:
>>
>>         pdata->mdiobus->priv = dev;
>>         pdata->mdiobus->read = smsc95xx_mdiobus_read;
>>         pdata->mdiobus->write = smsc95xx_mdiobus_write;
>>         pdata->mdiobus->reset = smsc95xx_mdiobus_reset;
>>         pdata->mdiobus->name = "smsc95xx-mdiobus";
>>         pdata->mdiobus->parent = &dev->udev->dev;
>>
>>         snprintf(pdata->mdiobus->id, ARRAY_SIZE(pdata->mdiobus->id),
>>                  "usb-%03d:%03d", dev->udev->bus->busnum, dev->udev->devnum);
>>
>> 	pdata->mdiobus->irq[X] = phy_irq;
>>
>> 	ret = mdiobus_register(pdata->mdiobus);
>>
>> By setting pdata->mdiobus->irq[X] before registering the PHY, the irq
>> number gets passed to the phydev->irq very early on. If everything is
>> O.K, interrupts are then used.
>>
>> However, because of the use of phy_find_first(), we have no idea what
>> address on the bus the PHY is using. So we don't know which member of
>> irq[] to set. Its not ideal, but one solution is to set them all.
>>
>> However, a better solution is to perform the validation again in
>> phy_attach_direct(). Add a second:
>>
>> 	if (!phy_drv_supports_irq(phydrv) && phy_interrupt_is_valid(phydev))
>>         	phydev->irq = PHY_POLL;

I add this check into phy_device.c-> phy_attach_direct as a work around
for now. I will send a new patch set to net-next as suggested.

>>
> This would save us here, but can't prevent that phydev->irq may be set
> even later. I think, ideally nobody should ever access phydev->irq directly.
> There should be a setter which performs the needed checks.
> But it may be a longer journey to make parts of struct phy_device private
> to phylib.
>
>> which will force phydev->irq back to polling.
>>
>>       Andrew
>
> Heiner
>

Regards,

Andre

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ