lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 22 Nov 2018 20:48:38 +0100
From:   Heiner Kallweit <hkallweit1@...il.com>
To:     Andrew Lunn <andrew@...n.ch>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>
Cc:     Marc Dionne <marc.c.dionne@...il.com>, norbert.jurkeit@....de,
        nic_swsd@...ltek.com, Florian Fainelli <f.fainelli@...il.com>,
        David Miller <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        michael.wiktowy@...il.com, jcline@...hat.com
Subject: Re: Issue with RTL8111 NIC after upgrade to kernel 4.19

On 22.11.2018 19:57, Andrew Lunn wrote:
>> Thanks a lot for testing. Could you please test also the following
>> as an alternative to the delay?
>>
>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>> index 55202a0ac..aeccb2323 100644
>> --- a/drivers/net/phy/phy_device.c
>> +++ b/drivers/net/phy/phy_device.c
>> @@ -2254,6 +2254,7 @@ int phy_driver_register(struct phy_driver *new_driver, struct module *owner)
>>         new_driver->mdiodrv.driver.probe = phy_probe;
>>         new_driver->mdiodrv.driver.remove = phy_remove;
>>         new_driver->mdiodrv.driver.owner = owner;
>> +       new_driver->mdiodrv.driver.probe_type = PROBE_FORCE_SYNCHRONOUS;
>>
>>         retval = driver_register(&new_driver->mdiodrv.driver);
>>         if (retval) {
> 
> 
> Humm, maybe i don't understand the issue correctly.
> 
> When the MDIO bus is registered, we scan the bus looking for PHYs.
> When we find a PHY, we call phy_device_create(). That will then
> trigger the loading of the kernel module which should driver this phy.
> 
> Sometime later, the PHY driver module gets loaded and calls
> phy_drivers_register() to register the list of IDs it supports.  The
> driver core will then call phy_bus_match() to see if the newly loaded
> driver matches to a device we have created. If so, the driver will be
> associated to the device.
> 
request_module() is synchronous and should return only once the init
function of the loaded module has been executed. Means in case of
phylib: all PHY drivers of the module have been registered

> Sometime later, the MAC tries to attach to the phy using
> phy_attach_direct(). If there is no driver associated to the device,
> we use the generic PHY driver.
> 
> I thought the issue was the race condition between loading the module
> and the MAC attaching to it? We are getting the generic driver because
> the specific driver is still loading?
> 
No. The issue happens before: For whatever reason the PHY driver doesn't
bind to the PHY device, even though they match. Therefore
phy_attach_direct() then binds the genphy driver.

In theory it shouldn't make a difference whether device or driver is
registered first. When a driver is registered it checks all unbound
devices on the same bus for a match.

Standard is asynchronous driver probing, therefore in case of phylib
(in at least a lot of cases) the driver probes the devices on the bus
once the PHY device was registered. This seems to fail and I have no
idea why.

It has been reported that loading the PHY driver module upfront fixes
the issue. Therefore I assumed it may help to let the PHY driver probe
for devices earlier (even though the device isn't even registered yet).
And indeed (see mail just sent by Marc) using synchronous probing
fixes the issue.
It seems to me that probing triggered from device registering and
probing triggered from driver registering somehow interfere.

Of course it's not very satisfying to have a fix but to not understand
the root cause of the issue. I add Greg and Rafael as maintainers of
the driver core to the discussion. Maybe they can shed some light on
the situation.

> If that really is the issue, i think phy_attach_direct() should try
> loading the module again, doing it synchronously. Only when it fails
> should the generic driver be associated to the device.
> 
>        Andrew
> 
> 
Heiner

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ