lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 18 Apr 2020 14:06:21 +0300
From:   Lauri Jakku <lauri.jakku@...inet.fi>
To:     Heiner Kallweit <hkallweit1@...il.com>
Cc:     Leon Romanovsky <leon@...nel.org>, netdev@...r.kernel.org,
        nic_swsd@...ltek.com
Subject: Re: NET: r8168/r8169 identifying fix

Hi,

On 17.4.2020 10.30, Lauri Jakku wrote:
> Hi,
>
> On 17.4.2020 9.23, Lauri Jakku wrote:
>>
>> On 16.4.2020 23.50, Heiner Kallweit wrote:
>>> On 16.04.2020 22:38, Lauri Jakku wrote:
>>>> Hi
>>>>
>>>> On 16.4.2020 23.10, Lauri Jakku wrote:
>>>>> On 16.4.2020 23.02, Heiner Kallweit wrote:
>>>>>> On 16.04.2020 21:58, Lauri Jakku wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 16.4.2020 21.37, Lauri Jakku wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On 16.4.2020 21.26, Heiner Kallweit wrote:
>>>>>>>>> On 16.04.2020 13:30, Lauri Jakku wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 5.6.3-2-MANJARO: stock manjaro kernel, without modifications 
>>>>>>>>>> --> network does not work
>>>>>>>>>>
>>>>>>>>>> 5.6.3-2-MANJARO-lja: No attach check, modified kernel (r8169 
>>>>>>>>>> mods only) --> network does not work
>>>>>>>>>>
>>>>>>>>>> 5.6.3-2-MANJARO-with-the-r8169-patch: phy patched + r8169 
>>>>>>>>>> mods -> devices show up ok, network works
>>>>>>>>>>
>>>>>>>>>> All different initcpio's have realtek.ko in them.
>>>>>>>>>>
>>>>>>>>> Thanks for the logs. Based on the logs you're presumable 
>>>>>>>>> affected by a known BIOS bug.
>>>>>>>>> Check bug tickets 202275 and 207203 at bugzilla.kernel.org.
>>>>>>>>> In the first referenced tickets it's about the same mainboard 
>>>>>>>>> (with earlier BIOS version).
>>>>>>>>> BIOS on this mainboard seems to not initialize the network 
>>>>>>>>> chip / PHY correctly, it reports
>>>>>>>>> a random number as PHY ID, resulting in no PHY driver being 
>>>>>>>>> found.
>>>>>>>>> Enable "Onboard LAN Boot ROM" in the BIOS, and your problem 
>>>>>>>>> should be gone.
>>>>>>>>>
>>>>>>>> OK, I try that, thank you :)
>>>>>>>>
>>>>>>> It seems that i DO have the ROM's enabled, i'm now testing some 
>>>>>>> mutex guard for phy state and try to use it as indicator
>>>>>>>
>>>>>>> that attach has been done. One thing i've noticed is that driver 
>>>>>>> needs to be reloaded to allow traffic (ie. ping works etc.)
>>>>>>>
>>>>>> All that shouldn't be needed. Just check with which PHY ID the 
>>>>>> PHY comes up.
>>>>>> And what do you mean with "it seems"? Is the option enabled or not?
>>>>>>
>>>>> I do have ROM's enabled, and it does not help with my issue.
>>> Your BIOS is a beta version, downgrading to F7 may help. Then you 
>>> have the same
>>> mainboard + BIOS as the user who opened bug ticket 202275.
>>>
>> huhti 17 09:01:49 MinistryOfSillyWalk kernel: r8169 0000:03:00.0: PHY 
>> version: 0xc2077002
>> huhti 17 09:01:49 MinistryOfSillyWalk kernel: r8169 0000:03:00.0: MAC 
>> version: 23
>>
>> ....
>>
>> huhti 17 09:03:29 MinistryOfSillyWalk kernel: r8169 0000:02:00.0: PHY 
>> version: 0x1cc912
>>
>> huhti 17 09:03:29 MinistryOfSillyWalk kernel: r8169 0000:02:00.0: MAC 
>> version: 23
>>
>> .. after module unload & load cycle:
>>
>> huhti 17 09:17:35 MinistryOfSillyWalk kernel: r8169 0000:02:00.0: PHY 
>> version: 0x1cc912
>> huhti 17 09:17:35 MinistryOfSillyWalk kernel: r8169 0000:02:00.0: MAC 
>> version: 23
>>
>>
>> it seem to be the case that the phy_id chances onetime, then stays 
>> the same. I'll do few shutdowns and see
>>
>> is there a pattern at all .. next i'm going to try how it behaves, if 
>> i read mac/phy versions twice on MAC version 23.
>>
>>
>> The BIOS downgrade: I'd like to solve this without downgrading BIOS. 
>> If I can't, then I'll do systemd-service that
>>
>> reloads r8169 driver at boot, cause then network is just fine.
>>
>>
> What i've gathered samples now, there is three values for PHY ID:
>
> [sillyme@...istryOfSillyWalk KernelStuff]$ sudo journalctl |grep "PHY 
> ver"
> huhti 17 09:01:49 MinistryOfSillyWalk kernel: r8169 0000:02:00.0: PHY 
> version: 0xc2077002
> huhti 17 09:01:49 MinistryOfSillyWalk kernel: r8169 0000:03:00.0: PHY 
> version: 0xc2077002
> huhti 17 09:03:29 MinistryOfSillyWalk kernel: r8169 0000:02:00.0: PHY 
> version: 0x1cc912
> huhti 17 09:03:29 MinistryOfSillyWalk kernel: r8169 0000:03:00.0: PHY 
> version: 0x1cc912
> huhti 17 09:17:35 MinistryOfSillyWalk kernel: r8169 0000:02:00.0: PHY 
> version: 0x1cc912
> huhti 17 09:17:35 MinistryOfSillyWalk kernel: r8169 0000:03:00.0: PHY 
> version: 0x1cc912
> huhti 17 09:24:53 MinistryOfSillyWalk kernel: r8169 0000:02:00.0: PHY 
> version: 0xc1071002
> huhti 17 09:24:53 MinistryOfSillyWalk kernel: r8169 0000:03:00.0: PHY 
> version: 0xc1071002
> huhti 17 09:27:59 MinistryOfSillyWalk kernel: r8169 0000:02:00.0: PHY 
> version: 0x1cc912
> huhti 17 09:27:59 MinistryOfSillyWalk kernel: r8169 0000:03:00.0: PHY 
> version: 0x1cc912
> huhti 17 10:08:42 MinistryOfSillyWalk kernel: r8169 0000:02:00.0: PHY 
> version: 0xc1071002
> huhti 17 10:08:42 MinistryOfSillyWalk kernel: r8169 0000:03:00.0: PHY 
> version: 0xc1071002
> huhti 17 10:12:07 MinistryOfSillyWalk kernel: r8169 0000:02:00.0: PHY 
> version: 0x1cc912
> huhti 17 10:12:07 MinistryOfSillyWalk kernel: r8169 0000:03:00.0: PHY 
> version: 0x1cc912
> huhti 17 10:20:35 MinistryOfSillyWalk kernel: r8169 0000:02:00.0: PHY 
> version: 0xc1071002
> huhti 17 10:20:35 MinistryOfSillyWalk kernel: r8169 0000:03:00.0: PHY 
> version: 0xc1071002
> huhti 17 10:23:46 MinistryOfSillyWalk kernel: r8169 0000:02:00.0: PHY 
> version: 0x1cc912
> huhti 17 10:23:46 MinistryOfSillyWalk kernel: r8169 0000:03:00.0: PHY 
> version: 0x1cc912
>
> I dont know are those hard coded or what, and are they device specific 
> how much.
>
> i haven't coldbooted things up, that may be that something to check do 
> they vary how per coldboot.
>
>>>> I check the ID, and revert all other changes, and check how it is 
>>>> working after adding the PHY id to list.
>>>>
What i've now learned: the patch + script + journald services -> Results 
working network, but it is still a workaround.


>>>>>>>>>> The problem with old method seems to be, that device does not 
>>>>>>>>>> have had time to attach before the
>>>>>>>>>> PHY driver check.
>>>>>>>>>>
>>>>>>>>>> The patch:
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/net/ethernet/realtek/r8169_main.c 
>>>>>>>>>> b/drivers/net/ethernet/realtek/r8169_main.c
>>>>>>>>>> index bf5bf05970a2..acd122a88d4a 100644
>>>>>>>>>> --- a/drivers/net/ethernet/realtek/r8169_main.c
>>>>>>>>>> +++ b/drivers/net/ethernet/realtek/r8169_main.c
>>>>>>>>>> @@ -5172,11 +5172,11 @@ static int r8169_mdio_register(struct 
>>>>>>>>>> rtl8169_private *tp)
>>>>>>>>>>            if (!tp->phydev) {
>>>>>>>>>>                    mdiobus_unregister(new_bus);
>>>>>>>>>>                    return -ENODEV;
>>>>>>>>>> -       } else if (!tp->phydev->drv) {
>>>>>>>>>> +       } else if (tp->mac_version == RTL_GIGA_MAC_NONE) {
>>>>>>>>>>                    /* Most chip versions fail with the genphy 
>>>>>>>>>> driver.
>>>>>>>>>>                     * Therefore ensure that the dedicated PHY 
>>>>>>>>>> driver is loaded.
>>>>>>>>>>                     */
>>>>>>>>>> -               dev_err(&pdev->dev, "realtek.ko not loaded, 
>>>>>>>>>> maybe it needs to be added to initramfs?\n");
>>>>>>>>>> +               dev_err(&pdev->dev, "Not known MAC version.\n");
>>>>>>>>>>                    mdiobus_unregister(new_bus);
>>>>>>>>>>                    return -EUNATCH;
>>>>>>>>>>            }
>>>>>>>>>> diff --git a/drivers/net/phy/phy-core.c 
>>>>>>>>>> b/drivers/net/phy/phy-core.c
>>>>>>>>>> index 66b8c61ca74c..aba2b304b821 100644
>>>>>>>>>> --- a/drivers/net/phy/phy-core.c
>>>>>>>>>> +++ b/drivers/net/phy/phy-core.c
>>>>>>>>>> @@ -704,6 +704,10 @@ EXPORT_SYMBOL_GPL(phy_modify_mmd);
>>>>>>>>>>       static int __phy_read_page(struct phy_device *phydev)
>>>>>>>>>>     {
>>>>>>>>>> +       /* If not attached, do nothing (no warning) */
>>>>>>>>>> +       if (!phydev->attached_dev)
>>>>>>>>>> +               return -EOPNOTSUPP;
>>>>>>>>>> +
>>>>>>>>>>            if (WARN_ONCE(!phydev->drv->read_page, "read_page 
>>>>>>>>>> callback not available, PHY driver not loaded?\n"))
>>>>>>>>>>                    return -EOPNOTSUPP;
>>>>>>>>>>     @@ -712,12 +716,17 @@ static int __phy_read_page(struct 
>>>>>>>>>> phy_device *phydev)
>>>>>>>>>>       static int __phy_write_page(struct phy_device *phydev, 
>>>>>>>>>> int page)
>>>>>>>>>>     {
>>>>>>>>>> +       /* If not attached, do nothing (no warning) */
>>>>>>>>>> +       if (!phydev->attached_dev)
>>>>>>>>>> +               return -EOPNOTSUPP;
>>>>>>>>>> +
>>>>>>>>>>            if (WARN_ONCE(!phydev->drv->write_page, 
>>>>>>>>>> "write_page callback not available, PHY driver not loaded?\n"))
>>>>>>>>>>                    return -EOPNOTSUPP;
>>>>>>>>>>              return phydev->drv->write_page(phydev, page);
>>>>>>>>>>     }
>>>>>>>>>>     +
>>>>>>>>>>     /**
>>>>>>>>>>      * phy_save_page() - take the bus lock and save the 
>>>>>>>>>> current page
>>>>>>>>>>      * @phydev: a pointer to a &struct phy_device
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 15. huhtik. 2020, 19.18, Heiner Kallweit 
>>>>>>>>>> <hkallweit1@...il.com <mailto:hkallweit1@...il.com>> kirjoitti:
>>>>>>>>>>
>>>>>>>>>>        On 15.04.2020 16:39, Lauri Jakku wrote:
>>>>>>>>>>
>>>>>>>>>>            Hi, There seems to he Something odd problem, maybe 
>>>>>>>>>> timing related. Stripped version not workingas expected. I 
>>>>>>>>>> get back to you, when i have it working.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>        There's no point in working on your patch. W/o proper 
>>>>>>>>>> justification it
>>>>>>>>>>        isn't acceptable anyway. And so far we still don't 
>>>>>>>>>> know which problem
>>>>>>>>>>        you actually have.
>>>>>>>>>>        FIRST please provide the requested logs and explain 
>>>>>>>>>> the actual problem
>>>>>>>>>>        (incl. the commit that caused the regression).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>            13. huhtik. 2020, 14.46, Lauri Jakku 
>>>>>>>>>> <ljakku77@...il.com <mailto:ljakku77@...il.com>> kirjoitti: 
>>>>>>>>>> Hi, Fair enough, i'll strip them. -lja On 2020-04-13 14:34, 
>>>>>>>>>> Leon Romanovsky wrote:
>>>>>>>>>>
>>>>>>>>>>            On Mon, Apr 13, 2020 at 02:02:01PM +0300, Lauri 
>>>>>>>>>> Jakku wrote: Hi, Comments inline. On 2020-04-13 13:58, Leon 
>>>>>>>>>> Romanovsky wrote: On Mon, Apr 13, 2020 at 01:30:13PM +0300, 
>>>>>>>>>> Lauri Jakku wrote: From 
>>>>>>>>>> 2d41edd4e6455187094f3a13d58c46eeee35aa31 Mon Sep 17 00:00:00 
>>>>>>>>>> 2001 From: Lauri Jakku <lja@....fi> Date: Mon, 13 Apr 2020 
>>>>>>>>>> 13:18:35 +0300 Subject: [PATCH] NET: r8168/r8169 identifying 
>>>>>>>>>> fix The driver installation determination made properly by 
>>>>>>>>>> checking PHY vs DRIVER id's. --- 
>>>>>>>>>> drivers/net/ethernet/realtek/r8169_main.c | 70 
>>>>>>>>>> ++++++++++++++++++++--- drivers/net/phy/mdio_bus.c | 11 +++- 
>>>>>>>>>> 2 files changed, 72 insertions(+), 9 deletions(-) I would say 
>>>>>>>>>> that most of the code is debug prints. I tought that they are 
>>>>>>>>>> helpful to keep, they are using the debug calls, so they are 
>>>>>>>>>> not visible if user does not like those. You are missing the 
>>>>>>>>>> point of who are your users. Users want to have working 
>>>>>>>>>> device and the code. They don't need or like to debug their 
>>>>>>>>>> kernel. Thanks
>>>>>>>>>>
>>>>>>>>>>

View attachment "refresh-network.service" of type "text/x-dbus-service" (546 bytes)

View attachment "refresh-network.target" of type "text/x-systemd-unit" (523 bytes)

Download attachment "refresh-network.sh" of type "application/x-shellscript" (459 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ