[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <db508b70-e5fb-2abf-8012-c168fe7535a7@pp.inet.fi>
Date: Sun, 19 Apr 2020 18:09:06 +0300
From: Lauri Jakku <lauri.jakku@...inet.fi>
To: Heiner Kallweit <hkallweit1@...il.com>
Cc: Leon Romanovsky <leon@...nel.org>, netdev@...r.kernel.org,
nic_swsd@...ltek.com
Subject: Re: NET: r8168/r8169 identifying fix
Hi,
On 18.4.2020 21.46, Lauri Jakku wrote:
> Hi,
>
> On 18.4.2020 14.06, Lauri Jakku wrote:
>> Hi,
>>
>> On 17.4.2020 10.30, Lauri Jakku wrote:
>>> Hi,
>>>
>>> On 17.4.2020 9.23, Lauri Jakku wrote:
>>>>
>>>> On 16.4.2020 23.50, Heiner Kallweit wrote:
>>>>> On 16.04.2020 22:38, Lauri Jakku wrote:
>>>>>> Hi
>>>>>>
>>>>>> On 16.4.2020 23.10, Lauri Jakku wrote:
>>>>>>> On 16.4.2020 23.02, Heiner Kallweit wrote:
>>>>>>>> On 16.04.2020 21:58, Lauri Jakku wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> On 16.4.2020 21.37, Lauri Jakku wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> On 16.4.2020 21.26, Heiner Kallweit wrote:
>>>>>>>>>>> On 16.04.2020 13:30, Lauri Jakku wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 5.6.3-2-MANJARO: stock manjaro kernel, without
>>>>>>>>>>>> modifications --> network does not work
>>>>>>>>>>>>
>>>>>>>>>>>> 5.6.3-2-MANJARO-lja: No attach check, modified kernel
>>>>>>>>>>>> (r8169 mods only) --> network does not work
>>>>>>>>>>>>
>>>>>>>>>>>> 5.6.3-2-MANJARO-with-the-r8169-patch: phy patched + r8169
>>>>>>>>>>>> mods -> devices show up ok, network works
>>>>>>>>>>>>
>>>>>>>>>>>> All different initcpio's have realtek.ko in them.
>>>>>>>>>>>>
>>>>>>>>>>> Thanks for the logs. Based on the logs you're presumable
>>>>>>>>>>> affected by a known BIOS bug.
>>>>>>>>>>> Check bug tickets 202275 and 207203 at bugzilla.kernel.org.
>>>>>>>>>>> In the first referenced tickets it's about the same
>>>>>>>>>>> mainboard (with earlier BIOS version).
>>>>>>>>>>> BIOS on this mainboard seems to not initialize the network
>>>>>>>>>>> chip / PHY correctly, it reports
>>>>>>>>>>> a random number as PHY ID, resulting in no PHY driver being
>>>>>>>>>>> found.
>>>>>>>>>>> Enable "Onboard LAN Boot ROM" in the BIOS, and your problem
>>>>>>>>>>> should be gone.
>>>>>>>>>>>
>>>>>>>>>> OK, I try that, thank you :)
>>>>>>>>>>
>>>>>>>>> It seems that i DO have the ROM's enabled, i'm now testing
>>>>>>>>> some mutex guard for phy state and try to use it as indicator
>>>>>>>>>
>>>>>>>>> that attach has been done. One thing i've noticed is that
>>>>>>>>> driver needs to be reloaded to allow traffic (ie. ping works
>>>>>>>>> etc.)
>>>>>>>>>
>>>>>>>> All that shouldn't be needed. Just check with which PHY ID the
>>>>>>>> PHY comes up.
>>>>>>>> And what do you mean with "it seems"? Is the option enabled or
>>>>>>>> not?
>>>>>>>>
>>>>>>> I do have ROM's enabled, and it does not help with my issue.
>>>>> Your BIOS is a beta version, downgrading to F7 may help. Then you
>>>>> have the same
>>>>> mainboard + BIOS as the user who opened bug ticket 202275.
>>>>>
>>>> huhti 17 09:01:49 MinistryOfSillyWalk kernel: r8169 0000:03:00.0:
>>>> PHY version: 0xc2077002
>>>> huhti 17 09:01:49 MinistryOfSillyWalk kernel: r8169 0000:03:00.0:
>>>> MAC version: 23
>>>>
>>>> ....
>>>>
>>>> huhti 17 09:03:29 MinistryOfSillyWalk kernel: r8169 0000:02:00.0:
>>>> PHY version: 0x1cc912
>>>>
>>>> huhti 17 09:03:29 MinistryOfSillyWalk kernel: r8169 0000:02:00.0:
>>>> MAC version: 23
>>>>
>>>> .. after module unload & load cycle:
>>>>
>>>> huhti 17 09:17:35 MinistryOfSillyWalk kernel: r8169 0000:02:00.0:
>>>> PHY version: 0x1cc912
>>>> huhti 17 09:17:35 MinistryOfSillyWalk kernel: r8169 0000:02:00.0:
>>>> MAC version: 23
>>>>
>>>>
>>>> it seem to be the case that the phy_id chances onetime, then stays
>>>> the same. I'll do few shutdowns and see
>>>>
>>>> is there a pattern at all .. next i'm going to try how it behaves,
>>>> if i read mac/phy versions twice on MAC version 23.
>>>>
>>>>
>>>> The BIOS downgrade: I'd like to solve this without downgrading
>>>> BIOS. If I can't, then I'll do systemd-service that
>>>>
>>>> reloads r8169 driver at boot, cause then network is just fine.
>>>>
>>>>
>>> What i've gathered samples now, there is three values for PHY ID:
>>>
>>> [sillyme@...istryOfSillyWalk KernelStuff]$ sudo journalctl |grep
>>> "PHY ver"
>>> huhti 17 09:01:49 MinistryOfSillyWalk kernel: r8169 0000:02:00.0:
>>> PHY version: 0xc2077002
>>> huhti 17 09:01:49 MinistryOfSillyWalk kernel: r8169 0000:03:00.0:
>>> PHY version: 0xc2077002
>>> huhti 17 09:03:29 MinistryOfSillyWalk kernel: r8169 0000:02:00.0:
>>> PHY version: 0x1cc912
>>> huhti 17 09:03:29 MinistryOfSillyWalk kernel: r8169 0000:03:00.0:
>>> PHY version: 0x1cc912
>>> huhti 17 09:17:35 MinistryOfSillyWalk kernel: r8169 0000:02:00.0:
>>> PHY version: 0x1cc912
>>> huhti 17 09:17:35 MinistryOfSillyWalk kernel: r8169 0000:03:00.0:
>>> PHY version: 0x1cc912
>>> huhti 17 09:24:53 MinistryOfSillyWalk kernel: r8169 0000:02:00.0:
>>> PHY version: 0xc1071002
>>> huhti 17 09:24:53 MinistryOfSillyWalk kernel: r8169 0000:03:00.0:
>>> PHY version: 0xc1071002
>>> huhti 17 09:27:59 MinistryOfSillyWalk kernel: r8169 0000:02:00.0:
>>> PHY version: 0x1cc912
>>> huhti 17 09:27:59 MinistryOfSillyWalk kernel: r8169 0000:03:00.0:
>>> PHY version: 0x1cc912
>>> huhti 17 10:08:42 MinistryOfSillyWalk kernel: r8169 0000:02:00.0:
>>> PHY version: 0xc1071002
>>> huhti 17 10:08:42 MinistryOfSillyWalk kernel: r8169 0000:03:00.0:
>>> PHY version: 0xc1071002
>>> huhti 17 10:12:07 MinistryOfSillyWalk kernel: r8169 0000:02:00.0:
>>> PHY version: 0x1cc912
>>> huhti 17 10:12:07 MinistryOfSillyWalk kernel: r8169 0000:03:00.0:
>>> PHY version: 0x1cc912
>>> huhti 17 10:20:35 MinistryOfSillyWalk kernel: r8169 0000:02:00.0:
>>> PHY version: 0xc1071002
>>> huhti 17 10:20:35 MinistryOfSillyWalk kernel: r8169 0000:03:00.0:
>>> PHY version: 0xc1071002
>>> huhti 17 10:23:46 MinistryOfSillyWalk kernel: r8169 0000:02:00.0:
>>> PHY version: 0x1cc912
>>> huhti 17 10:23:46 MinistryOfSillyWalk kernel: r8169 0000:03:00.0:
>>> PHY version: 0x1cc912
>>>
>>> I dont know are those hard coded or what, and are they device
>>> specific how much.
>>>
>>> i haven't coldbooted things up, that may be that something to check
>>> do they vary how per coldboot.
>>>
>>>>>> I check the ID, and revert all other changes, and check how it is
>>>>>> working after adding the PHY id to list.
>>>>>>
>> What i've now learned: the patch + script + journald services ->
>> Results working network, but it is still a workaround.
>>
> Following patch trusts the MAC version, another thing witch could help
> is to derive method to do 2dn pass of the probeing:
>
> if specific MAC version is found.
>
> diff --git a/drivers/net/ethernet/realtek/r8169_main.c
> b/drivers/net/ethernet/realtek/r8169_main.c
> index acd122a88d4a..62b37a1abc24 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c
> @@ -5172,13 +5172,18 @@ static int r8169_mdio_register(struct
> rtl8169_private *tp)
> if (!tp->phydev) {
> mdiobus_unregister(new_bus);
> return -ENODEV;
> - } else if (tp->mac_version == RTL_GIGA_MAC_NONE) {
> - /* Most chip versions fail with the genphy driver.
> - * Therefore ensure that the dedicated PHY driver is
> loaded.
> - */
> - dev_err(&pdev->dev, "Not known MAC version.\n");
> - mdiobus_unregister(new_bus);
> - return -EUNATCH;
> + } else {
> + dev_info(&pdev->dev, "PHY version: 0x%x\n",
> tp->phydev->phy_id);
> + dev_info(&pdev->dev, "MAC version: %d\n",
> tp->mac_version);
> +
> + if (tp->mac_version == RTL_GIGA_MAC_NONE) {
> + /* Most chip versions fail with the genphy
> driver.
> + * Therefore ensure that the dedicated PHY
> driver is loaded.
> + */
> + dev_err(&pdev->dev, "Not known MAC/PHY
> version.\n", tp->phydev->phy_id);
> + mdiobus_unregister(new_bus);
> + return -EUNATCH;
> + }
> }
>
> /* PHY will be woken up in rtl_open() */
>
I just got bleeding edge 5.7.0-1 kernel compiled + firmware's updated..
and now up'n'running. There is one (WARN_ONCE) stack trace coming from
driver, i think i tinker with it next, with above patch the network
devices shows up and they can be configured.
>>
>>>>>>>>>>>> The problem with old method seems to be, that device does
>>>>>>>>>>>> not have had time to attach before the
>>>>>>>>>>>> PHY driver check.
>>>>>>>>>>>>
>>>>>>>>>>>> The patch:
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/drivers/net/ethernet/realtek/r8169_main.c
>>>>>>>>>>>> b/drivers/net/ethernet/realtek/r8169_main.c
>>>>>>>>>>>> index bf5bf05970a2..acd122a88d4a 100644
>>>>>>>>>>>> --- a/drivers/net/ethernet/realtek/r8169_main.c
>>>>>>>>>>>> +++ b/drivers/net/ethernet/realtek/r8169_main.c
>>>>>>>>>>>> @@ -5172,11 +5172,11 @@ static int
>>>>>>>>>>>> r8169_mdio_register(struct rtl8169_private *tp)
>>>>>>>>>>>> if (!tp->phydev) {
>>>>>>>>>>>> mdiobus_unregister(new_bus);
>>>>>>>>>>>> return -ENODEV;
>>>>>>>>>>>> - } else if (!tp->phydev->drv) {
>>>>>>>>>>>> + } else if (tp->mac_version == RTL_GIGA_MAC_NONE) {
>>>>>>>>>>>> /* Most chip versions fail with the
>>>>>>>>>>>> genphy driver.
>>>>>>>>>>>> * Therefore ensure that the dedicated
>>>>>>>>>>>> PHY driver is loaded.
>>>>>>>>>>>> */
>>>>>>>>>>>> - dev_err(&pdev->dev, "realtek.ko not loaded,
>>>>>>>>>>>> maybe it needs to be added to initramfs?\n");
>>>>>>>>>>>> + dev_err(&pdev->dev, "Not known MAC
>>>>>>>>>>>> version.\n");
>>>>>>>>>>>> mdiobus_unregister(new_bus);
>>>>>>>>>>>> return -EUNATCH;
>>>>>>>>>>>> }
>>>>>>>>>>>> diff --git a/drivers/net/phy/phy-core.c
>>>>>>>>>>>> b/drivers/net/phy/phy-core.c
>>>>>>>>>>>> index 66b8c61ca74c..aba2b304b821 100644
>>>>>>>>>>>> --- a/drivers/net/phy/phy-core.c
>>>>>>>>>>>> +++ b/drivers/net/phy/phy-core.c
>>>>>>>>>>>> @@ -704,6 +704,10 @@ EXPORT_SYMBOL_GPL(phy_modify_mmd);
>>>>>>>>>>>> static int __phy_read_page(struct phy_device *phydev)
>>>>>>>>>>>> {
>>>>>>>>>>>> + /* If not attached, do nothing (no warning) */
>>>>>>>>>>>> + if (!phydev->attached_dev)
>>>>>>>>>>>> + return -EOPNOTSUPP;
>>>>>>>>>>>> +
>>>>>>>>>>>> if (WARN_ONCE(!phydev->drv->read_page,
>>>>>>>>>>>> "read_page callback not available, PHY driver not loaded?\n"))
>>>>>>>>>>>> return -EOPNOTSUPP;
>>>>>>>>>>>> @@ -712,12 +716,17 @@ static int __phy_read_page(struct
>>>>>>>>>>>> phy_device *phydev)
>>>>>>>>>>>> static int __phy_write_page(struct phy_device
>>>>>>>>>>>> *phydev, int page)
>>>>>>>>>>>> {
>>>>>>>>>>>> + /* If not attached, do nothing (no warning) */
>>>>>>>>>>>> + if (!phydev->attached_dev)
>>>>>>>>>>>> + return -EOPNOTSUPP;
>>>>>>>>>>>> +
>>>>>>>>>>>> if (WARN_ONCE(!phydev->drv->write_page,
>>>>>>>>>>>> "write_page callback not available, PHY driver not
>>>>>>>>>>>> loaded?\n"))
>>>>>>>>>>>> return -EOPNOTSUPP;
>>>>>>>>>>>> return phydev->drv->write_page(phydev, page);
>>>>>>>>>>>> }
>>>>>>>>>>>> +
>>>>>>>>>>>> /**
>>>>>>>>>>>> * phy_save_page() - take the bus lock and save the
>>>>>>>>>>>> current page
>>>>>>>>>>>> * @phydev: a pointer to a &struct phy_device
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 15. huhtik. 2020, 19.18, Heiner Kallweit
>>>>>>>>>>>> <hkallweit1@...il.com <mailto:hkallweit1@...il.com>>
>>>>>>>>>>>> kirjoitti:
>>>>>>>>>>>>
>>>>>>>>>>>> On 15.04.2020 16:39, Lauri Jakku wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi, There seems to he Something odd problem,
>>>>>>>>>>>> maybe timing related. Stripped version not workingas
>>>>>>>>>>>> expected. I get back to you, when i have it working.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> There's no point in working on your patch. W/o
>>>>>>>>>>>> proper justification it
>>>>>>>>>>>> isn't acceptable anyway. And so far we still don't
>>>>>>>>>>>> know which problem
>>>>>>>>>>>> you actually have.
>>>>>>>>>>>> FIRST please provide the requested logs and explain
>>>>>>>>>>>> the actual problem
>>>>>>>>>>>> (incl. the commit that caused the regression).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 13. huhtik. 2020, 14.46, Lauri Jakku
>>>>>>>>>>>> <ljakku77@...il.com <mailto:ljakku77@...il.com>> kirjoitti:
>>>>>>>>>>>> Hi, Fair enough, i'll strip them. -lja On 2020-04-13 14:34,
>>>>>>>>>>>> Leon Romanovsky wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Apr 13, 2020 at 02:02:01PM +0300, Lauri
>>>>>>>>>>>> Jakku wrote: Hi, Comments inline. On 2020-04-13 13:58, Leon
>>>>>>>>>>>> Romanovsky wrote: On Mon, Apr 13, 2020 at 01:30:13PM +0300,
>>>>>>>>>>>> Lauri Jakku wrote: From
>>>>>>>>>>>> 2d41edd4e6455187094f3a13d58c46eeee35aa31 Mon Sep 17
>>>>>>>>>>>> 00:00:00 2001 From: Lauri Jakku <lja@....fi> Date: Mon, 13
>>>>>>>>>>>> Apr 2020 13:18:35 +0300 Subject: [PATCH] NET: r8168/r8169
>>>>>>>>>>>> identifying fix The driver installation determination made
>>>>>>>>>>>> properly by checking PHY vs DRIVER id's. ---
>>>>>>>>>>>> drivers/net/ethernet/realtek/r8169_main.c | 70
>>>>>>>>>>>> ++++++++++++++++++++--- drivers/net/phy/mdio_bus.c | 11
>>>>>>>>>>>> +++- 2 files changed, 72 insertions(+), 9 deletions(-) I
>>>>>>>>>>>> would say that most of the code is debug prints. I tought
>>>>>>>>>>>> that they are helpful to keep, they are using the debug
>>>>>>>>>>>> calls, so they are not visible if user does not like those.
>>>>>>>>>>>> You are missing the point of who are your users. Users want
>>>>>>>>>>>> to have working device and the code. They don't need or
>>>>>>>>>>>> like to debug their kernel. Thanks
>>>>>>>>>>>>
>>>>>>>>>>>>
Powered by blists - more mailing lists