lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Mon, 27 Jul 2020 21:03:06 +0300
From:   Laurent Pinchart <laurent.pinchart@...asonboard.com>
To:     Chris Healy <cphealy@...il.com>
Cc:     Andrew Lunn <andrew@...n.ch>, Fugang Duan <fugang.duan@....com>,
        "David S. Miller" <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>,
        Martin Fuzzey <martin.fuzzey@...wbird.group>
Subject: Re: [RESENT PATCH net--stat 1/1] net: ethernet: fec: Revert "net:
 ethernet: fec: Replace interrupt driven MDIO with polled IO"

On Mon, Jul 27, 2020 at 08:37:20PM +0300, Laurent Pinchart wrote:
> On Mon, Jul 27, 2020 at 08:41:23AM -0700, Chris Healy wrote:
> > On Mon, Jul 27, 2020 at 8:24 AM Laurent Pinchart wrote:
> > > On Mon, Jul 27, 2020 at 02:05:45PM +0200, Andrew Lunn wrote:
> > > > On Sun, Jul 26, 2020 at 08:01:25PM -0700, Chris Healy wrote:
> > > > > It appears quite a few boards were affected by this micrel PHY driver change:
> > > > >
> > > > > 2ccb0161a0e9eb06f538557d38987e436fc39b8d
> > > > > 80bf72598663496d08b3c0231377db6a99d7fd68
> > > > > 2de00450c0126ec8838f72157577578e85cae5d8
> > > > > 820f8a870f6575acda1bf7f1a03c701c43ed5d79
> > > > >
> > > > > I just updated the phy-mode with my board from rgmii to rgmii-id and
> > > > > everything started working fine with net-next again:
> > > >
> > > > Hi Chris
> > > >
> > > > Is this a mainline supported board? Do you plan to submit a patch?
> > > >
> > > > Laurent, does the change also work for your board? This is another one
> > > > of those cases were a bug in the PHY driver, not respecting the
> > > > phy-mode, has masked a bug in the device tree, using the wrong
> > > > phy-mode. We had the same issue with the Atheros PHY a while back.
> > >
> > > Yes, setting the phy-mode to rgmii-id fixes the issue.
> > >
> > > Thank you everybody for your quick responses and very useful help !
> > >
> > > On a side note, when the kernel boots, there's a ~10s delay for the
> > > ethernet connection to come up:
> > >
> > > [    4.050754] Micrel KSZ9031 Gigabit PHY 30be0000.ethernet-1:01: attached PHY driver [Micrel KSZ9031 Gigabit PHY] (mii_bus:phy_addr=30be0000.ethernet-1:01, irq=POLL)
> > > [   15.628528] fec 30be0000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
> > > [   15.676961] Sending DHCP requests ., OK
> > > [   15.720925] IP-Config: Got DHCP answer from 192.168.2.47, my address is 192.168.2.210
> > >
> > > The LED on the connected switch confirms this, it lits up synchronously
> > > with the "Link is up" message. It's not an urgent issue, but if someone
> > > had a few pointers on how I could debug that, it would be appreciated.
> > 
> > Here's a few suggestions that could help in learning more:
> > 
> > 1) Review the KSZ9031 HW errata and compare against the PHY driver
> > code.  There's a number of errata that could cause this from my quick
> > review.
> 
> I'll have a look at that, thanks.

I thought issue 5 ("Auto-Negotiation link-up failure / long link-up time
due to default FLP interval setting") was a likely candidate, but it
seems it's already handled in the driver (implemented in
ksz9031_center_flp_timing()).

I've run a few more tests, adding a WARN_ON in ksz9031_config_init() to
trace its callers. It looks like the initial negotiation fails, until
ksz9031_read_status() restarts it after the maximum number of
iterations:

[    4.047515] ------------[ cut here ]------------
[    4.052388] WARNING: CPU: 0 PID: 1 at drivers/net/phy/micrel.c:693 ksz9031_config_init+0x34/0x344
[    4.061827] Modules linked in:
[    4.064932] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc6-00102-g6a84d7a1fa75-dirty #505
[    4.073570] Hardware name: Freescale i.MX7 Dual (Device Tree)
[    4.079364] [<c01119d4>] (unwind_backtrace) from [<c010c0d0>] (show_stack+0x10/0x14)
[    4.087147] [<c010c0d0>] (show_stack) from [<c055bd78>] (dump_stack+0xe8/0x120)
[    4.094498] [<c055bd78>] (dump_stack) from [<c0123888>] (__warn+0xc0/0x108)
[    4.101497] [<c0123888>] (__warn) from [<c0123c60>] (warn_slowpath_fmt+0x60/0xbc)
[    4.109018] [<c0123c60>] (warn_slowpath_fmt) from [<c0760558>] (ksz9031_config_init+0x34/0x344)
[    4.117754] [<c0760558>] (ksz9031_config_init) from [<c075b388>] (phy_attach_direct+0xfc/0x2b0)
[    4.126487] [<c075b388>] (phy_attach_direct) from [<c075b680>] (phy_connect_direct+0x1c/0x58)
[    4.135052] [<c075b680>] (phy_connect_direct) from [<c08d9bb8>] (of_phy_connect+0x38/0x60)
[    4.143354] [<c08d9bb8>] (of_phy_connect) from [<c0766920>] (fec_enet_mii_probe+0x40/0x1ac)
[    4.151741] [<c0766920>] (fec_enet_mii_probe) from [<c0769370>] (fec_enet_open+0x27c/0x340)
[    4.160130] [<c0769370>] (fec_enet_open) from [<c096f45c>] (__dev_open+0xd0/0x158)
[    4.167734] [<c096f45c>] (__dev_open) from [<c096f830>] (__dev_change_flags+0x168/0x1d4)
[    4.175860] [<c096f830>] (__dev_change_flags) from [<c096f8b4>] (dev_change_flags+0x18/0x48)
[    4.184338] [<c096f8b4>] (dev_change_flags) from [<c113d390>] (ip_auto_config+0x270/0x1068)
[    4.192727] [<c113d390>] (ip_auto_config) from [<c01021a4>] (do_one_initcall+0x80/0x348)
[    4.200855] [<c01021a4>] (do_one_initcall) from [<c1100fd0>] (kernel_init_freeable+0x15c/0x20c)
[    4.209591] [<c1100fd0>] (kernel_init_freeable) from [<c0bd7d44>] (kernel_init+0x8/0x114)
[    4.217804] [<c0bd7d44>] (kernel_init) from [<c0100134>] (ret_from_fork+0x14/0x20)
[    4.225402] Exception stack(0xec0edfb0 to 0xec0edff8)
[    4.230484] dfa0:                                     00000000 00000000 00000000 00000000
[    4.238692] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    4.246899] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[    4.254068] irq event stamp: 270123
[    4.257672] hardirqs last  enabled at (270141): [<c018dddc>] console_unlock+0x41c/0x5d8
[    4.265710] hardirqs last disabled at (270148): [<c018da68>] console_unlock+0xa8/0x5d8
[    4.273754] softirqs last  enabled at (270166): [<c0101580>] __do_softirq+0x2b8/0x508
[    4.281683] softirqs last disabled at (270177): [<c012b974>] irq_exit+0xfc/0x17c
[    4.289194] ---[ end trace 620517544bb2f528 ]---
[    4.295734] ksz9031_center_flp_timing
[    4.299781] ksz9031_center_flp_timing: restart negotiation
[    4.305402] ksz9031_config_init: 0
[    4.311116] Micrel KSZ9031 Gigabit PHY 30be0000.ethernet-1:01: attached PHY driver [Micrel KSZ9031 Gigabit PHY] (mii_bus:phy_addr=30be0000.ethernet-1:01, irq=POLL)
[   11.677512] ------------[ cut here ]------------
[   11.682341] WARNING: CPU: 0 PID: 12 at drivers/net/phy/micrel.c:693 ksz9031_config_init+0x34/0x344
[   11.691410] Modules linked in:
[   11.694515] CPU: 0 PID: 12 Comm: kworker/0:1 Tainted: G        W         5.8.0-rc6-00102-g6a84d7a1fa75-dirty #505
[   11.704803] Hardware name: Freescale i.MX7 Dual (Device Tree)
[   11.710591] Workqueue: events_power_efficient phy_state_machine
[   11.716561] [<c01119d4>] (unwind_backtrace) from [<c010c0d0>] (show_stack+0x10/0x14)
[   11.724346] [<c010c0d0>] (show_stack) from [<c055bd78>] (dump_stack+0xe8/0x120)
[   11.731699] [<c055bd78>] (dump_stack) from [<c0123888>] (__warn+0xc0/0x108)
[   11.738700] [<c0123888>] (__warn) from [<c0123c60>] (warn_slowpath_fmt+0x60/0xbc)
[   11.746223] [<c0123c60>] (warn_slowpath_fmt) from [<c0760558>] (ksz9031_config_init+0x34/0x344)
[   11.754960] [<c0760558>] (ksz9031_config_init) from [<c0760b78>] (ksz9031_read_status+0x40/0x84)
[   11.763785] [<c0760b78>] (ksz9031_read_status) from [<c0757414>] (phy_check_link_status+0x54/0xec)
[   11.772783] [<c0757414>] (phy_check_link_status) from [<c0758318>] (phy_state_machine+0x190/0x204)
[   11.781781] [<c0758318>] (phy_state_machine) from [<c0143a5c>] (process_one_work+0x2d0/0x778)
[   11.790343] [<c0143a5c>] (process_one_work) from [<c0143f30>] (worker_thread+0x2c/0x588)
[   11.798472] [<c0143f30>] (worker_thread) from [<c014b240>] (kthread+0x130/0x144)
[   11.805907] [<c014b240>] (kthread) from [<c0100134>] (ret_from_fork+0x14/0x20)
[   11.813159] Exception stack(0xec12bfb0 to 0xec12bff8)
[   11.818243] bfa0:                                     00000000 00000000 00000000 00000000
[   11.826454] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[   11.834662] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[   11.841402] irq event stamp: 15245
[   11.844848] hardirqs last  enabled at (15253): [<c018dddc>] console_unlock+0x41c/0x5d8
[   11.852868] hardirqs last disabled at (15270): [<c018da68>] console_unlock+0xa8/0x5d8
[   11.860800] softirqs last  enabled at (15286): [<c0101580>] __do_softirq+0x2b8/0x508
[   11.868647] softirqs last disabled at (15297): [<c012b974>] irq_exit+0xfc/0x17c
[   11.875990] ---[ end trace 620517544bb2f529 ]---
[   11.881668] ksz9031_center_flp_timing
[   11.885649] ksz9031_center_flp_timing: restart negotiation
[   11.891308] ksz9031_config_init: 0
[   14.988788] fec 30be0000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx

> > 2) Based on what I read in the HW errata, try different link partners
> > that utilize different copper PHYs to see if it results in different
> > behaviour.
> 
> I have limited available test equipment, but I can give it a try.
> 
> > 3) Try setting your autonegotiate advertisement to only advertise
> > 100Mbps and see if this affects the timing.  Obviously this would not
> > be a solution but might help in better understanding the issue.
> 
> I've tested this, and the link then comes up in ~2 seconds instead of
> ~10. That's clearly an improvement, but I have no idea what it implies
> :-)
> 
> [    4.090655] Micrel KSZ9031 Gigabit PHY 30be0000.ethernet-1:01: attached PHY driver [Micrel KSZ9031 Gigabit PHY] (mii_bus:phy_addr=30be0000.ethernet-1:01, irq=POLL)
> [    6.188347] fec 30be0000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
> [    6.236843] Sending DHCP requests ., OK
> [    6.280807] IP-Config: Got DHCP answer from 192.168.2.47, my address is 192.168.2.210

-- 
Regards,

Laurent Pinchart

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ