Message-ID: <527bcc43-d99c-f86e-29b0-2b4773226e38@helixd.com>
Date:   Mon, 26 Jul 2021 18:39:58 -0700
From:   Dario Alcocer <dalcocer@...ixd.com>
To:     Andrew Lunn <andrew@...n.ch>
Cc:     netdev@...r.kernel.org
Subject: Re: Marvell switch port shows LOWERLAYERDOWN, ping fails

On 7/24/21 7:36 PM, Dario Alcocer wrote:
> On 7/24/21 7:26 PM, Dario Alcocer wrote:
>> On 7/24/21 10:34 AM, Andrew Lunn wrote:
>>> You might want to enable dbg prints in drivers/net/phy/phy.c, so you
>>> can see the state machine changes.
>>
>> Great suggestion. I added the following to the boot options:
>>
>> dyndbg="file net/dsa/* +p; file drivers/net/phy/phy.c +p"
>>
>> The relevant messages collected from the system log are below. 
>> Interestingly, all of the ports go from UP to NOLINK. In addition, 
>> "breaking chain for DSA event 7" is reported, once for each port.

Andrew,

As I mentioned before, the system log shows that each switch port link 
state goes from UP to NOLINK when "ip link set DEVNAME up" runs.

Since you suggested it's probably a PHY problem, I used kernel tracing 
to track PHY-related calls in the following files:

* drivers/net/phy/phylink.c
* drivers/net/phy/marvell.c
* net/dsa/port.c

I set up the kernel tracing before trying to bring up the lan1 interface:

root@...i:~# mount -t tracefs tracefs /sys/kernel/tracing
root@...i:~# echo 0 > /sys/kernel/tracing/tracing_on
root@...i:~# echo function_graph > /sys/kernel/tracing/current_tracer
root@...i:~# echo phylink_\* > /sys/kernel/tracing/set_ftrace_filter
root@...i:~# echo dsa_port_phylink_\* >> /sys/kernel/tracing/set_ftrace_filter
root@...i:~# echo m88e1510_probe >> /sys/kernel/tracing/set_ftrace_filter
root@...i:~# echo marvell_config_init >> /sys/kernel/tracing/set_ftrace_filter
root@...i:~# echo m88e1510_config_aneg >> /sys/kernel/tracing/set_ftrace_filter
root@...i:~# echo marvell_read_status >> /sys/kernel/tracing/set_ftrace_filter
root@...i:~# echo marvell_ack_interrupt >> /sys/kernel/tracing/set_ftrace_filter
root@...i:~# echo marvell_config_intr >> /sys/kernel/tracing/set_ftrace_filter
root@...i:~# echo m88e1121_did_interrupt >> /sys/kernel/tracing/set_ftrace_filter
root@...i:~# echo genphy_resume >> /sys/kernel/tracing/set_ftrace_filter
root@...i:~# echo genphy_suspend >> /sys/kernel/tracing/set_ftrace_filter
root@...i:~# echo marvell_read_page >> /sys/kernel/tracing/set_ftrace_filter
root@...i:~# echo marvell_write_page >> /sys/kernel/tracing/set_ftrace_filter
root@...i:~# echo marvell_get_sset_count >> /sys/kernel/tracing/set_ftrace_filter
root@...i:~# echo marvell_get_strings >> /sys/kernel/tracing/set_ftrace_filter
root@...i:~# echo m88e1540_get_tunable >> /sys/kernel/tracing/set_ftrace_filter
root@...i:~# echo m88e1540_set_tunable >> /sys/kernel/tracing/set_ftrace_filter
root@...i:~# echo 1 > /sys/kernel/tracing/tracing_on
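
For repeated runs, the setup above could be wrapped in a small helper. This
is only a sketch: setup_phy_trace and the TRACEFS variable are names made up
for illustration, not part of ftrace. It assumes tracefs is already mounted
at the path used above.

```shell
#!/bin/sh
# Hypothetical helper: TRACEFS defaults to the mount point used in this
# thread and can be overridden (e.g. pointed at a scratch directory).
TRACEFS="${TRACEFS:-/sys/kernel/tracing}"

# setup_phy_trace PATTERN...
# Stops tracing, selects the function_graph tracer, replaces the current
# function filter with the given patterns, then re-enables tracing.
setup_phy_trace() {
    echo 0 > "$TRACEFS/tracing_on"
    echo function_graph > "$TRACEFS/current_tracer"
    # Opening the filter file for truncation clears any previous filter.
    : > "$TRACEFS/set_ftrace_filter"
    for pat in "$@"; do
        echo "$pat" >> "$TRACEFS/set_ftrace_filter"
    done
    echo 1 > "$TRACEFS/tracing_on"
}

# Example call (quote the globs so the shell does not expand them):
#   setup_phy_trace 'phylink_*' 'dsa_port_phylink_*' \
#       m88e1510_config_aneg marvell_read_status genphy_resume
```

Quoting the patterns serves the same purpose as the backslash before the
asterisk in the commands above.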

I then tried bringing up lan1 again to see which functions would be 
called, then stopped kernel tracing:

root@...i:~# ip addr add 192.0.2.1/24 dev lan1
root@...i:~# ip link set lan1 up
[  511.763909] mv88e6085 stmmac-0:1a lan1: configuring for phy/gmii link mode
[  511.773082] 8021q: adding VLAN 0 to HW filter on device lan1
root@...i:~# echo 0 > /sys/kernel/tracing/tracing_on

I then dumped the trace buffer:

root@...i:~# cat /sys/kernel/tracing/trace
# tracer: function_graph
#
# CPU  DURATION                  FUNCTION CALLS
# |     |   |                     |   |   |   |
  1) + 10.900 us   |  phylink_ethtool_ksettings_get();
  1)   5.220 us    |  phylink_ethtool_ksettings_get();
  1)               |  phylink_ethtool_ksettings_get() {
  1)   2.060 us    |    phylink_get_fixed_state();
  1)   1.890 us    |    phylink_get_ksettings();
  1) + 11.740 us   |  }
  1)   6.760 us    |  phylink_ethtool_ksettings_get();
  1)   4.890 us    |  phylink_ethtool_ksettings_get();
  1)   4.670 us    |  phylink_ethtool_ksettings_get();
  1) + 11.560 us   |  phylink_ethtool_ksettings_get();
  1)   5.060 us    |  phylink_ethtool_ksettings_get();
  1)               |  phylink_ethtool_ksettings_get() {
  1)   2.010 us    |    phylink_get_fixed_state();
  1)   1.810 us    |    phylink_get_ksettings();
  1) + 12.140 us   |  }
  1)   5.890 us    |  phylink_ethtool_ksettings_get();
  1)   5.300 us    |  phylink_ethtool_ksettings_get();
  1)   4.630 us    |  phylink_ethtool_ksettings_get();
  1) + 10.070 us   |  phylink_ethtool_ksettings_get();
  1)   5.100 us    |  phylink_ethtool_ksettings_get();
  1)               |  phylink_ethtool_ksettings_get() {
  1)   1.910 us    |    phylink_get_fixed_state();
  1)   1.840 us    |    phylink_get_ksettings();
  1) + 12.120 us   |  }
  1)   5.910 us    |  phylink_ethtool_ksettings_get();
  1)   5.410 us    |  phylink_ethtool_ksettings_get();
  1)   4.630 us    |  phylink_ethtool_ksettings_get();
  1) + 10.560 us   |  phylink_ethtool_ksettings_get();
  1)   4.620 us    |  phylink_ethtool_ksettings_get();
  1)               |  phylink_ethtool_ksettings_get() {
  1)   1.920 us    |    phylink_get_fixed_state();
  1)   1.930 us    |    phylink_get_ksettings();
  1) + 12.370 us   |  }
  1)   5.290 us    |  phylink_ethtool_ksettings_get();
  1)   4.570 us    |  phylink_ethtool_ksettings_get();
  1)   4.500 us    |  phylink_ethtool_ksettings_get();
  0)               |  phylink_start() {
  0)   2.510 us    |    phylink_resolve_flow();
  0)               |    phylink_mac_config() {
  0)   2.260 us    |      dsa_port_phylink_mac_config();
  0)   5.830 us    |    }
  0) + 14.740 us   |    phylink_run_resolve.part.0();
  0)               |    genphy_resume() {
  ------------------------------------------
  0)     ip-626     =>   kworker-20
  ------------------------------------------

  0)               |  phylink_resolve() {
  0)   1.860 us    |    phylink_resolve_flow();
  0)   6.800 us    |  }
  ------------------------------------------
  0)   kworker-20   =>     ip-626
  ------------------------------------------

  0) # 1818.330 us |    } /* genphy_resume */
  0) # 8730.320 us |  } /* phylink_start */
  ------------------------------------------
  0)     ip-626     =>   kworker-20
  ------------------------------------------

  0)               |  m88e1510_config_aneg() {
  0) # 1871.880 us |    marvell_read_page();
  0) # 1807.480 us |    marvell_write_page();
  0) # 1804.300 us |    marvell_write_page();
  0) * 42803.07 us |  }
  0) * 13283.95 us |  marvell_read_status();
  0)               |  phylink_phy_change() {
  0)   3.990 us    |    phylink_run_resolve.part.0();
  0)   8.930 us    |  }
  0)               |  phylink_resolve() {
  0)   2.110 us    |    phylink_resolve_flow();
  0)   6.320 us    |  }
  1) + 11.280 us   |  phylink_ethtool_ksettings_get();
  1)   4.560 us    |  phylink_ethtool_ksettings_get();
  1)               |  phylink_ethtool_ksettings_get() {
  1)   2.180 us    |    phylink_get_fixed_state();
  1)   1.960 us    |    phylink_get_ksettings();
  1) + 12.620 us   |  }
  1)   5.290 us    |  phylink_ethtool_ksettings_get();
  1)   4.980 us    |  phylink_ethtool_ksettings_get();
  1)   5.000 us    |  phylink_ethtool_ksettings_get();
  1) + 10.430 us   |  phylink_ethtool_ksettings_get();
  1)   4.860 us    |  phylink_ethtool_ksettings_get();
  1)               |  phylink_ethtool_ksettings_get() {
  1)   2.070 us    |    phylink_get_fixed_state();
  1)   1.950 us    |    phylink_get_ksettings();
  1) + 12.090 us   |  }
  1)   5.310 us    |  phylink_ethtool_ksettings_get();
  1)   5.110 us    |  phylink_ethtool_ksettings_get();
  1)   4.730 us    |  phylink_ethtool_ksettings_get();

I filtered the output down to the unique set of calls that were traced 
while lan1 was being brought up:

root@...i:~# cat /sys/kernel/tracing/trace | fgrep '|' | cut -d '|' -f 2 | sort | uniq

       dsa_port_phylink_mac_config();
     genphy_resume() {
     marvell_read_page();
     marvell_write_page();
     phylink_get_fixed_state();
     phylink_get_ksettings();
     phylink_mac_config() {
     phylink_resolve_flow();
     phylink_run_resolve.part.0();
     }
     } /* genphy_resume */
   m88e1510_config_aneg() {
   marvell_read_status();
   phylink_ethtool_ksettings_get() {
   phylink_ethtool_ksettings_get();
   phylink_phy_change() {
   phylink_resolve() {
   phylink_start() {
   }
   } /* phylink_start */
root@...i:~#
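
A variant of the pipeline above could also count how often each function
shows up, and drop the bare closing braces. This is a rough sketch, not
something I ran on the target; count_traced_calls is a made-up name, and it
assumes the function_graph output format shown in the dump above.

```shell
#!/bin/sh
# Hypothetical helper: reads function_graph output on stdin and prints
# "count function()" lines, most frequent first.
count_traced_calls() {
    grep -F '|' \
        | cut -d '|' -f 2 \
        | sed -e 's/^ *//' -e 's/[ ;{]*$//' \
        | grep -E '^[A-Za-z_][A-Za-z0-9_.]*\(\)$' \
        | sort | uniq -c | sort -rn
}

# Example call:
#   count_traced_calls < /sys/kernel/tracing/trace
```

The sed step strips indentation and the trailing ";" or "{", so a leaf call
and a call that opens a nested block are counted as the same function. The
final grep keeps only lines that look like "name()", which discards the
column header and the "}" lines.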

I will focus on adding more tracing to these specific functions, in the 
hope of narrowing down the link issue further.

Let me know if you have any other suggestions, in case I missed something.

Thanks!
