[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6dd3f82a-eea0-c9b4-8dd6-f2f313578b1f@gmail.com>
Date: Sun, 24 Feb 2019 10:10:57 +0100
From: Heiner Kallweit <hkallweit1@...il.com>
To: Andrew Lunn <andrew@...n.ch>
Cc: Russell King - ARM Linux admin <linux@...linux.org.uk>,
Florian Fainelli <f.fainelli@...il.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: No traffic with Marvell switch and latest linux-next
On 24.02.2019 00:42, Andrew Lunn wrote:
>> it took me quite some time to debug this issue ..
>>
>> At first a bisect pointed to one of my commits:
>> 5502b218e001 ("net: phy: use phy_resolve_aneg_linkmode in genphy_read_status")
>>
>> Further digging lead me to some suspicious dsa code:
>> In dsa_port_fixed_link_register_of() there's a call to genphy_read_status().
>> At the time of the call phydev->advertising is empty, therefore the fixed phy
>> settings are overwritten with defaults (10/half) what breaks the system.
>>
>> Worth to be mentioned is that for the PHY these two flags are set:
>> - is_pseudo_fixed_link (that's ok)
>> - autoneg -> I'm not sure this is correct.
>>
>> It seems that you once added the code in question:
>> 39b0c705195e ("net: dsa: Allow configuration of CPU & DSA port speeds/duplex")
>
> Hi Heiner
>
> For Ethernet switches, you can have device trees like this:
>
> switch0: switch@0 {
> compatible = "marvell,mv88e6085";
> pinctrl-0 = <&pinctrl_gpio_switch0>;
> pinctrl-names = "default";
> reg = <0>;
> dsa,member = <0 0>;
> interrupt-parent = <&gpio0>;
> interrupts = <27 IRQ_TYPE_LEVEL_LOW>;
> interrupt-controller;
> #interrupt-cells = <2>;
> eeprom-length = <512>;
>
> ports {
> #address-cells = <1>;
> #size-cells = <0>;
>
> port@0 {
> reg = <0>;
> label = "lan0";
> phy-handle = <&switch0phy0>;
> };
>
> ...
> switch0port5: port@5 {
> reg = <5>;
> label = "dsa";
> phy-mode = "rgmii-txid";
> link = <&switch1port6
> &switch2port9>;
> fixed-link {
> speed = <1000>;
> full-duplex;
> };
> };
>
> This is a DSA port. It is used to connect two Ethernet switches
> together. In this case, the MACs are connected back to back. There are
> no PHYs involved. These ports don't have a netdev associated with
> them, since they are just pipes between switches, not user interfaces.
>
> If i remember correctly, the code you are looking at was to deal with
> the "rgmii-txid". Without the TXID, i could not get frames to pass
> between the ports.
>
> There is a second use case as well. The Vybrid FEC ethernet controller
> is a Fast Ethernet port. The switch ports are 1G. DSA drivers set the
> port connecting to the SoC to its highest speed. Again, there is no
> netdev associated to the switch port, and generally the MAC ports are
> connected back to back. This 100 vs 1000 does not work for the
> Vybrid. So we add a fixed PHY to the CPU port.
>
> port@6 {
> reg = <6>;
> label = "cpu";
> ethernet = <&fec1>;
>
> fixed-link {
> speed = <100>;
> full-duplex;
> };
> };
>
> And there is a third use case:
>
> port@4 {
> reg = <4>;
> label = "optical4";
>
> fixed-link {
> speed = <1000>;
> full-duplex;
> link-gpios = <&gpio6 3
> GPIO_ACTIVE_HIGH>;
> };
> };
>
> We have a GPIO which tells us if the link it up, because there is not
> a PHY here, but an optical module. The GPIO tells us about loss of
> signal. Unfortunately we cannot use the SFP driver here, because of a
> hardware issue.
>
> In all cases end up in the same code. We want to tell the MAC to
> configure RGMII delays, or to configure the port speed, or to have the
> correct initial link state. The adjust_link() call can do this, if
> passed a phydev. And we have a phydev for the fixed-link PHY. However,
> since it has never been attached to a netdev, phy_start() called, etc,
> the state information is maybe not correct. So we call config_init()
> and read_status(), so phydev should reflect the state of the
> fixed-link PHY. And a fixed-link PHY is always c22, and it can be
> driven by genphy. Take a look at drivers/net/phy/swphy.c which is the
> core of the simulation, and fixed_phy.c
>
Thanks a lot for the very comprehensive explanation!
I think what's not correct is that phydev->autoneg is set
(by phy_device_create) for a fixed link. genphy_config_init() ensures
that aneg isn't set in the "supported" bitmap, but it doesn't care
about phydev->autoneg. IMO we need to consider this at few places:
- genphy_config_init
It should clear phydev->autoneg if aneg isn't supported.
- phy_probe
After having read the chip features we should clear phydev->autoneg
if aneg isn't supported.
- In __fixed_phy_register() we should clear phydev->autoneg.
>> I did what I like to do most and removed some code.
>> W/o the calls to genphy_config_init() and genphy_read_status() it works again.
>> Do these calls have some purpose here with a fixed link?
>
> Maybe with all the core changes, these calls are no longer needed? We
> just need to ensure the state in phydev reflects the state of the
> fixed link, i.e. in this case 100 Full.
>
I think at least the call to genphy_config_init can be removed. I don't
really like the name of this function anyway, because it doesn't
initialize anything but is a sanity check that reads chip capabilities
and removes unsupported modes from the "supported" bitmap.
genphy_read_status() we need to read the link status from GPIO
(if applicable in the use case). To be more precise, most likely
calling genphy_update_link() would be sufficient because speed and
duplex are fixed.
> Looking forward, at some point we are going to have to make fixed-link
> support higher speeds. That probably means we need a swphy-c45 which
> emulates the standard registers for 2.5G, 5G and 10G. At that point
> genphy will not work...
>
> Andrew
> .
>
Heiner
Powered by blists - more mailing lists