Message-ID: <4a0bf7cf-d108-49ac-ac7c-6136a070c44b@intel.com>
Date: Mon, 6 May 2024 14:18:33 -0700
From: Jacob Keller <jacob.e.keller@...el.com>
To: <kernel.org-fo5k2w@...arbi.fr>, Jeff Daly <jeffd@...icom-usa.com>
CC: <anthony.l.nguyen@...el.com>, <intel-wired-lan@...ts.osuosl.org>,
<jesse.brandeburg@...el.com>, <netdev@...r.kernel.org>
Subject: Re: Non-functional ixgbe driver between Intel X553 chipset and Cisco
switch via kernel >=6.1 under Debian
On 5/4/2024 6:29 AM, kernel.org-fo5k2w@...arbi.fr wrote:
> Hi,
>
> > I haven't touched the ixgbe driver and hardware in many years, but I'll
> try to see what I can do to help.
>
> Thank you very much for your reply. I'll answer you point by point.
> I upgraded the Qotom to Debian 13 (testing) with kernel 6.6.15 (amd64)
> to be even more up to date.
> A quick test with Fedora 40 shows the same problem.
>
>
Thanks for the detailed information.
> > So everything works when connected back to back with the Connectx-3. Ok.
>
> Yes, exactly. Everything works as expected with the Connectx-3.
>
>
> > To confirm, you use the same cable in both cases?
>
> Yes, the same cable. I tested two different models:
> - 1 Cisco SFP-H10GB-CU1M (1 meter)
> - 1 Cisco SFP-H10GB-CU3M (3 meters)
>
> I'm only using the SFP-H10GB-CU3M for the rest for convenience.
>
>
> > But on the switch, the link is reported up until we bring the interface
> > up in ixgbe, and then link drops and stays down indefinitely?
>
> After initial start-up of the Qotom :
> # Port 10Gbe LEDs are green (please note that the MAC address OID -
> 20:7c:14 - is registered to Qotom, not Intel).
> ip link show dev eno1
> 7: eno1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
> DEFAULT group default qlen 1000
> link/ether 20:7c:14:xx:xx:xx brd ff:ff:ff:ff:ff:ff
> altname enp11s0f0
>
> # Cisco (Green LEDs - port mounted)
> show running-config | section interface TenGigabitEthernet1/0/1
> interface TenGigabitEthernet1/0/1
> no cdp enable
>
> show interface status | include Te1/0/1
> Te1/0/1 --- Vers Qotom --- connected trunk full 10G
> SFP-10GBase-CX1
>
> show ip interface brief | include Te1/0/1 | Status
> Interface IP-Address OK? Method Status
> Protocol
> Te1/0/1 unassigned YES unset up up
>
> The Cisco and Qotom ports are lit and flashing as if they were
> exchanging ARP or STP traffic. A mirror port on the Cisco's 10Gbe
> interface, however, shows no frame exchange. I connected a PC to port
> g1/0/13 with Wireshark for this test.
>
> monitor session 1 source interface t1/0/1 both
> monitor session 1 destination interface g1/0/13
>
> Port switch-on test :
> # Starting up the Qotom 10Gbe network interface
> ip link set eno1 up
> [ 1770.476075] pps pps5: new PPS source ptp5
> [ 1770.480784] ixgbe 0000:0b:00.0: registered PHC device on eno1
> [ 1770.575496] ixgbe 0000:0b:00.0 eno1: detected SFP+: 3
>
> # The ports on both devices switch off immediately.
> # There's no going back:
> ip link set eno1 down
> [ 1831.329797] ixgbe 0000:0b:00.0: removed PHC on eno1
>
> # The ports are always off on both sides even when unloading the ixgbe
> core module and plugging/unplugging the Cisco SFP-H10GB-CU3M :
> rmmod ixgbe
> [ 1872.503663] ixgbe 0000:0d:00.1: complete
> [ 1872.547628] ixgbe 0000:0d:00.0: complete
> [ 1872.591645] ixgbe 0000:0b:00.1: complete
> [ 1872.631725] ixgbe 0000:0b:00.0: complete
>
> A reboot is the only way to restore this port switch-on state.
> On startup, the Cisco switch displays the following logs (the date is
> not configured):
> Sep 30 14:33:00: %LINK-3-UPDOWN: Interface TenGigabitEthernet1/0/1,
> changed state to up
> Sep 30 14:33:01: %LINEPROTO-5-UPDOWN: Line protocol on Interface
> TenGigabitEthernet1/0/1, changed state to up
>
>
> > But if you use the out-of-tree ixgbe driver everything works. Hmm.
>
> Yes, that's exactly it. The driver on the Intel site works perfectly.
>
> > I tried checking the out-of-tree versions to see if there were any
> > obvious fixes. I didn't find anything. The code between the in-kernel
> > and out-of-tree is so different that it is hard to track down. At first
> > I wondered if this might be a regression due to recent changes to
> > support new hardware, but it appears that v6.1 is from before a lot of
> > that work went in.
>
> If it helps, vesalius' post of December 3, 2023 on one of the links in
> my original post
> (https://forum.proxmox.com/threads/intel-x553-sfp-ixgbe-no-go-on-pve8.135129/post-612291)
> reports that the following commit has been suspected as the culprit:
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v6.1.63&id=565736048bd5f9888990569993c6b6bfdf6dcb6d
>
I'm taking a look at this commit. I see that it was done by someone from
Silicom, and says the following:
> ixgbe: Manual AN-37 for troublesome link partners for X550 SFI
> Some (Juniper MX5) SFP link partners exhibit a disinclination to
> autonegotiate with X550 configured in SFI mode. This patch enables
> a manual AN-37 restart to work around the problem.
So it appears that it's disabling autonegotiation.
> I quote the end of his message:
> "An amazon employee states reverting this commit and recompiling the
> kernel allows their similar network hardware to use the current in-tree
> 6.1 ixgbe driver. Otherwise as stated in the VyOS forum thread linked
> above compiling the linux kernel with the out-of-tree intel ixgbe driver
> 5.19.6 works too."
>
>
> > 1. The kernel message logs from when you bring up the interface. You can
> get this from dmesg or journalctl -k if you have systemd.
>
> The kernel returns only the following three lines after a "ip link set
> eno1 up" :
> mai 04 12:01:21 servyo kernel: pps pps5: new PPS source ptp5
> mai 04 12:01:21 servyo kernel: ixgbe 0000:0b:00.0: registered PHC device
> on eno1
> mai 04 12:01:21 servyo kernel: ixgbe 0000:0b:00.0 eno1: detected SFP+: 3
>
The logs show the device coming up and it detects the SFP, but we don't
see a link up status. Ok.
> > 2. "ethtool eno1" after you bring the interface up to see what it
> reports about link
>
> ethtool eno1
> Settings for eno1:
> Supported ports: [ FIBRE ]
> Supported link modes: 10000baseT/Full
> Supported pause frame use: Symmetric
> Supports auto-negotiation: No
> Supported FEC modes: Not reported
> Advertised link modes: 10000baseT/Full
> Advertised pause frame use: Symmetric
> Advertised auto-negotiation: No
> Advertised FEC modes: Not reported
> Speed: Unknown!
> Duplex: Unknown! (255)
> Auto-negotiation: off
> Port: Direct Attach Copper
> PHYAD: 0
> Transceiver: internal
> Supports Wake-on: d
> Wake-on: d
> Current message level: 0x00000007 (7)
> drv probe link
> Link detected: no
>
No link detected, but it does detect this is a 10GBaseT cable.
Interesting that it doesn't report FEC or autonegotiation. Hmm.
>
> > 3. "ethtool -S eno1" to see if any other stats are reported that might
> help us isolate what's going on.
>
> ethtool -S eno1
> NIC statistics:
Snipped the stats. It looks like there wasn't much useful there. No
traffic was sent, and there is only this lsc_int count of 1, which
indicates that a check link status interrupt was fired, but it was only
triggered once.
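For anyone repeating this check, a small filter helps pick the non-zero
counters out of the long "ethtool -S" listing. This is a minimal sketch,
assuming the usual "name: value" output layout; the function name
nonzero_stats is hypothetical.

```shell
#!/bin/sh
# Print only the counters whose value is non-zero.
# Reads "ethtool -S <iface>" output on stdin (hypothetical helper).
nonzero_stats() {
    awk -F': *' 'NF == 2 && $2 + 0 != 0 {
        sub(/^[ \t]+/, "", $1)   # drop the leading indentation
        print $1 ": " $2
    }'
}
```

Usage would be "ethtool -S eno1 | nonzero_stats".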
>
> > Do you happen to know if any particular in-kernel driver version worked?
> > It would help limit the search for regressing commits.
>
> I can't retrieve the driver version itself via a “modinfo ixgbe” (no
> field mentions it) but the driver built into Debian 11 kernel
> 5.10.0-10-amd64 works perfectly. Debian 12's 6.1.76-amd64 and Debian
> 13's 6.6.15-amd64 are problematic. If you have a method of retrieving
> more precise information, I'd be delighted to provide it.
> The problem therefore “spread” between the release of Linux >5.10 and >=6.1.
>
Knowing the kernel version is the important part; we don't have specific
versioning of drivers in the kernel anymore.
> On Linux 5.10.0-10, an ethtool returns this (the port works):
> ethtool eno1
> Settings for eno1:
> Supported ports: [ FIBRE ]
> Supported link modes: 10000baseT/Full
> Supported pause frame use: Symmetric
> Supports auto-negotiation: No
> Supported FEC modes: Not reported
> Advertised link modes: 10000baseT/Full
> Advertised pause frame use: Symmetric
> Advertised auto-negotiation: No
Interestingly, this does appear to still list autonegotiation as disabled.
> Advertised FEC modes: Not reported
> Speed: 10000Mb/s
> Duplex: Full
> Auto-negotiation: off
> Port: Direct Attach Copper
> PHYAD: 0
> Transceiver: internal
> Supports Wake-on: d
> Wake-on: d
> Current message level: 0x00000007 (7)
> drv probe link
> Link detected: yes
>
>
> > Ideally, if you could use git bisect on the setup that could
> > efficiently locate what regressed the behavior.
>
> I really want to, but I have no idea how to go about it. Can you write
> me the command lines to satisfy your request?
>
The steps would require that you build the kernel manually. I can
outline the steps I would take here:
1. get the kernel source from git.kernel.org. I place it in $HOME/git/linux
2. switch to v5.10 with 'git switch --detach v5.10'
3. copy the Debian 5.10 config file to $HOME/git/linux/.config
4. build the kernel with 'make -j24' (adjust -j depending on how much CPU
you want to spend building the kernel)
5. install with 'sudo make -j24 modules_install && sudo make install'
6. reboot and select the v5.10 kernel, double check it works.
7. in $HOME/git/linux run 'git bisect start' to initiate the bisect session.
8. First, label the current v5.10 commit as good with 'git bisect good'
9. Second, label the v6.1 commit as bad with 'git bisect bad v6.1'
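The one-time setup above can be sketched as a small shell script. Treat
it as a sketch under assumptions rather than a tested recipe: the run
helper, the KSRC, JOBS, DEBIAN_CONFIG and KERNEL_URL variables, and the
function names are all hypothetical; the clone URL is assumed to be the
stable tree on git.kernel.org, and the config path assumes Debian's
/boot/config-<version> convention. By default it only prints the
commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# One-time setup for the bisect, mirroring the steps listed above.
# All variable and function names here are hypothetical helpers.
KSRC="${KSRC:-$HOME/git/linux}"    # where the kernel tree lives
JOBS="${JOBS:-24}"                 # parallel build jobs, adjust to taste
# Assumption: Debian keeps the installed kernel's config under /boot.
DEBIAN_CONFIG="${DEBIAN_CONFIG:-/boot/config-5.10.0-10-amd64}"
# Assumption: the stable tree, which carries both the v5.10 and v6.1 tags.
KERNEL_URL="https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git"

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Fetch the source, check out v5.10, then build and install it.
build_baseline() {
    run git clone "$KERNEL_URL" "$KSRC" &&
    run cd "$KSRC" &&
    run git switch --detach v5.10 &&
    run cp "$DEBIAN_CONFIG" .config &&
    run make -j"$JOBS" &&
    run sudo make -j"$JOBS" modules_install &&
    run sudo make install
}

# Run only after rebooting into v5.10 and confirming the link works.
start_bisect() {
    run cd "$KSRC" &&
    run git bisect start &&
    run git bisect good &&
    run git bisect bad v6.1
}
```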
This will initiate a bisect session and will checkout the kernel
approximately halfway between v5.10 and v6.1. For each bisection point
it checks, run the following steps:
1. 'make olddefconfig' to update the configuration for this version
2. 'make -j24' to rebuild with the current version
3. 'sudo make -j24 modules_install && sudo make install' to install this
version.
4. reboot into that version and check its behavior.
5. If it works properly then run 'git bisect good'
6. If it works incorrectly, then run 'git bisect bad'
A new commit will be selected. It will pick one between the latest good
point and the closest bad point, essentially homing in on the
incorrect behavior.
If for any reason a commit can't be built or tested, you can use "git
bisect skip" and it will skip around a bit to find another point that
can be tried.
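The per-commit loop above could be wrapped like this. Again, this is only
a sketch: run, bisect_build, and bisect_verdict are hypothetical helper
names, and the default is a dry run that only prints the commands
(set DRY_RUN=0 to execute them).

```shell
#!/bin/sh
# Per-commit bisect loop: build and install, reboot by hand, then report.
JOBS="${JOBS:-24}"   # parallel build jobs, adjust to taste

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Rebuild and install the commit that git bisect has checked out.
bisect_build() {
    run make olddefconfig &&
    run make -j"$JOBS" &&
    run sudo make -j"$JOBS" modules_install &&
    run sudo make install
}

# Translate a manual test result into the matching git bisect command.
#   works      -> link comes up after "ip link set eno1 up"
#   broken     -> link drops and stays down
#   untestable -> the commit cannot be built or tested
bisect_verdict() {
    case "$1" in
        works)      run git bisect good ;;
        broken)     run git bisect bad ;;
        untestable) run git bisect skip ;;
        *)          echo "usage: bisect_verdict works|broken|untestable" >&2
                    return 2 ;;
    esac
}
```

Run bisect_build from $HOME/git/linux, reboot into the new kernel, test
the link, then call bisect_verdict with the result. "git bisect log"
shows the progress so far, and "git bisect reset" returns to the original
checkout once the first bad commit has been reported.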
It's a lot, but it would help us home in on the exact failure. I think
it's OK if you can't do that. I am checking the out-of-tree and upstream
contents around that AN-37 commit.
The upstream implementation of ixgbe_setup_sfi_x550a is:
> static int ixgbe_setup_sfi_x550a(struct ixgbe_hw *hw, ixgbe_link_speed *speed)
> {
> struct ixgbe_mac_info *mac = &hw->mac;
> u32 reg_val;
> int status;
>
> /* Disable all AN and force speed to 10G Serial. */
> status = mac->ops.read_iosf_sb_reg(hw,
> IXGBE_KRM_PMD_FLX_MASK_ST20(hw->bus.lan_id),
> IXGBE_SB_IOSF_TARGET_KR_PHY, &reg_val);
> if (status)
> return status;
>
> reg_val &= ~IXGBE_KRM_PMD_FLX_MASK_ST20_AN_EN;
> reg_val &= ~IXGBE_KRM_PMD_FLX_MASK_ST20_AN37_EN;
> reg_val &= ~IXGBE_KRM_PMD_FLX_MASK_ST20_SGMII_EN;
> reg_val &= ~IXGBE_KRM_PMD_FLX_MASK_ST20_SPEED_MASK;
>
> /* Select forced link speed for internal PHY. */
> switch (*speed) {
> case IXGBE_LINK_SPEED_10GB_FULL:
> reg_val |= IXGBE_KRM_PMD_FLX_MASK_ST20_SPEED_10G;
> break;
> case IXGBE_LINK_SPEED_1GB_FULL:
> reg_val |= IXGBE_KRM_PMD_FLX_MASK_ST20_SPEED_1G;
> break;
> default:
> /* Other link speeds are not supported by internal PHY. */
> return -EINVAL;
> }
>
> (void)mac->ops.write_iosf_sb_reg(hw,
> IXGBE_KRM_PMD_FLX_MASK_ST20(hw->bus.lan_id),
> IXGBE_SB_IOSF_TARGET_KR_PHY, reg_val);
>
> /* change mode enforcement rules to hybrid */
> (void)mac->ops.read_iosf_sb_reg(hw,
> IXGBE_KRM_FLX_TMRS_CTRL_ST31(hw->bus.lan_id),
> IXGBE_SB_IOSF_TARGET_KR_PHY, &reg_val);
> reg_val |= 0x0400;
>
> (void)mac->ops.write_iosf_sb_reg(hw,
> IXGBE_KRM_FLX_TMRS_CTRL_ST31(hw->bus.lan_id),
> IXGBE_SB_IOSF_TARGET_KR_PHY, reg_val);
>
> /* manually control the config */
> (void)mac->ops.read_iosf_sb_reg(hw,
> IXGBE_KRM_LINK_CTRL_1(hw->bus.lan_id),
> IXGBE_SB_IOSF_TARGET_KR_PHY, &reg_val);
> reg_val |= 0x20002240;
>
> (void)mac->ops.write_iosf_sb_reg(hw,
> IXGBE_KRM_LINK_CTRL_1(hw->bus.lan_id),
> IXGBE_SB_IOSF_TARGET_KR_PHY, reg_val);
>
> /* move the AN base page values */
> (void)mac->ops.read_iosf_sb_reg(hw,
> IXGBE_KRM_PCS_KX_AN(hw->bus.lan_id),
> IXGBE_SB_IOSF_TARGET_KR_PHY, &reg_val);
> reg_val |= 0x1;
> (void)mac->ops.write_iosf_sb_reg(hw,
> IXGBE_KRM_PCS_KX_AN(hw->bus.lan_id),
> IXGBE_SB_IOSF_TARGET_KR_PHY, reg_val);
>
> /* set the AN37 over CB mode */
> (void)mac->ops.read_iosf_sb_reg(hw,
> IXGBE_KRM_AN_CNTL_4(hw->bus.lan_id),
> IXGBE_SB_IOSF_TARGET_KR_PHY, &reg_val);
> reg_val |= 0x20000000;
>
> (void)mac->ops.write_iosf_sb_reg(hw,
> IXGBE_KRM_AN_CNTL_4(hw->bus.lan_id),
> IXGBE_SB_IOSF_TARGET_KR_PHY, reg_val);
>
> /* restart AN manually */
> (void)mac->ops.read_iosf_sb_reg(hw,
> IXGBE_KRM_LINK_CTRL_1(hw->bus.lan_id),
> IXGBE_SB_IOSF_TARGET_KR_PHY, &reg_val);
> reg_val |= IXGBE_KRM_LINK_CTRL_1_TETH_AN_RESTART;
>
> (void)mac->ops.write_iosf_sb_reg(hw,
> IXGBE_KRM_LINK_CTRL_1(hw->bus.lan_id),
> IXGBE_SB_IOSF_TARGET_KR_PHY, reg_val);
>
> /* Toggle port SW reset by AN reset. */
> status = ixgbe_restart_an_internal_phy_x550em(hw);
>
> return status;
> }
The out-of-tree implementation appears to lack the change made by the
Silicom folks:
> static s32 ixgbe_setup_sfi_x550a(struct ixgbe_hw *hw, ixgbe_link_speed *speed)
> {
> struct ixgbe_mac_info *mac = &hw->mac;
> s32 status;
> u32 reg_val;
>
> /* Disable all AN and force speed to 10G Serial. */
> status = mac->ops.read_iosf_sb_reg(hw,
> IXGBE_KRM_PMD_FLX_MASK_ST20(hw->bus.lan_id),
> IXGBE_SB_IOSF_TARGET_KR_PHY, &reg_val);
> if (status != 0)
> return status;
>
> reg_val &= ~IXGBE_KRM_PMD_FLX_MASK_ST20_AN_EN;
> reg_val &= ~IXGBE_KRM_PMD_FLX_MASK_ST20_AN37_EN;
> reg_val &= ~IXGBE_KRM_PMD_FLX_MASK_ST20_SGMII_EN;
> reg_val &= ~IXGBE_KRM_PMD_FLX_MASK_ST20_SPEED_MASK;
>
> /* Select forced link speed for internal PHY. */
> switch (*speed) {
> case IXGBE_LINK_SPEED_10GB_FULL:
> reg_val |= IXGBE_KRM_PMD_FLX_MASK_ST20_SPEED_10G;
> break;
> case IXGBE_LINK_SPEED_1GB_FULL:
> reg_val |= IXGBE_KRM_PMD_FLX_MASK_ST20_SPEED_1G;
> break;
> default:
> /* Other link speeds are not supported by internal PHY. */
> return IXGBE_ERR_LINK_SETUP;
> }
>
> status = mac->ops.write_iosf_sb_reg(hw,
> IXGBE_KRM_PMD_FLX_MASK_ST20(hw->bus.lan_id),
> IXGBE_SB_IOSF_TARGET_KR_PHY, reg_val);
>
> /* Toggle port SW reset by AN reset. */
> status = ixgbe_restart_an_internal_phy_x550em(hw);
>
> return status;
> }
I suspect those changes must have broken the Cisco switch link behavior.
I unfortunately do not know enough about this hardware or the SFI
configuration to understand why this causes it.
If you don't want to try bisect, I would suggest trying to revert that
commit or simply replace the ixgbe_setup_sfi_x550a function with the one
from out-of-tree here. If you do that, you can rebuild just ixgbe with
"make M=drivers/net/ethernet/intel/ixgbe" and then insert the module
with "insmod drivers/net/ethernet/intel/ixgbe/ixgbe.ko".
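A possible sequence for that quick test, sketched as shell. Assumptions:
the tree in $HOME/git/linux is checked out at, configured for, and
already built for the kernel you are currently running (otherwise the
module will likely refuse to load), the revert applies cleanly, and the
in-tree ixgbe module is currently loaded. The helper names run and
test_revert and the variables are hypothetical. It defaults to a dry run
that only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Revert the suspected commit, rebuild only ixgbe, and swap the module in.
KSRC="${KSRC:-$HOME/git/linux}"
IXGBE_DIR="drivers/net/ethernet/intel/ixgbe"
# "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI"
SUSPECT="565736048bd5"
IFACE="${IFACE:-eno1}"   # interface name used earlier in this thread

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

test_revert() {
    run cd "$KSRC" &&
    run git revert --no-edit "$SUSPECT" &&
    run make M="$IXGBE_DIR" &&
    run sudo rmmod ixgbe &&                      # unload the in-tree module
    run sudo insmod "$IXGBE_DIR/ixgbe.ko" &&     # load the rebuilt one
    run sudo ip link set "$IFACE" up             # then watch dmesg and the switch
}
```

Since a reboot was needed to recover the port after a failed attempt, it
is probably best to run this right after a fresh boot, before the
interface has been brought up with the stock module.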
It seems likely that this change had an unintended side effect which broke
the Cisco switch linking.
I've added Jeff Daly, in the hopes that he could provide more details on
the change.
@Jeff, it seems likely that the change you made at 565736048bd5 ("ixgbe:
Manual AN-37 for troublesome link partners for X550 SFI") is breaking
some other switches. It would help if you could shed some light on this
change as otherwise we might need to revert it and once again break the
setup you fixed.
Thanks,
Jake