lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ZwW8-Xi8sStL50uw@makrotopia.org>
Date: Wed, 9 Oct 2024 00:15:05 +0100
From: Daniel Golle <daniel@...rotopia.org>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: Andrew Lunn <andrew@...n.ch>, Heiner Kallweit <hkallweit1@...il.com>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next] net: phy: realtek: check validity of 10GbE
 link-partner advertisement

Hi Russell,

On Tue, Oct 08, 2024 at 03:27:21PM +0100, Russell King (Oracle) wrote:
> Okay, I think the problem is down to the order in which Realtek is
> doing stuff.
> [...]
> Now, rtl822x_read_status() reads the 10G status, modifying
> phydev->lp_advertising before then going on to call
> rtlgen_read_status(), which then calls genphy_read_status(), which
> in turn will then call genphy_read_lpa().
> 
> First, this is the wrong way around. Realtek needs to call
> genphy_read_status() so that phydev->link and phydev->autoneg_complete
> are both updated to the current status.

First of all thanks a lot for diving down that rabbit hole with me!

> 
> Then, it needs to check whether AN is enabled, and whether autoneg
> has completed and deal with both situations.
> 
> Afterwards, it then *possibly* needs to read its speed register and
> decode that to phydev->speed, but I don't see the point of that when
> it's (a) not able to also decode the duplex from that register, and
> (b) when we've already resolved it ourselves from the link mode.
> What I'd be worried about is if the PHY does a down-shift to a
> different speed _and_ duplex from what was resolved - and thus
> whether we should even be enabling downshift on this PHY. Maybe
> there's a bit in 0xa43 0x12 that gives us the duplex as well?
> 
> In other words:
> 
> static int rtl822x_read_status(struct phy_device *phydev)
> {
> 	int lpadv, ret;
> 
> 	ret = rtlgen_read_status(phydev);
> 	if (ret < 0)
> 		return ret;
> 
> 	if (phydev->autoneg == AUTONEG_DISABLE)
> 		return 0;
> 
> 	if (!phydev->autoneg_complete) {
> 		mii_10gbt_stat_mod_linkmode_lpa_t(phydev->lp_advertising, 0);
> 		return 0;
> 	}
> 
> 	lpadv = phy_read_paged(phydev, 0xa5d, 0x13);
> 	if (lpadv < 0)
> 		return lpadv;
> 
> 	mii_10gbt_stat_mod_linkmode_lpa_t(phydev->lp_advertising, lpadv);
> 	phy_resolve_aneg_linkmode(phydev);
> 
> 	return 0;
> }
> 
> That should at least get proper behaviour in the link partner
> advertising bitmap rather than the weirdness that Realtek is doing.
> (BTW, other drivers should be audited for the same bug!)

Got it, always do genphy_read_status() first thing, as that will
clear things and set autoneg_complete.

Similarly, when dealing with the same PHY in C45 mode, I noticed that
phy->autoneg_complete never gets set, but rather we have to check it
via genphy_c45_aneg_done(phydev) and clear bits set by
mii_stat1000_mod_linkmode_lpa_t().

Doing so for C45 access, and following your suggestion above for C22
resolves the issue without any need to check MDIO_AN_10GBT_STAT_LOCOK
or MDIO_AN_10GBT_STAT_REMOK.

> [...]
> However, if we keep the rtlgen_decode_speed() stuff, and can fix the
> duplex issue, then the phy_resolve_aneg_linkmode() calls should not
> be necessary, and it should be moved _after_ this to ensure that
> phydev->speed (and phydev->duplex) are correctly set.

PHY Specific Status Register, MMD 31.0xA434 also carries duplex
information in bit 3 as well as more useful information.
Probably rtlgen_decode_speed() should be renamed to rtlgen_decode_physr()
and decode most of that.

I'll post a series taking care of all of that shortly.


Again, thanks a lot for the extremely insightful lesson!


Cheers


Daniel

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ