Message-ID: <Z0XEWGqLJ8okNSIr@shell.armlinux.org.uk>
Date: Tue, 26 Nov 2024 12:51:36 +0000
From: "Russell King (Oracle)" <linux@...linux.org.uk>
To: Andrew Lunn <andrew@...n.ch>, Heiner Kallweit <hkallweit1@...il.com>
Cc: Alexandre Torgue <alexandre.torgue@...s.st.com>,
	Andrew Lunn <andrew+netdev@...n.ch>,
	Bryan Whitehead <bryan.whitehead@...rochip.com>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Florian Fainelli <florian.fainelli@...adcom.com>,
	Jakub Kicinski <kuba@...nel.org>, Jose Abreu <joabreu@...opsys.com>,
	linux-arm-kernel@...ts.infradead.org,
	linux-stm32@...md-mailman.stormreply.com,
	Marcin Wojtas <marcin.s.wojtas@...il.com>,
	Maxime Coquelin <mcoquelin.stm32@...il.com>, netdev@...r.kernel.org,
	Oleksij Rempel <o.rempel@...gutronix.de>,
	Paolo Abeni <pabeni@...hat.com>, UNGLinuxDriver@...rochip.com
Subject: [PATCH RFC net-next 00/23] net: phylink managed EEE support

Hi,

Adding managed EEE support to phylink has been on the cards ever since
the idea in phylib was mooted. This overly large series attempts to do
so. I've included all the patches as it's important to get the driver
patches out there.

In doing this, I came across the fact that the addition of phylib
managed EEE support has actually broken a huge number of drivers -
phylib will now overwrite all members of struct ethtool_keee whether
the netdev driver wants it or not. This leads to weird scenarios where
doing a get_eee() op followed by a set_eee() op results in e.g.
tx_lpi_timer being zeroed, because the MAC driver doesn't know it needs
to initialise phylib's phydev->eee_cfg.tx_lpi_timer member. This mess
urgently needs addressing, and I believe it came about because Andrew's
patches were only partly merged via another party - I guess highlighting
the inherent danger of "thou shalt limit your patch series to no more
than 15 patches" when one has a subsystem whose in-kernel API is
changing.
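
To make the get_eee()/set_eee() round trip concrete, here is a minimal
standalone model of the failure. It is a sketch only, not kernel code:
every model_* name is made up for illustration, and the "fix" shown
(seeding the library's copy of the timer at setup) is just one assumed
way a driver could keep the two copies in step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in; NOT the kernel's struct ethtool_keee. */
struct model_keee {
	bool eee_enabled;
	bool tx_lpi_enabled;
	uint32_t tx_lpi_timer;
};

/* Stand-in for the PHY library's private copy of the user settings. */
struct model_phy {
	struct model_keee eee_cfg;
};

struct model_mac {
	struct model_phy phy;
	uint32_t hw_lpi_timer;	/* value programmed into the MAC hardware */
};

/* Models the library overwriting every member from its own state. */
static void model_phy_get_eee(const struct model_phy *phy,
			      struct model_keee *out)
{
	*out = phy->eee_cfg;
}

static void model_phy_set_eee(struct model_phy *phy,
			      const struct model_keee *in)
{
	phy->eee_cfg = *in;
}

/* Driver get_eee(): its own timer value is lost to the overwrite. */
static void model_mac_get_eee(const struct model_mac *mac,
			      struct model_keee *out)
{
	out->tx_lpi_timer = mac->hw_lpi_timer;
	model_phy_get_eee(&mac->phy, out);
}

/* Driver set_eee(): programs whatever userspace handed back. */
static void model_mac_set_eee(struct model_mac *mac,
			      const struct model_keee *in)
{
	mac->hw_lpi_timer = in->tx_lpi_timer;
	model_phy_set_eee(&mac->phy, in);
}

/* Assumed fix: initialise the library's copy alongside the hardware. */
static void model_mac_init_fixed(struct model_mac *mac, uint32_t timer)
{
	mac->hw_lpi_timer = timer;
	mac->phy.eee_cfg.tx_lpi_timer = timer;
}

/* Userspace reads the settings and writes them back unchanged. */
static uint32_t model_round_trip(struct model_mac *mac)
{
	struct model_keee tmp = { 0 };

	model_mac_get_eee(mac, &tmp);
	model_mac_set_eee(mac, &tmp);
	return mac->hw_lpi_timer;
}
```

In this model, a MAC that never initialised the library's copy ends up
with its hardware timer zeroed after the round trip, while one set up
through model_mac_init_fixed() keeps its value.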

I am ignoring that limit for this posting precisely because of this.
I think we need to have a discussion about it, because if it ends up
causing breakage, then we're doing something wrong.

One of the drivers that got broken was stmmac, so this series also
includes a number of patches that fix it before converting stmmac to
phylink managed EEE. I can point to many many more that are similarly
broken.

Also inflating this series are two important patches that have been
submitted for the NET tree, but which aren't yet part of the net-next
tree - thus making this series larger than really necessary. If it
weren't for both of these issues, then this series would be exactly
15 patches.

Anyway, these patches...

Patches 1 and 2 are patches that have been submitted and possibly applied
to the net tree.

Patch 3 changes the Marvell driver to use the state we store in
struct phy_device (phydev->eee_cfg.eee_enabled), rather than manually
calling genphy_c45_ethtool_get_eee().

Patch 4 avoids genphy_c45_ethtool_get_eee() setting ->eee_enabled, as
we copy that from phydev->eee_cfg.eee_enabled later, and after patch 3
no one uses this after calling genphy_c45_ethtool_get_eee(). In fact,
the only caller of this function now is phy_ethtool_get_eee().

As all callers to genphy_c45_eee_is_active() now pass NULL as its
is_enabled flag, this is no longer useful. Remove the argument in
patch 5.

Patch 6 updates the phylib documentation to make it absolutely clear
that phy_ethtool_get_eee() now fills in all members of struct
ethtool_keee, which is why we now have so many buggy network drivers.
We need to decide how to fix this mess.

Patch 7 adds a definition for the clock stop capable bit in the PCS
MMD status register.

Patch 8 adds a phylib API to query whether the PHY allows the transmit
xMII clock to be stopped while in LPI mode. This capability is for MAC
drivers to save power when LPI is active, to allow them to stop their
transmit clock.

Patch 9 adds another phylib API to configure whether the receive xMII
clock may be disabled by the PHY. We do have an existing API,
phy_init_eee(), but... it only allows the control bit to be set which
is weird - what if a boot firmware or previous kernel has set this bit
and we want it clear?
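
The difference between a set-only helper and one that can also clear
the bit can be sketched on plain register values. This is an
illustrative model, not the code from patches 7 to 9: the model_*
names are invented, and the bit positions (clock stop enable as bit 10
of PCS control 1, clock stop capable as bit 6 of PCS status 1) are
taken from my reading of IEEE 802.3 Clause 45 and should be checked
against include/uapi/linux/mdio.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, see the note above. */
#define MODEL_PCS_CTRL1_CLKSTOP_EN	0x0400u	/* control 1, bit 10 */
#define MODEL_PCS_STAT1_CLKSTOP_CAP	0x0040u	/* status 1, bit 6 */

/* Does the PHY allow the transmit xMII clock to stop during LPI? */
static bool model_tx_clk_stop_capable(uint16_t pcs_stat1)
{
	return (pcs_stat1 & MODEL_PCS_STAT1_CLKSTOP_CAP) != 0;
}

/* Set-only behaviour: a bit left set by firmware can never be cleared. */
static uint16_t model_set_only(uint16_t ctrl1, bool clk_stop_enable)
{
	if (clk_stop_enable)
		ctrl1 |= MODEL_PCS_CTRL1_CLKSTOP_EN;
	return ctrl1;
}

/* Read-modify-write: the bit always ends up in the requested state. */
static uint16_t model_set_or_clear(uint16_t ctrl1, bool clk_stop_enable)
{
	ctrl1 &= (uint16_t)~MODEL_PCS_CTRL1_CLKSTOP_EN;
	if (clk_stop_enable)
		ctrl1 |= MODEL_PCS_CTRL1_CLKSTOP_EN;
	return ctrl1;
}
```

With the bit already set on entry, model_set_only(reg, false) returns
the register unchanged, which is the boot firmware case described
above; model_set_or_clear(reg, false) clears it and leaves the other
bits alone.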

Patch 10 finally starts on the phylink parts of this, extracting from
phylink_resolve() the detection of link-up. (Yes, okay, I could've
dropped this patch, but with 23 patches, it's not going to make that
much difference.)

Patch 11 adds phylink managed EEE support. Two new MAC APIs are added,
to enable and disable LPI. The enable method is passed the LPI timer
setting which it is expected to program into the hardware, and also a
flag indicating whether the transmit clock should be stopped.
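
As a rough picture of how such a pair of MAC methods might be driven,
here is a standalone sketch. None of it is the phylink code from patch
11: the sketch_* types and functions are hypothetical, the argument
lists are assumptions based only on the description above, and the
enable method is modelled as unable to fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_mac;

/* Hypothetical ops table with one enable and one disable method. */
struct sketch_mac_ops {
	void (*enable_tx_lpi)(struct sketch_mac *mac, uint32_t timer,
			      bool tx_clk_stop);
	void (*disable_tx_lpi)(struct sketch_mac *mac);
};

struct sketch_mac {
	const struct sketch_mac_ops *ops;
	bool lpi_on;		/* core's view: is LPI currently enabled? */
	uint32_t hw_timer;	/* stand-in for the MAC's LPI timer register */
	bool hw_tx_clk_stop;	/* stand-in for the MAC's clock stop control */
};

/* Driver side: program the timer and clock stop setting as told. */
static void sketch_enable_tx_lpi(struct sketch_mac *mac, uint32_t timer,
				 bool tx_clk_stop)
{
	mac->hw_timer = timer;
	mac->hw_tx_clk_stop = tx_clk_stop;
}

static void sketch_disable_tx_lpi(struct sketch_mac *mac)
{
	mac->hw_tx_clk_stop = false;
}

static const struct sketch_mac_ops sketch_ops = {
	.enable_tx_lpi = sketch_enable_tx_lpi,
	.disable_tx_lpi = sketch_disable_tx_lpi,
};

/* Core side: enable LPI only while the link is up with EEE negotiated. */
static void sketch_link_change(struct sketch_mac *mac, bool link_up,
			       bool eee_negotiated, uint32_t timer,
			       bool tx_clk_stop)
{
	bool want = link_up && eee_negotiated;

	if (want && !mac->lpi_on) {
		mac->ops->enable_tx_lpi(mac, timer, tx_clk_stop);
		mac->lpi_on = true;
	} else if (!want && mac->lpi_on) {
		mac->ops->disable_tx_lpi(mac);
		mac->lpi_on = false;
	}
}
```

The point of the split is that the driver only programs hardware,
while the decision about when LPI should be on lives in one place.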

 *** There are open questions here. Eagle eyed reviewers will notice
   pl->config->lpi_interfaces. There are MACs out there which only
   support LPI signalling on a subset of their interface types. Phylib
   doesn't understand this. I'm handling this at the moment by simply
   not activating LPI at the MAC, but that leads to ethtool --show-eee
   suggesting that EEE is active when it isn't.
 *** Should we pass the phy_interface_t to these functions?
 *** Should mac_enable_tx_lpi() be allowed to fail if the MAC doesn't
   support the interface mode?

The Marvell MACs are an example of this - both NETA and PP2 only
support LPI signalling when connected via SGMII, which makes being
connected to a PHY which changes its link mode problematic.
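
The mismatch behind the open questions above can be shown with a small
standalone model. This is only a sketch: the sketch_* names, the
interface enumerators and the plain bitmask are all invented for
illustration and are not how pl->config->lpi_interfaces is actually
declared or used.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical interface modes a PHY might switch between. */
enum sketch_if {
	SKETCH_IF_SGMII,
	SKETCH_IF_2500BASEX,
	SKETCH_IF_10GBASER,
};

struct sketch_cfg {
	/* Bit n set: the MAC can signal LPI on interface mode n. */
	unsigned long lpi_interfaces;
};

struct sketch_eee_state {
	bool reported_active;	/* what the PHY library would report */
	bool mac_lpi_enabled;	/* whether the MAC really signals LPI */
};

static bool sketch_mac_can_lpi(const struct sketch_cfg *cfg,
			       enum sketch_if iface)
{
	return (cfg->lpi_interfaces & (1UL << iface)) != 0;
}

/*
 * Models "simply not activating LPI at the MAC": the PHY side still
 * believes EEE is active, but the MAC skips LPI on unsupported modes.
 */
static struct sketch_eee_state
sketch_resolve(const struct sketch_cfg *cfg, enum sketch_if iface,
	       bool phy_eee_active)
{
	struct sketch_eee_state st;

	st.reported_active = phy_eee_active;
	st.mac_lpi_enabled = phy_eee_active &&
			     sketch_mac_can_lpi(cfg, iface);
	return st;
}
```

For a MAC that only lists SGMII, resolving with the PHY in SGMII mode
gives two matching flags, while any other mode gives reported_active
set with mac_lpi_enabled clear, which is the misleading report
described above.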

The remainder of the patches address the driver sides, which are
necessary to actually test this stuff out. The exception is the stmmac
patches.

The first four stmmac patches show what is necessary across many drivers
to fix the current phylib EEE mess.

The 5th stmmac patch makes reporting of EEE errors dependent on whether
EEE is supported by stmmac or not - I can't see why one would want
anything else (maybe someone can enlighten me?)

The 6th stmmac patch converts to use phy_eee_rx_clock_stop(), thereby
ensuring that, if desired, the RX clock will not be stopped by the PHY
when in LPI mode (which as noted above is something that phy_init_eee()
doesn't do.) Given that we know stmmac has issues if the RX clock is
stopped, this could be a bug fix.

The final patch converts stmmac to phylink managed EEE.

 drivers/net/ethernet/marvell/mvneta.c              | 118 ++++++++++--------
 drivers/net/ethernet/marvell/mvpp2/mvpp2.h         |   5 +
 drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c    |  88 ++++++++++++++
 drivers/net/ethernet/microchip/lan743x_ethtool.c   |  21 ----
 drivers/net/ethernet/microchip/lan743x_main.c      |  39 ++++--
 drivers/net/ethernet/microchip/lan743x_main.h      |   1 -
 drivers/net/ethernet/stmicro/stmmac/stmmac.h       |   1 -
 .../net/ethernet/stmicro/stmmac/stmmac_ethtool.c   |  25 +---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |  68 ++++++++---
 drivers/net/phy/marvell.c                          |   4 +-
 drivers/net/phy/phy-c45.c                          |  15 +--
 drivers/net/phy/phy.c                              | 106 +++++++++++-----
 drivers/net/phy/phylink.c                          | 134 +++++++++++++++++++--
 include/linux/phy.h                                |   6 +-
 include/linux/phylink.h                            |  44 +++++++
 include/uapi/linux/mdio.h                          |   1 +
 16 files changed, 505 insertions(+), 171 deletions(-)

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
