[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241125094622.65f0bb97@fedora.home>
Date: Mon, 25 Nov 2024 09:46:22 +0100
From: Maxime Chevallier <maxime.chevallier@...tlin.com>
To: Andrew Lunn <andrew@...n.ch>
Cc: Oleksij Rempel <o.rempel@...gutronix.de>, netdev@...r.kernel.org,
pabeni@...hat.com, kuba@...nel.org, edumazet@...gle.com,
davem@...emloft.net, Russell King <linux@...linux.org.uk>, Romain Gantois
<romain.gantois@...tlin.com>, Florian Fainelli <f.fainelli@...il.com>,
Vladimir Oltean <olteanv@...il.com>, Heiner Kallweit
<hkallweit1@...il.com>, Kory Maincent <kory.maincent@...tlin.com>, Thomas
Petazzoni <thomas.petazzoni@...tlin.com>
Subject: Re: Moving forward with stacked PHYs
Hi Andrew,
On Thu, 21 Nov 2024 15:00:00 +0100
Andrew Lunn <andrew@...n.ch> wrote:
> > > On that front, there's more work coming to improve on that :
> > > - It would be good if the stats reported through ethtool would account
> > > for all the PHYs on the link, so that we would get a full overview
> > > of all stats from all components of the link
> >
> > Ack, i'm working right now on the standardized PHY stats. With some work on
> > kernel side it would be possible to get with something like this:
> > ethtool -S eth1 --all-groups
>
> Naming becomes interesting, when you have multiple PHYs. You need to
> indicate which PHY the stats are from. So it will need nested netlink
> properties. I don't think we have anything like that at the moment for
> ethtool statistics, but it does exist in other places, e.g. cable
> testing results.
>
> > > - A netlink notification to indicate that a new PHY was found on the
> > > link is also in the plans.
> >
> > This will be cool. I was thinking about netlink notification for some
> > health issues, to avoid polling on the user side.
>
> We need to think about when we send the notification. During
> enumeration of the MDIO bus, during probe, or when the PHY is
> connected to its upstream?
The way I see this is based on the phy_link_topology. What is done
currently is that the SFP PHY is added to the topology when :
- The .connect_phy() SFP upstream op is called, but ONLY if the
upstream PHY is attached to its netdev (otherwise, the upstream PHY
isn't in the topology)
- Alternatively if the SFP phy is already attached to its upstream,
both the upstream PHY and the SFP PHY will be added to the tpopology
when the upstream PHY gets attached to its netdev.
The notification would be sent at that time. We can't really send it
before the PHYs are part of the topology because at that point we don't
know to which netdev it belongs.
> What are the userspaces user cases for this
> notification?
The way I see that, based on the appereance of PHYs, userspace may want
to re-ajust configuration, especially if :
- The PHY is attached in .ndo_open() and
- The PHY provides some kind of offloading capability (Timestamping,
maybe more such as macsec)
In that case, it's possible that userspace is interested in knowing
that a new PHY is here to re-adjust the offloads to the PHY.
But maybe a more correct approach would be a per-feature notif, such as
"there's a new timestamper on the link".
Some user also reported not so long ago the need to know when the SFP
PHY was attached to reconfigure the advertising.
> > > There's a lot more work to do however, as these user-facing commands aren't
> > > the only place where netdev->phydev is used.
> > >
> > > The first place are the MAC drivers. For this, there seems to be 2 options :
> > >
> > > - move to phylink. The phylink API is pretty well designed as it makes
> > > the phydev totally irrelevant for the MAC driver. The MAC driver doesn't
> > > even need to know if it's connected to a PHY or something else (and SFP
> > > cage for example). That's nice, but there are still plenty of drivers
> > > that don't use phylink.
> >
> > This would be nice, but this is too much work. Last week I was porting lan78xx
> > to PHYlink. Plain porting is some hours of work, the 80% of time is
> > testing and fixing different issues. I do no think there are enough
> > resources to port all of drivers.
>
> It is clear that we the Maintainers cannot convert all MAC drivers to
> phylink. But i would be happy to say if you want to support multiple
> PHYs, you need to use phylink. To some extent, it is already that
> way. I don't think there are any systems with SFPs using phylib. So i
> would not change the phylib API. Keep netdev->phydev meaning the first
> PHY, and maybe encode the assumption that it is the only PHY with a
> netdev_warn() if that assumption gets violated.
This is fine by me, this makes the path towards better isolation
between netdev and phydev much clearer, thanks Andrew :)
> > > There are other parts of the kernel that accesses the netdev->phydev. There
> > > are a few places in DSA where this is done.
>
> Many of the DSA drivers are now phylink, not phylib. Any using SFPs
> will be phylink. I would leave those using phylib alone, same as other
> MAC drivers.
No problem
> > > Finally, there's the whole work of actually configuring the PHY themselves
> > > correctly when they are chained to one another, and getting the logic right
> > > for the link-up/down detection, getting the ksettings values aggregated
> > > correctly, etc.
> > >
> > > We have some local patches as well for that, to handle the stacked PHY
> > > state-machine synchronisation, link-reporting and negociation but
> > > it's still WIP to cover all the corner-cases and hard-to-test features, for
> > > example how to deal with WoL or EEE in these situations.
> >
> > I assume, even with stacked PHYs, some kind of power management would
> > be needed. The WoL is probably less interesting, since all attached PHYs are
> > under linux control, so we can suspend or resume them.
> >
> > The EEE on other hand, may help to reduce run time power consumption.
> > Still, it the user space should know about it, since it may be critical
> > for time critical tasks.
>
> We also need to consider where in the chain EEE, WoL, Pause etc are
> relevant. Both EEE and Pause it about negotiation, and that happen in
> the PHY closet to the media. WoL in theory could happen anywhere, but
> ideally you want it as close to the media as possible, so you can
> power off intermediary PHYs. But can PHYs in SFP actually wake the
> system? I don't think i have ever seen that, there is no interrupt pin
> on the SFP cage. However WoL is currently a mess and MAC driver
> writers get is wrong. I would say WoL is another area we need a new
> API moving as much code into the core as possible, same as we have
> done for EEE, phylink handling of Pause etc. But that is probably out
> of scope for this work, but should be kept in mind.
Indeed, to me at least it's a bit out of scope, but yes that's
something worth keeping in mind.
Thanks a lot for the answers Andrew and Oleksij.
Best regards,
Maxime
Powered by blists - more mailing lists