Message-ID: <CA+_ehUyDOE-4_FD42BHKXjyT2kWxxWtpy_+HU2bwZXu9TRE7eg@mail.gmail.com>
Date: Fri, 11 Jul 2025 01:44:30 +0200
From: "Christian Marangi (Ansuel)" <ansuelsmth@...il.com>
To: Simon Horman <horms@...nel.org>
Cc: Sean Anderson <sean.anderson@...ux.dev>, Daniel Golle <daniel@...rotopia.org>, 
	netdev@...r.kernel.org, Andrew Lunn <andrew+netdev@...n.ch>, 
	"David S . Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, 
	Maxime Chevallier <maxime.chevallier@...tlin.com>, Russell King <linux@...linux.org.uk>, 
	Vineeth Karumanchi <vineeth.karumanchi@....com>, Heiner Kallweit <hkallweit1@...il.com>, 
	linux-kernel@...r.kernel.org, Kory Maincent <kory.maincent@...tlin.com>, 
	Lei Wei <quic_leiwei@...cinc.com>, Michal Simek <michal.simek@....com>, 
	Radhey Shyam Pandey <radhey.shyam.pandey@....com>, Robert Hancock <robert.hancock@...ian.com>, 
	John Crispin <john@...ozen.org>, Felix Fietkau <nbd@....name>, Robert Marko <robimarko@...il.com>
Subject: Re: [RFC] comparing the proposed implementation for standalone PCS drivers

On Wed, 9 Jul 2025 at 15:52, Simon Horman
<horms@...nel.org> wrote:
>
> On Fri, Jun 13, 2025 at 12:06:23PM -0400, Sean Anderson wrote:
> > On 6/13/25 08:55, Daniel Golle wrote:
> > > Hi netdev folks,
> > >
> > > there are currently 2 competing implementations for the groundworks to
> > > support standalone PCS drivers.
> > >
> > > https://patchwork.kernel.org/project/netdevbpf/list/?series=970582&state=%2A&archive=both
> > >
> > > https://patchwork.kernel.org/project/netdevbpf/list/?series=961784&state=%2A&archive=both
> > >
> > > They both kinda stalled due to a lack of feedback in the past 2 months
> > > since they have been published.
> > >
> > > Merging the 2 implementations is not a viable option due to rather large
> > > architectural differences:
> > >
> > >                                 | Sean                  | Ansuel
> > > --------------------------------+-----------------------+-----------------------
> > > Architecture                    | Standalone subsystem  | Built into phylink
> > > Need OPs wrapped                | Yes                   | No
> > > resource lifecycle              | New subsystem         | phylink
> > > Supports hot remove             | Yes                   | Yes
> > > Supports hot add                | Yes (*)               | Yes
> > > provides generic select_pcs     | No                    | Yes
> > > support for #pcs-cell-cells     | No                    | Yes
> > > allows migrating legacy drivers | Yes                   | Yes
> > > comes with tested migrations    | Yes                   | No
> > >
> > > (*) requires MAC driver to also unload and subsequent re-probe for link
> > > to work again
> > >
> > > Obviously both architectures have pros and cons, here an incomplete and
> > > certainly biased list (please help completing it and discussing all
> > > details):
> > >
> > > Standalone Subsystem (Sean)
> > >
> > > pros
> > > ====
> > >  * phylink code (mostly) untouched
> > >  * doesn't burden systems which don't use dedicated PCS drivers
> > >  * series provides tested migrations for all Ethernet drivers currently
> > >    using dedicated PCS drivers
> > >
> > > cons
> > > ====
> > >  * needs wrapper for each PCS OP
> > >  * more complex resource management (malloc/free)
> > >  * hot add and PCS showing up late (eg. due to deferred probe) are
> > >    problematic
> > >  * phylink is anyway the only user of that new subsystem
> >
> > I mean, if you want I can move the whole thing to live in phylink.c, but
> > that just enlarges the kernel if PCSs are not being used. The reverse
> > criticism can be made for Ansuel's series: most phylink users do not
> > have "dynamic" PCSs but the code is intimately integrated with phylink
> > anyway.
>
> At the risk of stating the obvious it seems to me that a key decision
> that needs to be made is whether a new subsystem is the correct direction.
>

To expand on that a bit: the decision is about a new subsystem plus making
things more deterministic.

> If I understand things correctly it seems that not creating a new subsystem
> is likely to lead to a simpler implementation, at least in the near term.
> While doing so lends itself towards greater flexibility in terms of users,
> I'd suggest a cleaner abstraction layer, and possibly a smaller footprint
> (I assume space consumed by unused code) for cases where PCS is not used.
>

Funnily enough, almost all implementations have an attached PCS, whether
it's something very basic or something more advanced (this is practically
always the case when 10g is supported).

So the cases where a PCS is not used are very few, and when it's not
used the cost is just an empty pointer and some bitmask for the PHY
interface.

> On the last point, I do wonder if there are other approaches to managing
> the footprint. And if so, that may tip the balance towards a new subsystem.
>
>
> Another way of framing this is: Say, hypothetically, Sean was to move his
> implementation into phylink.c. Then we might be able to have a clearer
> discussion of the merits of each implementation. Possibly driving towards
> common ground. But it seems hard to do so if we're unsure if there should
> be a new subsystem or not.
>

Honestly speaking, this case is very similar to some situations where Russell
had to intervene as the implementation reached criticality (a recent example is
EEE, where the only solution was to give phylink more info so the correct
decision could be made, preventing MAC drivers from doing strange broken stuff).

I still think that PCS handling in phylink should be improved.
For example, there is a big problem where phylink doesn't know exactly
which interfaces are supported by the PCS and which by the MAC, because
MAC drivers implement the common pattern of ORing the interfaces
supported by the MAC and by the different PCSs.
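
To make the ORing problem concrete, here is a minimal sketch in plain
userspace C. Every name in it (demo_iface, demo_pcs, demo_mac and the two
functions) is made up for illustration and is not phylink API; it only
assumes that each PCS can report its own mask of supported interfaces.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for PHY interface modes, not the kernel enum. */
enum demo_iface {
	DEMO_IFACE_RGMII,
	DEMO_IFACE_SGMII,
	DEMO_IFACE_USXGMII,
};

#define DEMO_BIT(mode)	(UINT32_C(1) << (mode))
#define DEMO_MAX_PCS	2

/* Hypothetical PCS: only carries the interfaces it can drive. */
struct demo_pcs {
	uint32_t supported;
};

/* Hypothetical MAC: native interfaces plus the PCSs attached to it. */
struct demo_mac {
	uint32_t native;
	const struct demo_pcs *pcs[DEMO_MAX_PCS];
};

/*
 * The common pattern: OR everything into one mask. The result says
 * which interfaces work, but not which block provides each of them.
 */
uint32_t demo_flattened_mask(const struct demo_mac *mac)
{
	uint32_t mask = mac->native;

	for (size_t i = 0; i < DEMO_MAX_PCS; i++)
		if (mac->pcs[i])
			mask |= mac->pcs[i]->supported;
	return mask;
}

/*
 * With the masks kept separate, generic code can pick the PCS for an
 * interface without a per-driver callback. NULL means "no PCS needed
 * or none available"; a real design would distinguish the two.
 */
const struct demo_pcs *demo_select_pcs(const struct demo_mac *mac,
				       enum demo_iface mode)
{
	if (mac->native & DEMO_BIT(mode))
		return NULL;

	for (size_t i = 0; i < DEMO_MAX_PCS; i++)
		if (mac->pcs[i] && (mac->pcs[i]->supported & DEMO_BIT(mode)))
			return mac->pcs[i];
	return NULL;
}
```

With only the flattened mask, the second function cannot be written in
generic code, which is why every MAC driver ends up carrying its own
selection logic.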

I feel that even if the wrapper solution gets accepted, phylink requires a
big overhaul of PCS handling. (And Russell more or less already started
it by filling in some of the conditions where select_pcs fails when the
interface changes.)

Things are getting complex enough that in some scenarios the PCS
might fail calibration or might "explode" after a while, and phylink
is currently not designed for that.

It's also worth considering that for a 1 gigabit connection it's possible
that something will fall back from usxgmii to sgmii in this extreme case,
and I feel phylink should be able to handle that smoothly.
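
As a rough sketch of the kind of decision I mean, again in plain
userspace C with made-up names (sketch_link, sketch_next_iface), and
assuming the core is told both that the PCS failed and that sgmii is
available as an alternative:

```c
#include <stdbool.h>

/* Hypothetical interface modes for this sketch only. */
enum sketch_iface {
	SKETCH_IFACE_NONE,
	SKETCH_IFACE_SGMII,
	SKETCH_IFACE_USXGMII,
};

/* Hypothetical snapshot of what the core knows about the link. */
struct sketch_link {
	enum sketch_iface iface;	/* interface currently in use */
	unsigned int speed_mbps;	/* negotiated link speed */
	bool pcs_ok;			/* false if the PCS failed, e.g. calibration */
	bool sgmii_supported;		/* an sgmii-capable path exists */
};

/*
 * Decide which interface to use next. A healthy PCS keeps the current
 * interface. A failed usxgmii PCS falls back to sgmii only when the
 * link runs at 1G or less and sgmii is available; otherwise the link
 * has to go down (SKETCH_IFACE_NONE).
 */
enum sketch_iface sketch_next_iface(const struct sketch_link *link)
{
	if (link->pcs_ok)
		return link->iface;

	if (link->iface == SKETCH_IFACE_USXGMII &&
	    link->speed_mbps <= 1000 && link->sgmii_supported)
		return SKETCH_IFACE_SGMII;

	return SKETCH_IFACE_NONE;
}
```

The point is only that such a decision needs information the core does
not have today; the actual policy would be up for discussion.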

This is really just to give some context, hoping it gets some traction,
on why we really need to start fixing the problem and putting effort
into it. (My opinion is that it will only get worse; I'm scared to see
the complexity of things when 10g+ stuff reaches the consumer or
prosumer market.)

> > > phylink-managed standalone PCS drivers (Ansuel)
> > >
> > > pros
> > > ====
> > >  * trivial resource management
> >
> > Actually, I would say the resource management is much more complex and
> > difficult to follow due to being spread out over many different
> > functions.
> >
> > >  * no wrappers needed
> > >  * full support for hot-add and deferred probe
> > >  * avoids code duplication by providing generic select_pcs
> > >    implementation
> > >  * supports devices which provide more than one PCS port per device
> > >    ('#pcs-cell-cells')
> > >
> > > cons
> > > ====
> > >  * inclusion in phylink means more (dead) code on platforms not using
> > >    dedicated PCS
> > >  * series does not provide migrations for existing drivers
> > >    (but that can be done after)
> > >  * probably a bit harder to review as one needs to know phylink very well
> > >
> > >
> > > It would be great if more people can take a look and help deciding the
> > > general direction to go.
> >
> > I also encourage netdev maintainers to have a look; Russell does not
> > seem to have the time to review either system.
> >
> > > There are many drivers awaiting merge which require such
> > > infrastructure (most are fine with either of the two), some for more
> > > than a year by now.
> >
> > This is the major thing. PCS drivers should have been supported from the
> > start of phylink, and the longer there is no solution the more legacy
> > code there is to migrate.
>
> This seems to be something we can all agree on :)
