Message-ID: <c84518eb-15da-4356-ac6a-b2fcb807d92f@linux.dev>
Date: Thu, 10 Jul 2025 18:50:16 -0400
From: Sean Anderson <sean.anderson@...ux.dev>
To: Simon Horman <horms@...nel.org>
Cc: Daniel Golle <daniel@...rotopia.org>, netdev@...r.kernel.org,
Andrew Lunn <andrew+netdev@...n.ch>, "David S . Miller"
<davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Maxime Chevallier <maxime.chevallier@...tlin.com>,
Russell King <linux@...linux.org.uk>,
Vineeth Karumanchi <vineeth.karumanchi@....com>,
Heiner Kallweit <hkallweit1@...il.com>, linux-kernel@...r.kernel.org,
Kory Maincent <kory.maincent@...tlin.com>,
Christian Marangi <ansuelsmth@...il.com>, Lei Wei <quic_leiwei@...cinc.com>,
Michal Simek <michal.simek@....com>,
Radhey Shyam Pandey <radhey.shyam.pandey@....com>,
Robert Hancock <robert.hancock@...ian.com>, John Crispin <john@...ozen.org>,
Felix Fietkau <nbd@....name>, Robert Marko <robimarko@...il.com>
Subject: Re: [RFC] comparing the proposed implementation for standalone PCS
drivers
On 7/9/25 09:52, Simon Horman wrote:
> On Fri, Jun 13, 2025 at 12:06:23PM -0400, Sean Anderson wrote:
>> On 6/13/25 08:55, Daniel Golle wrote:
>> > Hi netdev folks,
>> >
>> > there are currently 2 competing implementations for the groundworks to
>> > support standalone PCS drivers.
>> >
>> > https://patchwork.kernel.org/project/netdevbpf/list/?series=970582&state=%2A&archive=both
>> >
>> > https://patchwork.kernel.org/project/netdevbpf/list/?series=961784&state=%2A&archive=both
>> >
>> > They both kinda stalled due to a lack of feedback in the past 2 months
>> > since they have been published.
>> >
>> > Merging the 2 implementations is not a viable option due to rather large
>> > architecture differences:
>> >
>> >                                 | Sean                  | Ansuel
>> > --------------------------------+-----------------------+-----------------------
>> > Architecture                    | Standalone subsystem  | Built into phylink
>> > Need OPs wrapped                | Yes                   | No
>> > resource lifecycle              | New subsystem         | phylink
>> > Supports hot remove             | Yes                   | Yes
>> > Supports hot add                | Yes (*)               | Yes
>> > provides generic select_pcs     | No                    | Yes
>> > support for #pcs-cell-cells     | No                    | Yes
>> > allows migrating legacy drivers | Yes                   | Yes
>> > comes with tested migrations    | Yes                   | No
>> >
>> > (*) requires MAC driver to also unload and subsequent re-probe for link
>> > to work again
>> >
>> > Obviously both architectures have pros and cons, here an incomplete and
>> > certainly biased list (please help completing it and discussing all
>> > details):
>> >
>> > Standalone Subsystem (Sean)
>> >
>> > pros
>> > ====
>> > * phylink code (mostly) untouched
>> > * doesn't burden systems which don't use dedicated PCS drivers
>> > * series provides tested migrations for all Ethernet drivers currently
>> > using dedicated PCS drivers
>> >
>> > cons
>> > ====
>> > * needs wrapper for each PCS OP
>> > * more complex resource management (malloc/free)
>> > * hot add and PCS showing up late (eg. due to deferred probe) are
>> > problematic
>> > * phylink is anyway the only user of that new subsystem
>>
>> I mean, if you want I can move the whole thing to live in phylink.c, but
>> that just enlarges the kernel if PCSs are not being used. The reverse
>> criticism can be made for Ansuel's series: most phylink users do not
>> have "dynamic" PCSs but the code is intimately integrated with phylink
>> anyway.
>
> At the risk of stating the obvious it seems to me that a key decision
> that needs to be made is whether a new subsystem is the correct direction.
>
> If I understand things correctly it seems that not creating a new subsystem
> is likely to lead to a simpler implementation, at least in the near term.

It's really more of an unusual PCS driver with some routines for
registering and looking up devices. I would like to note that Ansuel's
approach has those same registration and lookup functions.

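To make "registering and looking up devices" concrete, here is a minimal
userspace sketch of such a registry. Everything in it is hypothetical
(pcs_dev, pcs_table, pcs_register(), pcs_unregister(), pcs_lookup(), and
the use of a name string as the lookup key); it is not taken from either
series, and it leaves out the locking and reference counting a real
implementation would need.

```c
/* Userspace sketch only: no locking, fixed-size table, hypothetical names. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PCS_MAX 8

struct pcs_dev {
	const char *name;	/* stand-in for a firmware node reference */
};

static struct pcs_dev *pcs_table[PCS_MAX];

/* Returns 0 on success, -1 if the table is full. */
int pcs_register(struct pcs_dev *pcs)
{
	assert(pcs && pcs->name);
	for (size_t i = 0; i < PCS_MAX; i++) {
		if (!pcs_table[i]) {
			pcs_table[i] = pcs;
			return 0;
		}
	}
	return -1;
}

/* Drops the PCS from the table; later lookups no longer find it. */
void pcs_unregister(struct pcs_dev *pcs)
{
	for (size_t i = 0; i < PCS_MAX; i++)
		if (pcs_table[i] == pcs)
			pcs_table[i] = NULL;
}

/* Returns the registered PCS with this name, or NULL if none is present. */
struct pcs_dev *pcs_lookup(const char *name)
{
	for (size_t i = 0; i < PCS_MAX; i++)
		if (pcs_table[i] && strcmp(pcs_table[i]->name, name) == 0)
			return pcs_table[i];
	return NULL;
}
```

In this sketch a NULL result from pcs_lookup() simply means the PCS has
not been registered yet, which is the situation a consumer would have to
handle by retrying later.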
> While doing so lends itself towards greater flexibility in terms of users,
> I'd suggest a cleaner abstraction layer, and possibly a smaller footprint
> (I assume space consumed by unused code) for cases where PCS is not used.

I think the greatest strength of my implementation is its clean
interface. The rest of phylink doesn't know or care whether the PCS is a
traditional one (tied to the lifetime of the netdev) or whether it is
dynamically looked up.

> On the last point, I do wonder if there are other approaches to managing
> the footprint. And if so, that may tip the balance towards a new subsystem.
>
>
> Another way of framing this is: Say, hypothetically, Sean was to move his
> implementation into phylink.c. Then we might be able to have a clearer
> discussion of the merits of each implementation. Possibly driving towards
> common ground. But it seems hard to do so if we're unsure if there should
> be a new subsystem or not.

I really think it's just cosmetic. For example, in my implementation we have

	/* pcs/core.c */
	static void pcs_get_state(struct phylink_pcs *pcs, unsigned int neg_mode,
				  struct phylink_link_state *state)
	{
		struct pcs_wrapper *wrapper = pcs_to_wrapper(pcs);
		struct phylink_pcs *wrapped;

		guard(srcu)(&pcs_srcu);
		wrapped = srcu_dereference(wrapper->wrapped, &pcs_srcu);
		if (wrapped)
			wrapped->ops->pcs_get_state(wrapped, neg_mode, state);
		else
			state->link = 0;
	}

	/* phylink.c */
	static void phylink_mac_pcs_get_state(struct phylink *pl,
					      struct phylink_link_state *state)
	{
		struct phylink_pcs *pcs;

		/* ... snip ... */

		pcs = pl->pcs;
		if (pcs)
			pcs->ops->pcs_get_state(pcs, pl->pcs_neg_mode, state);
		else
			state->link = 0;
	}

and that would turn into

	/* phylink.c */
	static void phylink_mac_pcs_get_state(struct phylink *pl,
					      struct phylink_link_state *state)
	{
		/* Derive the wrapper from pl->pcs; the local pcs is not set yet */
		struct pcs_wrapper *wrapper = pcs_to_wrapper(pl->pcs);
		struct phylink_pcs *pcs;

		/* ... snip ... */

		guard(srcu)(&pcs_srcu);
		if (pl->pcs->ops == &pcs_wrapper_ops)
			pcs = srcu_dereference(wrapper->wrapped, &pcs_srcu);
		else
			pcs = pl->pcs;

		if (pcs)
			pcs->ops->pcs_get_state(pcs, pl->pcs_neg_mode, state);
		else
			state->link = 0;
	}

and TBH I like the former much better since we avoid special-casing the
wrapper stuff. We still have to do the wrapper stuff because the MAC
owns the PCS and we can't prevent it from passing phylink a stale PCS
pointer. Now, we could make phylink own the PCS, but that means going
with Ansuel's approach. And the main problem with phylink owning the PCS
is that it complicates lookup for existing MACs that need to accommodate
a variety of nonstandard ways of looking up a PCS for
backwards-compatibility. The only real way to do it is something like

	/* In mac_probe() or whatever */
	scoped_guard(mutex, &pcs_remove_lock) {
		/* Just imagine some terrible contortions for compatibility here */
		struct phylink_pcs *pcs = pcs_get(dev, "my_pcs");

		if (IS_ERR(pcs))
			return PTR_ERR(pcs);

		list_add(&pcs->list, &config.pcs_list);
		/* phylink_create() returns a pointer, so check it with IS_ERR() */
		pl = phylink_create(&config, dev->fwnode, interface,
				    &mac_phylink_ops);
		if (IS_ERR(pl))
			return PTR_ERR(pl);
	}
	/* At this point the PCS could have already been removed */

but even then the MAC has no idea how to mux the correct PCS. If you
have more than one dynamically-looked-up PCS they can't be
differentiated because they are both opaque pointers that may point to
stale memory at any time.

This is why I favor a wrapper approach: we can allocate some
memory that's tied to the lifetime of the MAC rather than the lifetime
of the PCS. Then we don't have to worry about whether the PCS is still
valid and we can get on with our lives.

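As a rough illustration of the lifetime argument, here is a userspace
sketch of the indirection. It is not the code from the series: the types
(struct pcs, struct pcs_ops, struct link_state) and the helpers
(wrapper_init(), wrapper_detach(), real_pcs_init()) are made up for the
example, and a C11 atomic pointer stands in for SRCU. In particular it
has no grace period, so it only shows how the consumer keeps a valid
pointer, not how readers are protected while the real PCS goes away.

```c
/* Userspace sketch only: C11 atomics stand in for the kernel's SRCU. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct link_state {
	bool link;
};

struct pcs;

struct pcs_ops {
	void (*get_state)(struct pcs *pcs, struct link_state *state);
};

struct pcs {
	const struct pcs_ops *ops;
};

/* Hypothetical wrapper: owned by the MAC, so it outlives any real PCS. */
struct pcs_wrapper {
	struct pcs pcs;			/* what the MAC hands to its consumer */
	_Atomic(struct pcs *) wrapped;	/* real PCS, or NULL once removed */
};

static void wrapper_get_state(struct pcs *pcs, struct link_state *state)
{
	/* pcs is the first member, so this cast recovers the wrapper. */
	struct pcs_wrapper *w = (struct pcs_wrapper *)pcs;
	struct pcs *wrapped = atomic_load(&w->wrapped);

	if (wrapped)
		wrapped->ops->get_state(wrapped, state);
	else
		state->link = false;	/* PCS is gone: report link down */
}

static const struct pcs_ops wrapper_ops = { .get_state = wrapper_get_state };

void wrapper_init(struct pcs_wrapper *w, struct pcs *real)
{
	assert(w);
	w->pcs.ops = &wrapper_ops;
	atomic_init(&w->wrapped, real);
}

/* Called when the real PCS driver unbinds. */
void wrapper_detach(struct pcs_wrapper *w)
{
	assert(w);
	atomic_store(&w->wrapped, (struct pcs *)NULL);
}

/* A trivial "real" PCS that always reports link up. */
static void real_get_state(struct pcs *pcs, struct link_state *state)
{
	(void)pcs;
	state->link = true;
}

static const struct pcs_ops real_ops = { .get_state = real_get_state };

void real_pcs_init(struct pcs *pcs)
{
	assert(pcs);
	pcs->ops = &real_ops;
}
```

The consumer only ever holds &w.pcs, which stays valid for as long as
the MAC keeps the wrapper around. After wrapper_detach() the same
pointer still works and simply reports link down.
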
--Sean
>> > phylink-managed standalone PCS drivers (Ansuel)
>> >
>> > pros
>> > ====
>> > * trivial resource management
>>
>> Actually, I would say the resource management is much more complex and
>> difficult to follow due to being spread out over many different
>> functions.
>>
>> > * no wrappers needed
>> > * full support for hot-add and deferred probe
>> > * avoids code duplication by providing generic select_pcs
>> > implementation
>> > * supports devices which provide more than one PCS port per device
>> > ('#pcs-cell-cells')
>> >
>> > cons
>> > ====
>> > * inclusion in phylink means more (dead) code on platforms not using
>> > dedicated PCS
>> > * series does not provide migrations for existing drivers
>> > (but that can be done after)
>> > * probably a bit harder to review as one needs to know phylink very well
>> >
>> >
>> > It would be great if more people can take a look and help deciding the
>> > general direction to go.
>>
>> I also encourage netdev maintainers to have a look; Russell does not
>> seem to have the time to review either system.
>>
>> > There are many drivers awaiting merge which require such
>> > infrastructure (most are fine with either of the two), some for more
>> > than a year by now.
>>
>> This is the major thing. PCS drivers should have been supported from the
>> start of phylink, and the longer there is no solution the more legacy
>> code there is to migrate.
>
> This seems to be something we can all agree on :)