netdev - Re: [PATCH net 4/4] docs: net: dsa: replace TODO section with info about history and devel ideas

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20231209015805.zilrf5wrtlccixyw@skbuf>
Date: Sat, 9 Dec 2023 03:58:05 +0200
From: Vladimir Oltean <vladimir.oltean@....com>
To: Florian Fainelli <f.fainelli@...il.com>
Cc: netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Andrew Lunn <andrew@...n.ch>,
	Luiz Angelo Daros de Luca <luizluca@...il.com>,
	Alvin ┼áipraga <alsi@...g-olufsen.dk>,
	Madhuri Sripada <madhuri.sripada@...rochip.com>,
	Marcin Wojtas <mw@...ihalf.com>,
	Linus Walleij <linus.walleij@...aro.org>,
	Tobias Waldekranz <tobias@...dekranz.com>,
	Arun Ramadoss <arun.ramadoss@...rochip.com>,
	"Russell King (Oracle)" <rmk+kernel@...linux.org.uk>,
	Jonathan Corbet <corbet@....net>
Subject: Re: [PATCH net 4/4] docs: net: dsa: replace TODO section with info
 about history and devel ideas

On Fri, Dec 08, 2023 at 03:03:35PM -0800, Florian Fainelli wrote:
> > +Probing through ``platform_data`` remains limited in functionality. The
> > +``ds->dev->of_node`` and ``dp->dn`` pointers are NULL, and the OF API calls
> > +made by drivers for discovering more complex setups fall back to the implicit
> > +handling. There is no way to describe multi-chip trees, or switches with
> > +multiple CPU ports. It is always assumed that shared ports are configured by
> > +the driver to the maximum supported link speed (they do not use phylink).
> > +User ports cannot connect to arbitrary PHYs, but are limited to
> > +``ds->user_mii_bus``.
> 
> Maybe a mention here that this implies built-in/internal PHY devices only,
> just as a way to re-iterate the limitation and to echo to the previous
> patch?

I am not fully convinced that saying "user_mii_bus can only access internal PHYs"
only would be correct. This paragraph also exists in the user MDIO bus section:

For Ethernet switches which have both external and internal MDIO buses, the
user MII bus can be utilized to mux/demux MDIO reads and writes towards either
internal or external MDIO devices this switch might be connected to: internal
PHYs, external PHYs, or even external switches.

> > +
> > +Many switch drivers introduced since after DSA's second OF binding were not
> > +designed to support probing through ``platform_data``. Most notably,
> > +``device_get_match_data()`` and ``of_device_get_match_data()`` return NULL with
> > +``platform_data``, so generally, drivers which do not have alternative
> > +mechanisms for this do not support ``platform_data``.
> > +
> > +Extending the ``platform_data`` support implies adding more separate code.
> > +An alternative worth exploring is growing DSA towards the ``fwnode`` API.
> > +However, not the entire OF binding should be generalized to ``fwnode``.
> > +The current bindings must be examined with a critical eye, and the properties
> > +which are no longer considered good practice (like ``label``, because ``udev``
> > +offers this functionality) should first be deprecated in OF, and not migrated
> > +to ``fwnode``.
> > +
> > +With ``fwnode`` support in the DSA framework, the ``fwnode_create_software_node()``
> > +API could be used as an alternative to ``platform_data``, to allow describing
> > +and probing switches on non-OF.
> 
> Might suggest to move the 3 paragraphs towards the end because otherwise it
> might be a distraction for the reader who might think: ah that's it? no more
> technical details!? Looks like Linus made the same suggestion in his review.

I think it needs even more rethinking than that. I now remembered that
we also have a "Design limitations" section where the future work can go.

It's hard to navigate through what is now a 1400 line document and not
get lost.

I'm hoping I could move the documentation of variables and methods that
now sits in "Driver development" into kdoc comments inline with the code,
to reduce the clutter a bit.

But I don't know how to tackle this. Should documentation changes go to
"net" or to "net-next"? I targeted this for "net" as a documentation-only
change set. But if I start adding kdocs, it won't be so clear-cut anymore...

> > +Simplifying the device tree bindings to require a single ``link`` phandle
> > +cannot be done without rethinking the distributed probing scheme. One idea is
> > +to reinstate the switch tree as a platform device, but this time created
> > +dynamically by ``dsa_register_switch()`` if the switch's tree ID is not already
> > +present in the system. The switch tree driver walks the device tree hop by hop,
> > +following the ``link`` references, to discover all the other switches, and to
> > +construct the full routing table. It then uses the component API to register
> > +itself as an aggregate driver, with each of the discovered switches as a
> > +component. When ``dsa_register_switch()`` completes for all component switches,
> > +the tree probing continues and calls ``dsa_tree_setup()``.
> 
> An interesting paragraph, but I am not sure this is such a big pain point
> that we should be spending much description of it, especially since it
> sounds like this is solved, but could be improved, but in the grand scheme
> of things, should we really be spending any time on it?
> 
> Same-vendor cascade configurations are Marvell specific, and different
> vendor cascades require distinct switch trees, therefore do not really fall
> into the cross-chip design anymore. In a nutshell, cross-chip is very very
> niche and limited.

Well, I've been contacted by somebody to help with a custom board with 3
daisy chained SJA1105 switches. He is doing the testing for me, and I'm
waiting for the results to come back. I'm currently waiting for an uprev
to an NXP BSP on top of 5.15 to be finalized, so that patches developed
over net-next are at least barely testable...

If you remember, the SJA1105 has these one-shot management routes which
must be installed over SPI, and they decide where the next transmitted
link-local packet goes.

Well, the driver only supports single-chip trees, as you say, because it
only programs the management route in the targeted switch. With daisy
chained switches, one needs to figure out the actual packet route from
the CPU to the leaf user port, and install a management route for every
switch along that route. Otherwise, intermediary switches won't know
what to do with the packets, and drop them.

The specific request was: "help, PTP doesn't work!"

I did solve the problem, and the documentation paragraphs above are
basically my development notes while examining the existing support
and the way in which it isn't giving me the tools I need.

I do need to send a dt-bindings patch on this topic as well. The fact
that we put all cascade links in the device tree means we don't know
which one represents the direct connection to the neighbor cascade port,
and which one is an indirect connection. We need to bake in an
assumption, like for example "the first element of the 'link' phandle
array is the direct connection". I hope retroactively doing that won't
bother the device tree maintainers too much. If it does, the problem is
intractable.

But I agree, requirements for cross-chip support are rare, and with
SJA1105 I don't even own a board where I can directly test it. I
specifically bought the Turris MOX for that.

> > +To untangle this situation and improve the reliability of the cross-chip
> > +management layer, it is necessary to split the DSA operations into ones which
> > +can fail, and ones which cannot fail. For example, ``port_fdb_add()`` can fail,
> > +whereas ``port_fdb_del()`` cannot. Then, cross-chip operations can take a
> > +fallible function to make forward progress, and an infallible function for
> > +rollback. However, it is unclear what to do in the case of ``change_mtu()``.
> > +It is hard to classify this operation as either fallible or infallible. It is
> > +also unclear how to deal with I/O access errors on the switch's management bus.
> 
> How about something like this:
> 
> I/O access errors occurring during the switch configuration should always be
> logged for debugging but are very unlikely to be recoverable and therefore
> require an investigation into the failure mechanism and root cause or
> possible work around.

Yeah, I suppose.

What do you think, has something like phy_error() been a useful mechanism
for anything? Or just a pain in the rear? Would it be useful to shut
everything down on a bus I/O error, phylib style?