Message-ID: <20210905084518.emlagw76qmo44rpw@skbuf>
Date: Sun, 5 Sep 2021 11:45:18 +0300
From: Vladimir Oltean <olteanv@...il.com>
To: Leon Romanovsky <leon@...nel.org>
Cc: Vladimir Oltean <vladimir.oltean@....com>, netdev@...r.kernel.org,
Andrew Lunn <andrew@...n.ch>,
Vivien Didelot <vivien.didelot@...il.com>,
Florian Fainelli <f.fainelli@...il.com>
Subject: Re: [RFC PATCH net] net: dsa: tear down devlink port regions when
tearing down the devlink port on error
On Sun, Sep 05, 2021 at 10:07:45AM +0300, Leon Romanovsky wrote:
> On Fri, Sep 03, 2021 at 02:17:38AM +0300, Vladimir Oltean wrote:
> > Commit 86f8b1c01a0a ("net: dsa: Do not make user port errors fatal")
> > decided it was fine to ignore errors on certain ports that fail to
> > probe, and go on with the ports that do probe fine.
> >
> > Commit fb6ec87f7229 ("net: dsa: Fix type was not set for devlink port")
> > noticed that devlink_port_type_eth_set(dlp, dp->slave); does not get
> > called, and devlink notices after a timeout of 3600 seconds and prints a
> > WARN_ON. So it went ahead to unregister the devlink port. And because
> > there exists an UNUSED port flavour, we actually re-register the devlink
> > port as UNUSED.
> >
> > Commit 08156ba430b4 ("net: dsa: Add devlink port regions support to
> > DSA") added devlink port regions, which are set up by the driver and not
> > by DSA.
> >
> > When we trigger the devlink port deregistration and reregistration as
> > unused, devlink now prints another WARN_ON, from here:
> >
> > devlink_port_unregister:
> > WARN_ON(!list_empty(&devlink_port->region_list));
> >
> > So the port still has regions, which makes sense, because they were set
> > up by the driver, and the driver doesn't know we're unregistering the
> > devlink port.
> >
> > Somebody needs to tear them down, and optionally (actually it would be
> > nice, to be consistent) set them up again for the new devlink port.
> >
> > But DSA's layering stays in our way quite badly here.
>
> I don't know anything about DSA

It is sufficient to know in this case that it is a multi-port networking
driver.

> and what led to the decision to ignore devlink registration errors,

But we are not ignoring devlink registration errors...

The devlink_port must be initialized prior to initializing the net_device.
Initializing a certain net_device may fail due to reasons such as "PHY
not found". It is desirable in certain cases for a net_device
initialization failure to not fail the entire switch probe.

So at the very least, rollback of the registration of that port must be
performed before continuing => the devlink_port needs to be unregistered
when the net_device initialization has failed.
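To illustrate the ordering described above, here is a minimal userspace
sketch. It is not the actual DSA code: every mock_* name and the netdev_ok
flag are hypothetical stand-ins, and the "re-register as unused" step
models the behaviour described in the commit message quoted earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the kernel's types. */
enum mock_flavour { MOCK_FLAVOUR_USER, MOCK_FLAVOUR_UNUSED };

struct mock_port {
	enum mock_flavour flavour;
	bool devlink_registered;
	bool netdev_ok;		/* false simulates e.g. "PHY not found" */
	bool netdev_registered;
};

static void mock_devlink_port_register(struct mock_port *p,
				       enum mock_flavour flavour)
{
	assert(!p->devlink_registered);	/* no double registration */
	p->flavour = flavour;
	p->devlink_registered = true;
}

static void mock_devlink_port_unregister(struct mock_port *p)
{
	assert(p->devlink_registered);
	p->devlink_registered = false;
}

static int mock_netdev_create(struct mock_port *p)
{
	if (!p->netdev_ok)
		return -1;
	p->netdev_registered = true;
	return 0;
}

/* Returns how many ports came up as user ports. A failing port does not
 * fail the whole probe: its devlink port is rolled back and registered
 * again as unused, then the loop moves on to the next port.
 */
static int mock_switch_probe(struct mock_port *ports, size_t n)
{
	int up = 0;

	for (size_t i = 0; i < n; i++) {
		struct mock_port *p = &ports[i];

		/* The devlink port comes first, the net_device second. */
		mock_devlink_port_register(p, MOCK_FLAVOUR_USER);

		if (mock_netdev_create(p)) {
			mock_devlink_port_unregister(p);
			mock_devlink_port_register(p, MOCK_FLAVOUR_UNUSED);
			continue;
		}
		up++;
	}

	return up;
}
```

The unregister call in the error path is where any regions still
attached to the port become a problem.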
> but devlink core is relying on the simple assumption that everything
> is initialized correctly.
>
> So if DSA needs to have not-initialized port, it should do all the needed
> hacks internally.

So the current problem is that the DSA framework does not ask the hardware
driver whether it has devlink port regions which need to be torn down
before unregistering the devlink port.

I was expecting the feedback to be "we need to introduce new methods in
struct dsa_switch_ops which do .port_setup and .port_teardown, similar
to the already existing per-switch .setup and .teardown, and drivers
which set up devlink port regions should set these up from the port
methods, so that DSA can simply call those when it needs to tear down a
devlink port without tearing down the entire switch and devlink instance".
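A rough userspace sketch of what such per-port hooks could look like
follows. This is only an assumption about the shape of the idea, not a
proposed kernel patch: the mock_* and drv_* names are hypothetical, and
the nregions counter merely stands in for devlink_port->region_list.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_NUM_PORTS 4

struct mock_port {
	int nregions;	/* stands in for devlink_port->region_list */
	int registered;
};

struct mock_switch;

/* Hypothetical per-port hooks, by analogy with per-switch .setup/.teardown. */
struct mock_switch_ops {
	int (*port_setup)(struct mock_switch *sw, int port);
	void (*port_teardown)(struct mock_switch *sw, int port);
};

struct mock_switch {
	const struct mock_switch_ops *ops;
	struct mock_port ports[MOCK_NUM_PORTS];
};

/* Mimics the devlink check: returns 1 where the WARN_ON would have fired. */
static int mock_devlink_port_unregister(struct mock_port *p)
{
	int warned = p->nregions != 0;

	p->registered = 0;
	return warned;
}

/* Driver side: regions are created and destroyed from the port hooks. */
static int drv_port_setup(struct mock_switch *sw, int port)
{
	sw->ports[port].nregions++;
	return 0;
}

static void drv_port_teardown(struct mock_switch *sw, int port)
{
	sw->ports[port].nregions = 0;
}

static const struct mock_switch_ops drv_ops = {
	.port_setup = drv_port_setup,
	.port_teardown = drv_port_teardown,
};

/* Framework side: register the port, then let the driver add regions. */
static int mock_dsa_port_register(struct mock_switch *sw, int port)
{
	assert(port >= 0 && port < MOCK_NUM_PORTS);
	sw->ports[port].registered = 1;
	if (sw->ops->port_setup)
		return sw->ops->port_setup(sw, port);
	return 0;
}

/* Framework side: ask the driver to drop its regions before unregistering. */
static int mock_dsa_port_unregister(struct mock_switch *sw, int port)
{
	assert(port >= 0 && port < MOCK_NUM_PORTS);
	if (sw->ops->port_teardown)
		sw->ops->port_teardown(sw, port);
	return mock_devlink_port_unregister(&sw->ports[port]);
}
```

With hooks like these, the framework can tear down a single port without
knowing what the driver attached to it, and without touching the rest of
the switch or the devlink instance.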
I agree that the proposed patch is horrible, but not for the reasons you
might think.

Either way, "all the needed hacks" are already done internally, and from
devlink's perspective everything is initialized correctly, so I am not sure
what this comment is about. I am really not changing anything in DSA's
interaction with the devlink core, other than ensuring we do not
unregister a devlink port with regions on it.