Date:   Sun, 5 Sep 2021 13:31:25 +0300
From:   Vladimir Oltean <olteanv@...il.com>
To:     Leon Romanovsky <leon@...nel.org>
Cc:     Vladimir Oltean <vladimir.oltean@....com>, netdev@...r.kernel.org,
        Andrew Lunn <andrew@...n.ch>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Florian Fainelli <f.fainelli@...il.com>
Subject: Re: [RFC PATCH net] net: dsa: tear down devlink port regions when
 tearing down the devlink port on error

On Sun, Sep 05, 2021 at 01:25:03PM +0300, Leon Romanovsky wrote:
> On Sun, Sep 05, 2021 at 11:45:18AM +0300, Vladimir Oltean wrote:
> > On Sun, Sep 05, 2021 at 10:07:45AM +0300, Leon Romanovsky wrote:
> > > On Fri, Sep 03, 2021 at 02:17:38AM +0300, Vladimir Oltean wrote:
> > > > Commit 86f8b1c01a0a ("net: dsa: Do not make user port errors fatal")
> > > > decided it was fine to ignore errors on certain ports that fail to
> > > > probe, and go on with the ports that do probe fine.
> > > > 
> > > > Commit fb6ec87f7229 ("net: dsa: Fix type was not set for devlink port")
> > > > noticed that devlink_port_type_eth_set(dlp, dp->slave); does not get
> > > > called, and devlink notices after a timeout of 3600 seconds and prints a
> > > > WARN_ON. So it went ahead to unregister the devlink port. And because
> > > > there exists an UNUSED port flavour, we actually re-register the devlink
> > > > port as UNUSED.
> > > > 
> > > > Commit 08156ba430b4 ("net: dsa: Add devlink port regions support to
> > > > DSA") added devlink port regions, which are set up by the driver and not
> > > > by DSA.
> > > > 
> > > > When we trigger the devlink port deregistration and reregistration as
> > > > unused, devlink now prints another WARN_ON, from here:
> > > > 
> > > > devlink_port_unregister:
> > > > 	WARN_ON(!list_empty(&devlink_port->region_list));
> > > > 
> > > > So the port still has regions, which makes sense, because they were set
> > > > up by the driver, and the driver doesn't know we're unregistering the
> > > > devlink port.
> > > > 
> > > > Somebody needs to tear them down, and optionally (actually it would be
> > > > nice, to be consistent) set them up again for the new devlink port.
> > > > 
> > > > But DSA's layering stands in our way quite badly here.
> > > 
> > > I don't know anything about DSA
> > 
> > It is sufficient to know in this case that it is a multi-port networking
> > driver.
> > 
> > > and what led to the decision to ignore devlink registration errors,
> > 
> > But we are not ignoring devlink registration errors...
> > 
> > The devlink_port must be initialized prior to initializing the net_device.
> > 
> > Initializing a certain net_device may fail due to reasons such as "PHY
> > not found". It is desirable in certain cases for a net_device
> > initialization failure to not fail the entire switch probe.
> > 
> > So at the very least, rollback of the registration of that port must be
> > performed before continuing => the devlink_port needs to be unregistered
> > when the net_device initialization has failed.
> > 
> > > but devlink core is relying on the simple assumption that everything
> > > is initialized correctly.
> > > 
> > > So if DSA needs to have not-initialized port, it should do all the needed
> > > hacks internally.
> > 
> > So the current problem is that the DSA framework does not ask the hardware
> > driver whether it has devlink port regions which need to be torn down
> > before unregistering the devlink port.
> > 
> > I was expecting the feedback to be "we need to introduce new methods in
> > struct dsa_switch_ops which do .port_setup and .port_teardown, similar
> > to the already existing per-switch .setup and .teardown, and drivers
> > which set up devlink port regions should set these up from the port
> > methods, so that DSA can simply call those when it needs to tear down a
> > devlink port without tearing down the entire switch and devlink instance".
> > The proposed patch is horrible and I agree, but not for the reasons you
> > might think it is.
> > 
> > Either way, "all the needed hacks" are already done internally, and from
> > devlink's perspective everything is initialized correctly, not sure what
> > this comment is about. I am really not changing anything in DSA's
> > interaction with the devlink core, other than ensuring we do not
> > unregister a devlink port with regions on it.
> 
> That sentence means that your change is OK and you did it right by not
> changing devlink port to hold not-working ports.

You're with me so far.

There is a second part. The ports with 'status = "disabled"' in the
device tree still get devlink ports registered, but with the
DEVLINK_PORT_FLAVOUR_UNUSED flavour and no netdev. These devlink ports
still have things like port regions exported.

What we do for ports that have failed to probe is to reinitialize their
devlink ports as DEVLINK_PORT_FLAVOUR_UNUSED, together with their port
regions, so that they effectively behave as though they were disabled in
the device tree.
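
To make the ordering concrete, here is a minimal userspace sketch of the
sequence described above. It is a toy model, not kernel code: every toy_*
name is hypothetical, the region list is reduced to a counter, and the
would-be WARN_ON is reduced to a warning count. It only illustrates that the
regions have to be torn down before the port is unregistered, and can be set
up again once the port is re-registered as unused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
enum toy_flavour { TOY_FLAVOUR_PHYSICAL, TOY_FLAVOUR_UNUSED };

struct toy_port {
	bool registered;
	enum toy_flavour flavour;
	int nr_regions;	/* stands in for devlink_port->region_list */
	int warnings;	/* counts hits of the would-be WARN_ON() */
};

static void toy_port_register(struct toy_port *p, enum toy_flavour flavour)
{
	assert(p != NULL && !p->registered);
	p->registered = true;
	p->flavour = flavour;
}

/* Models the check quoted earlier: unregistering with regions left warns. */
static void toy_port_unregister(struct toy_port *p)
{
	assert(p != NULL && p->registered);
	if (p->nr_regions != 0)
		p->warnings++;
	p->registered = false;
}

/* Hypothetical per-port driver hooks, in the spirit of the suggested
 * .port_setup / .port_teardown methods. The region count is arbitrary. */
static void toy_driver_port_setup(struct toy_port *p)
{
	p->nr_regions = 2;
}

static void toy_driver_port_teardown(struct toy_port *p)
{
	p->nr_regions = 0;
}

/* Error path after the net_device failed to initialize: drop the regions
 * first, swap the port for an unused one, then bring the regions back. */
static void toy_port_reinit_as_unused(struct toy_port *p)
{
	toy_driver_port_teardown(p);
	toy_port_unregister(p);
	toy_port_register(p, TOY_FLAVOUR_UNUSED);
	toy_driver_port_setup(p);
}
```

Calling toy_port_unregister() directly on a port that still has regions
bumps the warning count, which is the situation the patch is trying to
avoid. Going through toy_port_reinit_as_unused() leaves the count at zero
and the port ends up unused with its regions in place.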
