lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <98767929-b401-402b-8e6b-d997cf27bfb0@lunn.ch>
Date:   Sat, 11 Mar 2023 16:19:43 +0100
From:   Andrew Lunn <andrew@...n.ch>
To:     Klaus Kudielka <klaus.kudielka@...il.com>
Cc:     Florian Fainelli <f.fainelli@...il.com>,
        Vladimir Oltean <olteanv@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Richard Cochran <richardcochran@...il.com>,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] net: dsa: mv88e6xxx: move call to
 mv88e6xxx_mdios_register()

On Sat, Mar 11, 2023 at 10:41:41AM +0100, Klaus Kudielka wrote:
> >From commit 1a136ca2e089 ("net: mdio: scan bus based on bus capabilities
> for C22 and C45") onwards, mdiobus_scan_bus_c45() is being called on buses
> with MDIOBUS_NO_CAP. On a Turris Omnia (Armada 385, 88E6176 switch), this
> causes a significant increase of boot time, from 1.6 seconds, to 6.3
> seconds. The boot time stated here is until start of /init.
> 
> Further testing revealed that the C45 scan is indeed expensive (around
> 2.7 seconds, due to a huge number of bus transactions), and called twice.
> 
> It was suggested, to call mv88e6xxx_mdios_register() at the beginning of
> mv88e6xxx_setup(), and mv88e6xxx_mdios_unregister() at the end of
> mv88e6xxx_teardown(). This is accomplished by this patch.
> 
> Testing on the Turris Omnia revealed, that this improves the situation.
> Now mdiobus_scan_bus_c45() is called only once, ending up in a boot time
> of 4.3 seconds.

For those who are interested, here is a bit of background on why this
change reduces the number of bus scans.

The MAC driver probes, which i think in this case is mvneta. Part way
through its probe, it registers its MDIO bus. That triggers a scan of
its bus, and the switch is found. The mv88e6xxx driver is then loaded
and its probe function called. towards the end of the mv88e6xxxx probe
function, it registers its MDIO bus. That causes a scan of the
switches MDIO bus. Which is slow. After the scan completes, the
mv88e6xxx probe continues, and registers the switch with DSA core. The
core then parses the DT binding for the switch and looks for the
master ethernet interface. That is the interface which mvneta
provides. But mvneta is still only part way through its probe. It has
not yet registered its interface with the netdev core. So the DSA core
fails to find it and return EPROBE_DEFER. This causes the mv88e6xxx
driver to unwind its probe. The mvneat then gets a chance to finish
its probe and register its netdev. Some timer later, the driver core
runs the probes again for those drivers which returned EPROBE_DEFER,
mv88e6xxx registers its MDIO bus again, another scan is performed, the
switch is registered with the code, and this time the master device is
available, so things continue. The DSA core then calls the drivers
.setup() callback to get the switch into a usable state.

I think what remains in the probe function is cheap, so it can
probable stay there and be done twice. But it might be worth putting
in a few printks to get some time stamps and see if anything is
expensive.

>  static int mv88e6xxx_setup(struct dsa_switch *ds)
> @@ -3889,6 +3892,10 @@ static int mv88e6xxx_setup(struct dsa_switch *ds)
>  	int err;
>  	int i;
>  
> +	err = mv88e6xxx_mdios_register(chip);
> +	if (err)
> +		return err;
> +
>  	chip->ds = ds;
>  	ds->slave_mii_bus = mv88e6xxx_default_mdio_bus(chip);

Other calls in mv88e6xxx_setup() can fail, so you need to extend the
cleanup to remove the mdio bus on failure.

	Andrew

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ