Message-ID: <YsyPGMOiIGktUlqD@shell.armlinux.org.uk>
Date: Mon, 11 Jul 2022 21:59:04 +0100
From: "Russell King (Oracle)" <linux@...linux.org.uk>
To: Sean Anderson <sean.anderson@...o.com>
Cc: Heiner Kallweit <hkallweit1@...il.com>, netdev@...r.kernel.org,
Jakub Kicinski <kuba@...nel.org>,
Madalin Bucur <madalin.bucur@....com>,
"David S . Miller" <davem@...emloft.net>,
Paolo Abeni <pabeni@...hat.com>,
Ioana Ciornei <ioana.ciornei@....com>,
linux-kernel@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>,
Andrew Lunn <andrew@...n.ch>,
Frank Rowand <frowand.list@...il.com>,
Rob Herring <robh+dt@...nel.org>,
Saravana Kannan <saravanak@...gle.com>,
devicetree@...r.kernel.org
Subject: Re: [RFC PATCH net-next 3/9] net: pcs: Add helpers for registering
and finding PCSs
Hi Sean,
It's a good attempt and may be nice to have, but I'm afraid the
implementation has a flaw in the lifetime of its data structures,
which always becomes a problem when multiple devices are used in
aggregate.
On Mon, Jul 11, 2022 at 12:05:13PM -0400, Sean Anderson wrote:
> +/**
> + * pcs_get_tail() - Finish getting a PCS
> + * @pcs: The PCS to get, or %NULL if one could not be found
> + *
> + * This performs common operations necessary when getting a PCS (chiefly
> + * incrementing reference counts)
> + *
> + * Return: @pcs, or an error pointer on failure
> + */
> +static struct phylink_pcs *pcs_get_tail(struct phylink_pcs *pcs)
> +{
> +	if (!pcs)
> +		return ERR_PTR(-EPROBE_DEFER);
> +
> +	if (!try_module_get(pcs->ops->owner))
> +		return ERR_PTR(-ENODEV);
What you're trying to prevent here is the PCS going away - but holding a
reference to the module doesn't prevent that with the driver model. The
driver model design is such that a device can be unbound from its driver
at any moment. Taking a reference to the module doesn't prevent that,
all it does is ensure that the user can't remove the module. It doesn't
mean that the "pcs" structure will remain allocated.
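The distinction above can be illustrated with a small userspace model. This is only a sketch, not kernel code: every name in it (model_module, model_device, model_bind and so on) is hypothetical, and it assumes the "pcs" is driver-private data allocated at bind time and freed at unbind time.

```c
/* Userspace model only: not kernel code. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct model_pcs {
	int link_up;
};

struct model_module {
	int refcnt;
};

struct model_device {
	struct model_module *owner;	/* module providing the driver */
	struct model_pcs *pcs;		/* driver-private, allocated at bind */
};

/* Models try_module_get(): pins the module and nothing else. */
static bool model_module_get(struct model_module *mod)
{
	mod->refcnt++;
	return true;
}

static void model_module_put(struct model_module *mod)
{
	assert(mod->refcnt > 0);
	mod->refcnt--;
}

/* Models module removal: refused while a module reference is held. */
static bool model_module_can_unload(const struct model_module *mod)
{
	return mod->refcnt == 0;
}

/* Models driver probe: the driver allocates its private pcs. */
static int model_bind(struct model_device *dev)
{
	dev->pcs = calloc(1, sizeof(*dev->pcs));
	return dev->pcs ? 0 : -1;
}

/* Models unbind: it never consults the module reference count. */
static void model_unbind(struct model_device *dev)
{
	free(dev->pcs);
	dev->pcs = NULL;
}

/* Returns true if the pcs is gone while the module is still pinned. */
static bool model_demo(void)
{
	struct model_module mod = { 0 };
	struct model_device dev = { .owner = &mod };
	bool pinned, gone;

	if (model_bind(&dev))
		return false;

	model_module_get(dev.owner);	/* what pcs_get_tail() relies on */
	model_unbind(&dev);		/* unbind is still allowed */

	pinned = !model_module_can_unload(&mod);
	gone = dev.pcs == NULL;

	model_module_put(dev.owner);
	return pinned && gone;
}
```

In this model, model_demo() returns true: the module cannot be unloaded, yet the pcs has already been freed, so a consumer holding only the module reference would be left with a dangling pointer.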
The second issue is that if a MAC driver creates the PCS and then
"gets" it through this interface, the MAC driver module ends up
locked in until all of the MAC driver's devices are unbound, which
isn't friendly at all.
So, anything that proposes to create a new subsystem where we have
multiple devices that make up an aggregate device needs to nicely cope
with any of those devices going away. For that to happen in this
instance, phylink would need to know that its in-use PCS for a
particular MAC is going away, then it could force the link down before
removing all references to the PCS device.
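One possible shape for that ordering, again as a userspace sketch rather than kernel code: the provider tells its consumer that it is going away before it frees anything. All names (model_link, model_link_pcs_gone, model_pcs_remove) are hypothetical and do not correspond to any existing phylink interface; locking is deliberately left out.

```c
/* Userspace model only: not kernel code. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct model_pcs;

/* Stand-in for the phylink side: the consumer of a PCS. */
struct model_link {
	struct model_pcs *pcs;		/* NULL when no PCS is in use */
	bool link_up;
};

struct model_pcs {
	struct model_link *user;	/* consumer to notify on removal */
};

static void model_link_attach(struct model_link *link, struct model_pcs *pcs)
{
	link->pcs = pcs;
	pcs->user = link;
	link->link_up = true;
}

/* Called by the provider before the pcs is freed. */
static void model_link_pcs_gone(struct model_link *link)
{
	assert(link->pcs != NULL);

	link->link_up = false;		/* force the link down first */
	link->pcs->user = NULL;
	link->pcs = NULL;		/* then drop every reference */
}

/* Models the PCS driver's remove path: notify, then free. */
static void model_pcs_remove(struct model_pcs *pcs)
{
	if (pcs->user)
		model_link_pcs_gone(pcs->user);
	free(pcs);
}

/* Returns true if the consumer ends up link-down with no pcs pointer. */
static bool model_removal_demo(void)
{
	struct model_link link = { 0 };
	struct model_pcs *pcs = calloc(1, sizeof(*pcs));

	if (!pcs)
		return false;

	model_link_attach(&link, pcs);
	model_pcs_remove(pcs);

	return !link.link_up && link.pcs == NULL;
}
```

The point of the sketch is the ordering in model_pcs_remove(): the consumer is told first, so it never dereferences the pcs after the free. A real implementation would also need to serialise this against concurrent use of the PCS.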
Another solution would be devlinks, but I am really not a fan of that
when there may be a single struct device backing multiple network
interfaces, where some of them may require PCS and others do not. One
wouldn't want the network interface with nfs-root to suddenly go away
because a PCS was unbound from its driver!
> +	get_device(pcs->dev);
This helps, but not enough. All it means is that the struct device
won't go away; the "pcs" can still go away if the device is unbound
from the driver.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!