Message-ID: <YRIqOZrrjS0HOppg@shredder>
Date: Tue, 10 Aug 2021 10:26:49 +0300
From: Ido Schimmel <idosch@...sch.org>
To: Andrew Lunn <andrew@...n.ch>
Cc: netdev@...r.kernel.org, davem@...emloft.net, kuba@...nel.org,
mkubecek@...e.cz, pali@...nel.org, vadimp@...dia.com,
mlxsw@...dia.com, Ido Schimmel <idosch@...dia.com>
Subject: Re: [RFC PATCH net-next 1/8] ethtool: Add ability to control
transceiver modules' low power mode

On Mon, Aug 09, 2021 at 04:28:32PM +0200, Andrew Lunn wrote:
> On Mon, Aug 09, 2021 at 01:21:45PM +0300, Ido Schimmel wrote:
> > From: Ido Schimmel <idosch@...dia.com>
> >
> > Add a pair of new ethtool messages, 'ETHTOOL_MSG_MODULE_SET' and
> > 'ETHTOOL_MSG_MODULE_GET', that can be used to control transceiver
> > modules parameters and retrieve their status.
>
> Hi Ido
>
> I've not read all the patchset yet, but I like the general direction.
>
> > The first parameter to control is the low power mode of the module. It
> > is only relevant for paged memory modules, as flat memory modules always
> > operate in low power mode.
> >
> > When a paged memory module is in low power mode, its power consumption
> > is reduced to the minimum, the management interface towards the host is
> > available and the data path is deactivated.
> >
> > User space can choose to put modules that are not currently in use in
> > low power mode and transition them to high power mode before putting the
> > associated ports administratively up.
> >
> > Transitioning into low power mode means loss of carrier, so error is
> > returned when the netdev is administratively up.
>
> However, I don't get this use case. With copper PHYs, putting the link
> administratively down results in a call into phylib and into the
> driver to down the link. This effectively puts the PHY into a low
> power mode. The management interface, as defined by C22 and C45, remains
> available, but the data path is disabled. For a 1G PHY, this can save
> a few watts.
>
> For SFPs managed by phylink and the kernel SFP driver, the exact same
> happens. The TX_ENABLE pin of the SFP is set to false. The I2C bus
> still works, but the data path is disabled.
>
> So I would expect a driver using firmware, not Linux code to manage
> SFPs, to just do this on link down. Why do we need user space
> involved?

The transition from low power to high power can take a few seconds with
QSFP/QSFP-DD and it's likely to only get longer with future / more
complex modules. Therefore, to reduce link-up time, the firmware
automatically transitions modules to high power mode.

There is obviously a trade-off here between power consumption and
link-up time. My understanding is that Mellanox is not the only vendor
favoring shorter link-up times as users have the ability to control the
low power mode of the modules in other implementations.

Regarding "why do we need user space involved?", by default, it does not
need to be involved (the system works without this API), but if it wants
to reduce the power consumption by setting unused modules to low power
mode, then it will need to use this API.
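
As an illustration only, the semantics described in the quoted commit
message can be modeled roughly as follows. This is a Python sketch, not
kernel code: `Port`, `ModuleBusyError`, `set_low_power` and `set_admin_up`
are hypothetical names made up for the example, and the default of high
power mode reflects the firmware behavior described above.

```python
from dataclasses import dataclass


class ModuleBusyError(Exception):
    """Low power mode was requested while the port is administratively up."""


@dataclass
class Port:
    # Hypothetical model of a port with a paged memory transceiver module.
    name: str
    admin_up: bool = False
    # Assumption: firmware keeps modules in high power mode by default
    # in order to shorten link-up time.
    low_power: bool = False

    def set_low_power(self, enable: bool) -> None:
        # Entering low power mode deactivates the data path, which means
        # loss of carrier, so the request is refused while the port is
        # administratively up.
        if enable and self.admin_up:
            raise ModuleBusyError(f"{self.name} is administratively up")
        self.low_power = enable

    def set_admin_up(self, up: bool) -> None:
        # The model does not enforce any ordering here; user space is
        # expected to leave low power mode before bringing the port up.
        self.admin_up = up


if __name__ == "__main__":
    port = Port("swp1")
    port.set_low_power(True)    # unused port: save power
    port.set_low_power(False)   # about to be used: back to high power
    port.set_admin_up(True)
    try:
        port.set_low_power(True)
    except ModuleBusyError as err:
        print(f"refused: {err}")
```

The point of the sketch is the ordering: user space that never calls the
API sees no change in behavior, while user space that opts in has to
return the module to high power mode before bringing the port up.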