Message-ID: <CO1PR11MB508982C614F01E97DA595BA4D6F79@CO1PR11MB5089.namprd11.prod.outlook.com>
Date: Tue, 10 Aug 2021 21:38:02 +0000
From: "Keller, Jacob E" <jacob.e.keller@...el.com>
To: Jakub Kicinski <kuba@...nel.org>, Andrew Lunn <andrew@...n.ch>
CC: Ido Schimmel <idosch@...sch.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"mkubecek@...e.cz" <mkubecek@...e.cz>,
"pali@...nel.org" <pali@...nel.org>,
"vadimp@...dia.com" <vadimp@...dia.com>,
"mlxsw@...dia.com" <mlxsw@...dia.com>,
Ido Schimmel <idosch@...dia.com>
Subject: RE: [RFC PATCH net-next 1/8] ethtool: Add ability to control transceiver modules' low power mode
> -----Original Message-----
> From: Jakub Kicinski <kuba@...nel.org>
> Sent: Tuesday, August 10, 2021 7:00 AM
> To: Andrew Lunn <andrew@...n.ch>; Keller, Jacob E <jacob.e.keller@...el.com>
> Cc: Ido Schimmel <idosch@...sch.org>; netdev@...r.kernel.org;
> davem@...emloft.net; mkubecek@...e.cz; pali@...nel.org;
> vadimp@...dia.com; mlxsw@...dia.com; Ido Schimmel <idosch@...dia.com>
> Subject: Re: [RFC PATCH net-next 1/8] ethtool: Add ability to control transceiver
> modules' low power mode
>
> On Tue, 10 Aug 2021 15:52:20 +0200 Andrew Lunn wrote:
> > > The transition from low power to high power can take a few seconds with
> > > QSFP/QSFP-DD and it's likely to only get longer with future / more
> > > complex modules. Therefore, to reduce link-up time, the firmware
> > > automatically transitions modules to high power mode.
> > >
> > > There is obviously a trade-off here between power consumption and
> > > link-up time. My understanding is that Mellanox is not the only vendor
> > > favoring shorter link-up times as users have the ability to control the
> > > low power mode of the modules in other implementations.
> > >
> > > Regarding "why do we need user space involved?", by default, it does not
> > > need to be involved (the system works without this API), but if it wants
> > > to reduce the power consumption by setting unused modules to low power
> > > mode, then it will need to use this API.
> >
> > O.K. Thanks for the better explanation. Some of this should go into
> > the commit message.
> >
> > I suggest it gets a different name and semantics, to avoid
> > confusion. I think we should consider this the default power mode for
> > when the link is administratively down, rather than direct control
> > over the module's power mode. The driver should transition the module
> > to this setting on link down, be it high power or low power. That
> > saves a lot of complexity, since I assume you currently need a udev
> > script or something which sets it to low power mode on link down,
> > whereas you can avoid this by configuring the default and letting the
> > driver do it.
>
> Good point. And actually NICs have similar knobs, exposed via ethtool
> priv flags today. Intel NICs for example. Maybe we should create a
> "really power the port down policy" API?
>
> Jake do you know what the use cases for Intel are? Are they SFP, MAC,
> or NC-SI related?
Offhand I don't know. I think we have some requirements documents I can look up. I'll try to get back to you soon if I can find any further information. (Yes, I wish the commit messages gave stronger motivation too...)
Thanks,
Jake
>
> > I also wonder if a hierarchy is needed? You can set the default for
> > the switch, and then override it per module? I _guess_ most users will
> > decide at a switch level they want to save power and pay the penalty
> > of longer link-up times. But then we have the question, is it an
> > ethtool option, or a devlink parameter?
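
For illustration only, the two ideas in the quoted text (a power mode that applies while the port is administratively down, and a switch-wide default with a per-module override) could be sketched roughly as below. Every name here is hypothetical; none of this is existing ethtool, devlink, or driver API, and it says nothing about which of the two interfaces should carry the setting.

```c
#include <stdbool.h>

/* Hypothetical policy values; not part of any existing kernel UAPI. */
enum module_power_policy {
	MODULE_POWER_POLICY_UNSET = 0,	/* no per-module override, inherit */
	MODULE_POWER_POLICY_LOW,	/* low power while admin down */
	MODULE_POWER_POLICY_HIGH,	/* stay in high power for fast link-up */
};

/* Hypothetical per-module setting; may be left UNSET. */
struct module_cfg {
	enum module_power_policy override;
};

/* Hypothetical switch-wide default; assumed never UNSET. */
struct switch_cfg {
	enum module_power_policy dflt;
};

/* The per-module override wins; otherwise fall back to the switch default. */
static enum module_power_policy
effective_policy(const struct switch_cfg *sw, const struct module_cfg *mod)
{
	if (mod->override != MODULE_POWER_POLICY_UNSET)
		return mod->override;
	return sw->dflt;
}

/*
 * Decide whether the driver should move the module to low power.
 * A port that is administratively up always needs high power, so the
 * policy is only consulted when the port is admin down.
 */
static bool
module_wants_low_power(const struct switch_cfg *sw,
		       const struct module_cfg *mod, bool admin_up)
{
	if (admin_up)
		return false;
	return effective_policy(sw, mod) == MODULE_POWER_POLICY_LOW;
}
```

With something along these lines the driver would apply the result on every admin state change, so no udev script is needed to push modules into low power on link down.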