Message-ID: <0b81ee70-efe3-4a06-b115-1a56e007b9a7@lunn.ch>
Date: Thu, 6 Feb 2025 14:59:08 +0100
From: Andrew Lunn <andrew@...n.ch>
To: "Jagielski, Jedrzej" <jedrzej.jagielski@...el.com>
Cc: "intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>,
"Nguyen, Anthony L" <anthony.l.nguyen@...el.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"Kitszel, Przemyslaw" <przemyslaw.kitszel@...el.com>
Subject: Re: [PATCH iwl-next v2] ixgbe: add support for thermal sensor event reception

On Thu, Feb 06, 2025 at 01:05:27PM +0000, Jagielski, Jedrzej wrote:
> From: Andrew Lunn <andrew@...n.ch>
> Sent: Tuesday, February 4, 2025 2:09 PM
> >On Tue, Feb 04, 2025 at 08:17:00AM +0100, Jedrzej Jagielski wrote:
> >> E610 NICs unlike the previous devices utilising ixgbe driver
> >> are notified in the case of overheating by the FW ACI event.
> >>
> >> In event of overheat when threshold is exceeded, FW suspends all
> >> traffic and sends overtemp event to the driver. Then driver
> >> logs appropriate message and closes the adapter instance.
> >> The card remains in that state until the platform is rebooted.
> >
> >There is also an HWMON temp[1-*]_emergency_alarm you can set. I
> >_think_ that should also cause a udev event, so user space knows the
> >print^h^h^h^h^hnetwork is on fire.
> >
> > Andrew
>
> I am not sure whether HWMON is applicable in that case.
> Driver receives an async notification from the FW that an overheating
> occurred, so has to handle it. In that case - by printing msg
> and making the interface disabled for the user.
> FW is responsible for monitoring temperature itself.
> There's even no possibility to read temperature by the driver

https://elixir.bootlin.com/linux/v6.13.1/source/drivers/net/ethernet/intel/ixgbe/ixgbe_sysfs.c#L27

ixgbe_hwmon_show_temp() is some other temperature sensor? Which you do
have HWMON support for?

Or is the E610 not really an ixgbe, it has a different architecture,
more stuff pushed into firmware, less visibility from the kernel, no
temperature monitoring, just a NIC on fire indication?

    Andrew