lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <DS0PR11MB77854D8F8DEEE0A44BB0E17EF0F22@DS0PR11MB7785.namprd11.prod.outlook.com>
Date: Mon, 10 Feb 2025 11:59:19 +0000
From: "Jagielski, Jedrzej" <jedrzej.jagielski@...el.com>
To: Paul Menzel <pmenzel@...gen.mpg.de>
CC: "intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>,
	"Nguyen, Anthony L" <anthony.l.nguyen@...el.com>, "netdev@...r.kernel.org"
	<netdev@...r.kernel.org>, "horms@...nel.org" <horms@...nel.org>, "Kitszel,
 Przemyslaw" <przemyslaw.kitszel@...el.com>, "Polchlopek, Mateusz"
	<mateusz.polchlopek@...el.com>
Subject: RE: [Intel-wired-lan] [PATCH iwl-next v3] ixgbe: add support for
 thermal sensor event reception

From: Paul Menzel <pmenzel@...gen.mpg.de> 
Sent: Monday, February 10, 2025 12:40 PM
>Dear Jedrzej,
>
>
>Thank you for your patch.
>
>Am 10.02.25 um 11:40 schrieb Jedrzej Jagielski:
>> E610 NICs unlike the previous devices utilising ixgbe driver
>> are notified in the case of overheatning by the FW ACI event.
>
>overheating (without n)
>
>> In event of overheat when threshold is exceeded, FW suspends all
>> traffic and sends overtemp event to the driver.
>
>I guess this was already the behavior before your patch, and now it’s 
>only logged, and the device stopped properly?

Hi Paul,

Without this patch driver cannot receive the overtemp info. FW handles
the event - device stops but user has no clue what has happened.
FW does the major work but this patch adds this minimal piece of
overtemp handling done by SW - informing user at the very end.

>
>> Then driver
>> logs appropriate message and closes the adapter instance.
>> The card remains in that state until the platform is rebooted.
>
>As a user I’d be interested what the threshold is, and what the measured 
>temperature is. Currently, the log seems to be just generic?

These details are FW internals.
Driver just gets info that such event has happened.
There's no additional information.

In that case driver's job is just to inform user that such scenario
has happened and tell what should be the next steps.

>
>     drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:static const char 
>ixgbe_overheat_msg[] = "Network adapter has been stopped because it has 
>over heated. Restart the computer. If the problem persists, power off 
>the system and replace the adapter";
>
>> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@...el.com>
>> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@...el.com>
>> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@...el.com>
>> ---
>> v2,3 : commit msg tweaks
>> ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c      | 5 +++++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_type_e610.h | 3 +++
>>   2 files changed, 8 insertions(+)
>
>
>Kind regards,
>
>Paul
>
>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index 7236f20c9a30..5c804948dd1f 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -3165,6 +3165,7 @@ static void ixgbe_aci_event_cleanup(struct ixgbe_aci_event *event)
>>   static void ixgbe_handle_fw_event(struct ixgbe_adapter *adapter)
>>   {
>>   	struct ixgbe_aci_event event __cleanup(ixgbe_aci_event_cleanup);
>> +	struct net_device *netdev = adapter->netdev;
>>   	struct ixgbe_hw *hw = &adapter->hw;
>>   	bool pending = false;
>>   	int err;
>> @@ -3185,6 +3186,10 @@ static void ixgbe_handle_fw_event(struct ixgbe_adapter *adapter)
>>   		case ixgbe_aci_opc_get_link_status:
>>   			ixgbe_handle_link_status_event(adapter, &event);
>>   			break;
>> +		case ixgbe_aci_opc_temp_tca_event:
>> +			e_crit(drv, "%s\n", ixgbe_overheat_msg);
>> +			ixgbe_close(netdev);
>> +			break;
>>   		default:
>>   			e_warn(hw, "unknown FW async event captured\n");
>>   			break;
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type_e610.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type_e610.h
>> index 8d06ade3c7cd..617e07878e4f 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type_e610.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type_e610.h
>> @@ -171,6 +171,9 @@ enum ixgbe_aci_opc {
>>   	ixgbe_aci_opc_done_alt_write			= 0x0904,
>>   	ixgbe_aci_opc_clear_port_alt_write		= 0x0906,
>>   
>> +	/* TCA Events */
>> +	ixgbe_aci_opc_temp_tca_event                    = 0x0C94,
>> +
>>   	/* debug commands */
>>   	ixgbe_aci_opc_debug_dump_internals		= 0xFF08,
>>   

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ