Message-ID: <08848763-021c-4f6e-81c2-fb49603ecaa7@intel.com>
Date: Tue, 2 Jul 2024 09:27:44 +0200
From: Przemek Kitszel <przemyslaw.kitszel@...el.com>
To: Petr Machata <petrm@...dia.com>
CC: Ido Schimmel <idosch@...dia.com>, <mlxsw@...dia.com>,
	<linux-pm@...r.kernel.org>, Vadim Pasternak <vadimp@...dia.com>, "David S.
 Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, "Jakub
 Kicinski" <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	<netdev@...r.kernel.org>
Subject: Re: [PATCH net-next 2/3] mlxsw: core_thermal: Report valid current
 state during cooling device registration

On 7/1/24 18:41, Petr Machata wrote:
> From: Ido Schimmel <idosch@...dia.com>
> 
> Commit 31a0fa0019b0 ("thermal/debugfs: Pass cooling device state to
> thermal_debug_cdev_add()") changed the thermal core to read the current
> state of the cooling device as part of the cooling device's
> registration. This is incompatible with the current implementation of
> the cooling device operations in mlxsw, leading to initialization
> failure with errors such as:
> 
> mlxsw_spectrum 0000:01:00.0: Failed to register cooling device
> mlxsw_spectrum 0000:01:00.0: cannot register bus device
> 
> The reason for the failure is that when the get current state operation
> is invoked the driver tries to derive the index of the cooling device by
> walking a per thermal zone array and looking for the matching cooling
> device pointer. However, the pointer is returned from the registration
> function and therefore only set in the array after the registration.
> 
> The issue was later fixed by commit 1af89dedc8a5 ("thermal: core: Do not
> fail cdev registration because of invalid initial state") by not failing
> the registration of the cooling device if it cannot report a valid
> current state during registration, although drivers are responsible for
> ensuring that this will not happen.
> 
> Therefore, make sure the driver is able to report a valid current state
> for the cooling device during registration by passing to the
> registration function a per cooling device private data that already has
> the cooling device index populated.
> 
> Cc: linux-pm@...r.kernel.org
> Reviewed-by: Vadim Pasternak <vadimp@...dia.com>
> Signed-off-by: Ido Schimmel <idosch@...dia.com>
> Signed-off-by: Petr Machata <petrm@...dia.com>
> ---
>   .../ethernet/mellanox/mlxsw/core_thermal.c    | 50 ++++++++++---------
>   1 file changed, 26 insertions(+), 24 deletions(-)
> 
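
To illustrate the ordering problem described in the commit message, here is a minimal userspace sketch. It is not the mlxsw code: every name in it (mock_register(), struct cdev_entry, PWMS_MAX and so on) is a hypothetical stand-in, and the mock registration function simply assumes that the core queries the current state before it returns the cooling device pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define PWMS_MAX 5 /* hypothetical stand-in for MLXSW_MFCR_PWMS_MAX */

struct cooling_device {
	void *devdata; /* private data handed over at registration */
};

struct cdev_entry {
	struct cooling_device *cdev; /* only known after registration returns */
	unsigned int idx;            /* known before registration */
};

struct thermal {
	struct cdev_entry cdevs[PWMS_MAX];
};

typedef int (*get_state_fn)(struct cooling_device *cdev, unsigned long *state);

/* Old scheme: derive the index by searching for the cdev pointer. */
int get_state_by_lookup(struct cooling_device *cdev, unsigned long *state)
{
	struct thermal *t = cdev->devdata;
	unsigned int i;

	for (i = 0; i < PWMS_MAX; i++) {
		if (t->cdevs[i].cdev == cdev) {
			*state = i; /* stand-in for reading fan i's state */
			return 0;
		}
	}
	return -ENODEV; /* pointer not stored in the array yet */
}

/* New scheme: the private data already carries the index. */
int get_state_by_priv(struct cooling_device *cdev, unsigned long *state)
{
	struct cdev_entry *e = cdev->devdata;

	*state = e->idx; /* stand-in for reading fan idx's state */
	return 0;
}

/* Mock registration: reads the state before returning the pointer. */
struct cooling_device *mock_register(struct cooling_device *cdev,
				     void *devdata, get_state_fn get)
{
	unsigned long state;

	cdev->devdata = devdata;
	if (get(cdev, &state))
		return NULL;
	return cdev;
}

/* Returns 0 after checking that only the second scheme registers. */
int demo_registration(void)
{
	struct thermal t = { 0 };
	struct cooling_device dev = { 0 };

	/* Lookup fails: t.cdevs[0].cdev is still NULL during the call. */
	t.cdevs[0].cdev = mock_register(&dev, &t, get_state_by_lookup);
	assert(t.cdevs[0].cdev == NULL);

	/* Index is populated up front, so the state read succeeds. */
	t.cdevs[2].idx = 2;
	t.cdevs[2].cdev = mock_register(&dev, &t.cdevs[2], get_state_by_priv);
	assert(t.cdevs[2].cdev == &dev);
	return 0;
}
```

Under these assumptions the lookup variant cannot succeed during registration, because the array slot is written only after the call returns, while the private-data variant does not depend on the array at all.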

Just two nitpicks:

> @@ -824,8 +828,8 @@ int mlxsw_thermal_init(struct mlxsw_core *core,
>   err_thermal_zone_device_register:
>   err_thermal_cooling_device_register:
>   	for (i = 0; i < MLXSW_MFCR_PWMS_MAX; i++)
> -		if (thermal->cdevs[i])
> -			thermal_cooling_device_unregister(thermal->cdevs[i]);
> +		if (thermal->cdevs[i].cdev)

This NULL check is redundant: thermal_cooling_device_unregister() already
returns early when passed a NULL cdev.

> +			thermal_cooling_device_unregister(thermal->cdevs[i].cdev);
>   err_reg_write:
>   err_reg_query:
>   	kfree(thermal);
> @@ -848,10 +852,8 @@ void mlxsw_thermal_fini(struct mlxsw_thermal *thermal)
>   	}
>   
>   	for (i = 0; i < MLXSW_MFCR_PWMS_MAX; i++) {
> -		if (thermal->cdevs[i]) {
> -			thermal_cooling_device_unregister(thermal->cdevs[i]);
> -			thermal->cdevs[i] = NULL;
> -		}
> +		if (thermal->cdevs[i].cdev)

Ditto: the same check can be dropped here as well.

> +			thermal_cooling_device_unregister(thermal->cdevs[i].cdev);
>   	}
>   
>   	kfree(thermal);
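
For illustration only, a minimal userspace sketch of what the cleanup loops could look like without the caller-side check. The types and functions below are mocks with hypothetical names, not the kernel API; the mock unregister helper mirrors the early return on NULL that thermal_cooling_device_unregister() performs.

```c
#include <assert.h>
#include <stddef.h>

#define PWMS_MAX 5 /* hypothetical stand-in for MLXSW_MFCR_PWMS_MAX */

struct cooling_device {
	int registered;
};

struct cdev_slot {
	struct cooling_device *cdev; /* NULL if never registered */
};

static int unregister_calls; /* counts calls that did real work */

/* Mock helper: a NULL argument is a no-op, as in the kernel helper. */
void cooling_device_unregister(struct cooling_device *cdev)
{
	if (!cdev)
		return;
	cdev->registered = 0;
	unregister_calls++;
}

/* Cleanup loop with no NULL check at the call site. */
void cleanup(struct cdev_slot *slots, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		cooling_device_unregister(slots[i].cdev);
}

/* Returns the number of devices that were actually unregistered. */
int demo_cleanup(void)
{
	struct cooling_device a = { 1 }, b = { 1 };
	struct cdev_slot slots[PWMS_MAX] = { { &a }, { NULL }, { &b } };

	unregister_calls = 0;
	cleanup(slots, PWMS_MAX);
	assert(!a.registered && !b.registered);
	return unregister_calls;
}
```

Empty slots are skipped by the helper itself, so both the error path and the fini path reduce to a plain loop over the array.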

