Message-ID: <b246beff-17ce-f21a-b874-55d6b04dbf07@mellanox.com>
Date:   Mon, 6 May 2019 10:44:08 +0000
From:   Moshe Shemesh <moshe@...lanox.com>
To:     Jiri Pirko <jiri@...nulli.us>, Saeed Mahameed <saeedm@...lanox.com>
CC:     "David S. Miller" <davem@...emloft.net>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Jiri Pirko <jiri@...lanox.com>,
        Feras Daoud <ferasda@...lanox.com>,
        Alex Vesker <valex@...lanox.com>,
        Daniel Jurgens <danielj@...lanox.com>
Subject: Re: [net-next 07/15] net/mlx5: Issue SW reset on FW assert



On 5/5/2019 6:38 PM, Jiri Pirko wrote:
> Sun, May 05, 2019 at 02:33:18AM CEST, saeedm@...lanox.com wrote:
>> From: Feras Daoud <ferasda@...lanox.com>
>>
>> If a FW assert is considered fatal, indicated by a new bit in the health
>> buffer, reset the FW. After the reset go through the normal recovery
>> flow. Only one PF needs to issue the reset, so an attempt is made to
>> prevent the 2nd function from also issuing the reset.
>> It's not an error if that happens, it just slows recovery.
>>
>> Signed-off-by: Feras Daoud <ferasda@...lanox.com>
>> Signed-off-by: Alex Vesker <valex@...lanox.com>
>> Signed-off-by: Moshe Shemesh <moshe@...lanox.com>
>> Signed-off-by: Daniel Jurgens <danielj@...lanox.com>
>> Signed-off-by: Saeed Mahameed <saeedm@...lanox.com>
>> ---
>> .../ethernet/mellanox/mlx5/core/diag/crdump.c |  13 +-
>> .../net/ethernet/mellanox/mlx5/core/health.c  | 157 +++++++++++++++++-
>> .../net/ethernet/mellanox/mlx5/core/main.c    |   1 +
>> .../ethernet/mellanox/mlx5/core/mlx5_core.h   |   2 +
>> include/linux/mlx5/device.h                   |  10 +-
>> include/linux/mlx5/driver.h                   |   1 +
>> 6 files changed, 176 insertions(+), 8 deletions(-)
>>
> 
> [...]
> 
> 
>> +void mlx5_error_sw_reset(struct mlx5_core_dev *dev)
>> +{
>> +	unsigned long end, delay_ms = MLX5_FW_RESET_WAIT_MS;
>> +	int lock = -EBUSY;
>> +
>> +	mutex_lock(&dev->intf_state_mutex);
>> +	if (dev->state != MLX5_DEVICE_STATE_INTERNAL_ERROR)
>> +		goto unlock;
>> +
>> +	mlx5_core_err(dev, "start\n");
> 
> Leftover?
> 
Not leftover, it was just moved from one point to another.
