Message-ID: <ca105394-8168-6896-9c0b-b335de39154e@mellanox.com>
Date: Tue, 17 Apr 2018 10:05:09 +0300
From: Tariq Toukan <tariqt@...lanox.com>
To: David Miller <davem@...emloft.net>, yanjun.zhu@...cle.com
Cc: tariqt@...lanox.com, netdev@...r.kernel.org,
linux-rdma@...r.kernel.org, haakon.bugge@...cle.com
Subject: Re: [PATCH 1/1] net/mlx4_core: avoid resetting HCA when accessing an
offline device
On 16/04/2018 7:51 PM, David Miller wrote:
> From: Zhu Yanjun <yanjun.zhu@...cle.com>
> Date: Sun, 15 Apr 2018 21:02:07 -0400
>
>> When a faulty cable is used or an HCA firmware error occurs, the HCA
>> device goes offline. When the driver accesses this offline device, the
>> following call trace appears.
> ...
>> In the above call trace, the function mlx4_cmd_poll calls the function
>> mlx4_cmd_post to access the HCA while the HCA is offline. mlx4_cmd_post
>> then returns the error -EIO. On -EIO, the function mlx4_cmd_poll calls
>> mlx4_cmd_reset_flow to reset the HCA, which produces the above call trace.
>>
>> This is not reasonable. Since the HCA device is already offline when it
>> is accessed, it should not be reset again.
>>
>> With this patch, when the HCA is offline, the function mlx4_cmd_post
>> returns the error -EINVAL. On -EINVAL, the function mlx4_cmd_poll returns
>> directly instead of resetting the HCA.
>>
>> CC: Srinivas Eeda <srinivas.eeda@...cle.com>
>> CC: Junxiao Bi <junxiao.bi@...cle.com>
>> Suggested-by: Håkon Bugge <haakon.bugge@...cle.com>
>> Signed-off-by: Zhu Yanjun <yanjun.zhu@...cle.com>
>
> Tariq, I'm assuming you'll take this in and send it to me later.
>
> Thanks.
Yes, I will review and send if all is OK.
Thanks,
Tariq
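
For readers following the thread, the control flow described in the quoted
patch description can be sketched roughly as below. This is a standalone
illustration, not the mlx4 driver code: struct sketch_dev and the sketch_*
functions are hypothetical stand-ins for mlx4_cmd_post, mlx4_cmd_poll and
mlx4_cmd_reset_flow, and it assumes that -EIO from an online device still
leads to a reset, which the description does not state explicitly.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for device state; not the real struct mlx4_dev. */
struct sketch_dev {
	bool offline;     /* device is already known to be offline */
	bool post_fails;  /* simulate an I/O failure on an online device */
	int reset_count;  /* number of times the reset flow has run */
};

/* Stand-in for the reset flow; here it only counts invocations. */
static void sketch_reset_flow(struct sketch_dev *dev)
{
	dev->reset_count++;
}

/*
 * Stand-in for posting a command. Following the patch description, an
 * offline device yields -EINVAL instead of -EIO.
 */
static int sketch_cmd_post(const struct sketch_dev *dev)
{
	if (dev->offline)
		return -EINVAL;
	if (dev->post_fails)
		return -EIO;
	return 0;
}

/*
 * Stand-in for the polling path: only -EIO triggers the reset flow
 * (an assumption of this sketch); -EINVAL is returned to the caller
 * without touching the device again.
 */
static int sketch_cmd_poll(struct sketch_dev *dev)
{
	int err;

	assert(dev != NULL);

	err = sketch_cmd_post(dev);
	if (err == -EIO)
		sketch_reset_flow(dev);

	return err;
}
```

Calling sketch_cmd_poll() on a device with offline set returns -EINVAL and
leaves reset_count at 0, whereas a device with only post_fails set returns
-EIO and runs the reset flow once.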