Message-Id: <20180416.125110.1875435797136179428.davem@davemloft.net>
Date: Mon, 16 Apr 2018 12:51:10 -0400 (EDT)
From: David Miller <davem@...emloft.net>
To: yanjun.zhu@...cle.com
Cc: tariqt@...lanox.com, netdev@...r.kernel.org,
linux-rdma@...r.kernel.org, haakon.bugge@...cle.com
Subject: Re: [PATCH 1/1] net/mlx4_core: avoid resetting HCA when accessing
an offline device
From: Zhu Yanjun <yanjun.zhu@...cle.com>
Date: Sun, 15 Apr 2018 21:02:07 -0400
> When a faulty cable is used or an HCA firmware error occurs, the HCA
> device goes offline. When the driver accesses this offline device, the
> following call trace appears.
...
> In the above call trace, the function mlx4_cmd_poll calls the function
> mlx4_cmd_post to access the HCA while HCA is offline. Then mlx4_cmd_post
> returns an error -EIO. Per -EIO, the function mlx4_cmd_poll calls
> mlx4_cmd_reset_flow to reset HCA. And the above call trace pops out.
>
> This is not reasonable. Since HCA device is offline when it is being
> accessed, it should not be reset again.
>
> In this patch, since HCA is offline, the function mlx4_cmd_post returns
> an error -EINVAL. Per -EINVAL, the function mlx4_cmd_poll directly returns
> instead of resetting HCA.
>
> CC: Srinivas Eeda <srinivas.eeda@...cle.com>
> CC: Junxiao Bi <junxiao.bi@...cle.com>
> Suggested-by: Håkon Bugge <haakon.bugge@...cle.com>
> Signed-off-by: Zhu Yanjun <yanjun.zhu@...cle.com>
Tariq, I'm assuming you'll take this in and send it to me later.
Thanks.
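
For readers following the thread, the control flow described in the quoted
patch description can be sketched roughly as below. This is a simplified
user-space model, not the mlx4 driver code: the struct and every sketch_*
name are hypothetical stand-ins, and only the error-code handling
(-EIO triggers a reset, -EINVAL returns directly) is taken from the
description above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for device state; not the real struct mlx4_dev. */
struct sketch_dev {
    bool offline;     /* device unreachable (faulty cable, firmware error) */
    bool post_fails;  /* simulate a posting failure on a reachable device */
    int reset_count;  /* number of times the reset flow was invoked */
};

/* Models the reset flow: counts the invocation and reports -EIO. */
static int sketch_reset_flow(struct sketch_dev *dev)
{
    assert(dev != NULL);
    dev->reset_count++;
    return -EIO;
}

/* Models posting a command: an offline device yields -EINVAL, not -EIO. */
static int sketch_cmd_post(struct sketch_dev *dev)
{
    assert(dev != NULL);
    if (dev->offline)
        return -EINVAL;
    if (dev->post_fails)
        return -EIO;
    return 0;
}

/* Models polling: -EINVAL returns directly, only -EIO triggers a reset. */
static int sketch_cmd_poll(struct sketch_dev *dev)
{
    int err = sketch_cmd_post(dev);

    if (err == -EINVAL)
        return err;                     /* offline: skip the reset */
    if (err == -EIO)
        return sketch_reset_flow(dev);  /* other I/O failure: reset */
    return err;
}
```

In this model, polling an offline device returns -EINVAL and leaves
reset_count at zero, while a posting failure on a reachable device still
goes through the reset path.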