Message-ID: <5472099B.5070105@mellanox.com>
Date: Sun, 23 Nov 2014 18:21:47 +0200
From: Amir Vadai <amirv@...lanox.com>
To: Gavin Shan <gwshan@...ux.vnet.ibm.com>, <netdev@...r.kernel.org>,
"Or Gerlitz" <ogerlitz@...lanox.com>
CC: <davem@...emloft.net>, <yishaih@...lanox.com>
Subject: Re: [PATCH] net/mlx4: Fix EEH recovery failure
On 11/22/2014 12:56 PM, Gavin Shan wrote:
> The patch fixes a couple of EEH recovery failures on the PPC PowerNV
> platform:
>
> * Release reserved memory regions in mlx4_pci_err_detected().
> Otherwise, __mlx4_init_one() fails because it tries to reserve
> the same memory regions a second time.
> * Disable the PCI device in mlx4_pci_err_detected(). Otherwise,
> pci_enable_device() in __mlx4_init_one() doesn't enable
> the PCI device because it's already in the enabled state, as
> indicated by struct pci_dev::enable_cnt.
> * Don't clear the struct mlx4_priv instance in mlx4_pci_err_detected().
> Otherwise, __mlx4_init_one() crashes the kernel by
> dereferencing a NULL pointer.
>
> With the patch applied, EEH recovery for mlx4 adapter succeeds on PPC
> PowerNV platform.
>
> # lspci
> 0003:0f:00.0 Network controller: Mellanox Technologies \
> MT27500 Family [ConnectX-3]
>
> Signed-off-by: Gavin Shan <gwshan@...ux.vnet.ibm.com>
Hi Gavin,
Yishai (added to the CC) is a few days away from sending a patchset that
fixes the reset flow, and it includes a fix for EEH recovery.
I would appreciate it if you could wait for Yishai's complete reset flow fix.
If you'd like, I can send you the patchset to try. It is currently under
review inside Mellanox before being sent to the mailing list.
Thanks,
Amir
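For readers following the quoted patch description, the first two points
(region reservation and enable_cnt) can be illustrated with a small
userspace model. This is only a sketch, not mlx4 or PCI core code: every
name below (fake_pci_dev, fake_enable_device(), fake_init_one(),
recover(), and so on) is hypothetical, and the model assumes only that
enabling a device is reference counted and that reserving an
already-reserved region fails.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parts of struct pci_dev that matter here. */
struct fake_pci_dev {
    int  enable_cnt;    /* models pci_dev::enable_cnt */
    bool hw_enabled;    /* was the device really (re)enabled? */
    bool regions_held;  /* are the memory regions reserved? */
};

/* Reference-counted enable: only the 0 -> 1 transition touches the device. */
static int fake_enable_device(struct fake_pci_dev *d)
{
    if (++d->enable_cnt > 1)
        return 0;               /* already counted as enabled: no-op */
    d->hw_enabled = true;
    return 0;
}

static void fake_disable_device(struct fake_pci_dev *d)
{
    assert(d->enable_cnt > 0);
    if (--d->enable_cnt != 0)
        return;
    d->hw_enabled = false;
}

/* Reserving regions that are still held fails (modelled as -16, EBUSY). */
static int fake_request_regions(struct fake_pci_dev *d)
{
    if (d->regions_held)
        return -16;
    d->regions_held = true;
    return 0;
}

static void fake_release_regions(struct fake_pci_dev *d)
{
    d->regions_held = false;
}

/* A slot reset loses device state but not the software bookkeeping. */
static void fake_slot_reset(struct fake_pci_dev *d)
{
    d->hw_enabled = false;
}

/* Models the probe path: enable, reserve, then require a live device. */
static int fake_init_one(struct fake_pci_dev *d)
{
    int err = fake_enable_device(d);
    if (err)
        return err;
    err = fake_request_regions(d);
    if (err)
        return err;
    return d->hw_enabled ? 0 : -5;  /* -5 models EIO */
}

/* Probe once, hit an error, reset the slot, then probe again. */
static int recover(bool cleanup_in_err_detected)
{
    struct fake_pci_dev d = { 0 };

    if (fake_init_one(&d))
        return -1;
    if (cleanup_in_err_detected) {
        fake_release_regions(&d);
        fake_disable_device(&d);
    }
    fake_slot_reset(&d);
    return fake_init_one(&d);
}
```

In this model, recover(false) returns a non-zero error because the second
probe finds the regions still reserved, while recover(true) returns 0
because the error handler released the regions and dropped the enable
count before the re-probe.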