Message-ID: <5472099B.5070105@mellanox.com>
Date:	Sun, 23 Nov 2014 18:21:47 +0200
From:	Amir Vadai <amirv@...lanox.com>
To:	Gavin Shan <gwshan@...ux.vnet.ibm.com>, <netdev@...r.kernel.org>,
	"Or Gerlitz" <ogerlitz@...lanox.com>
CC:	<davem@...emloft.net>, <yishaih@...lanox.com>
Subject: Re: [PATCH] net/mlx4: Fix EEH recovery failure
On 11/22/2014 12:56 PM, Gavin Shan wrote:
> The patch fixes a couple of EEH recovery failures on the PPC PowerNV
> platform:
> 
>    * Release reserved memory regions in mlx4_pci_err_detected().
>      Otherwise, __mlx4_init_one() fails because of reserving
>      same memory regions recursively.
>    * Disable PCI device in mlx4_pci_err_detected(). Otherwise,
>      pci_enable_device() in __mlx4_init_one() doesn't enable
>      the PCI device because it's already in enabled state indicated
>      by struct pci_dev::enable_cnt.
>    * Don't clear struct mlx4_priv instance in mlx4_pci_err_detected().
>      Otherwise, __mlx4_init_one() crashes the kernel by
>      dereferencing a NULL pointer.
> 
> With the patch applied, EEH recovery for mlx4 adapter succeeds on PPC
> PowerNV platform.
> 
>    # lspci
>    0003:0f:00.0 Network controller: Mellanox Technologies \
>    MT27500 Family [ConnectX-3]
> 
> Signed-off-by: Gavin Shan <gwshan@...ux.vnet.ibm.com>

Hi Gavin,

Yishai (added to the CC) is a few days away from sending a patchset that
fixes the reset flow; it includes a fix for EEH recovery.
I would be happy if you could wait for Yishai's complete reset flow fix.
If you'd like, I can send you the patchset to try. It is currently under
review inside Mellanox before being sent to the mailing list.

Thanks,
Amir
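
For readers following the quoted patch description: the second bullet
depends on pci_enable_device() being reference counted through
struct pci_dev::enable_cnt. The sketch below is a userspace toy model of
that behaviour, not kernel code. All names (toy_pci_dev, toy_enable,
toy_disable, toy_hw_reset, toy_recover) are hypothetical, and the idea that
the error event leaves the hardware disabled while the count stays at 1 is
an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the enable_cnt bookkeeping in struct pci_dev. */
struct toy_pci_dev {
    int enable_cnt;   /* outstanding enable calls */
    bool hw_enabled;  /* whether the modelled hardware is enabled */
};

/* Only the first enable (count 0 -> 1) touches the hardware. */
int toy_enable(struct toy_pci_dev *dev)
{
    if (++dev->enable_cnt > 1)
        return 0;     /* counted as already enabled: nothing is done */
    dev->hw_enabled = true;
    return 0;
}

/* Only the last disable (count 1 -> 0) touches the hardware. */
void toy_disable(struct toy_pci_dev *dev)
{
    if (dev->enable_cnt <= 0)
        return;       /* unbalanced disable: ignore in this model */
    if (--dev->enable_cnt == 0)
        dev->hw_enabled = false;
}

/* Assumption: the error/reset disables the hardware without
 * changing the driver-visible count. */
void toy_hw_reset(struct toy_pci_dev *dev)
{
    dev->hw_enabled = false;
}

/* Runs probe, error, optional disable, re-init; reports the final state. */
bool toy_recover(bool disable_on_error)
{
    struct toy_pci_dev dev = { 0, false };

    toy_enable(&dev);          /* initial probe */
    assert(dev.hw_enabled);    /* sanity check on the model */

    toy_hw_reset(&dev);        /* error event */
    if (disable_on_error)
        toy_disable(&dev);     /* what the quoted patch adds */

    toy_enable(&dev);          /* re-initialisation */
    return dev.hw_enabled;
}
```

In this model, toy_recover(false) leaves the device disabled because the
second enable sees a non-zero count and returns early, while
toy_recover(true) re-enables it because the count was dropped back to zero
first.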