[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e6101601-3fb1-7551-3027-1701bda0fa33@redhat.com>
Date: Wed, 25 Mar 2020 01:34:19 +0530
From: Bhupesh Sharma <bhsharma@...hat.com>
To: "Guilherme G. Piccoli" <gpiccoli@...onical.com>,
netanel@...zon.com, akiyano@...zon.com, netdev@...r.kernel.org
Cc: gtzalik@...zon.com, saeedb@...zon.com, zorik@...zon.com,
kernel@...ccoli.net, gshan@...hat.com, gavin.guo@...onical.com,
jay.vosburgh@...onical.com, pedro.principeza@...onical.com,
"kexec@...ts.infradead.org" <kexec@...ts.infradead.org>,
bhsharma@...hat.com
Subject: Re: [PATCH] net: ena: Add PCI shutdown handler to allow safe kexec
Hi Guilherme,
On 03/20/2020 06:25 PM, Guilherme G. Piccoli wrote:
> Currently ENA only provides the PCI remove() handler, used during rmmod
> for example. This is not called on shutdown/kexec path; we are potentially
> creating a failure scenario on kexec:
>
> (a) Kexec is triggered, no shutdown() / remove() handler is called for ENA;
> instead pci_device_shutdown() clears the master bit of the PCI device,
> stopping all DMA transactions;
>
> (b) Kexec reboot happens and the device gets enabled again, likely having
> its FW with that DMA transaction buffered; then it may trigger the (now
> invalid) memory operation in the new kernel, corrupting kernel memory area.
>
> This patch aims to prevent this, by implementing a shutdown() handler
> quite similar to the remove() one - the difference being the handling
> of the netdev, which is unregistered on remove(), but following the
> convention observed in other drivers, it's only detached on shutdown().
>
> This prevents an odd issue in AWS Nitro instances, in which after the 2nd
> kexec the next one will fail with an initrd corruption, caused by a wild
> DMA write to invalid kernel memory. The lspci output for the adapter
> present in my instance is:
>
> 00:05.0 Ethernet controller [0200]: Amazon.com, Inc. Elastic Network
> Adapter (ENA) [1d0f:ec20]
Thanks for the patch.
> Suggested-by: Gavin Shan <gshan@...hat.com>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@...onical.com>
> ---
>
>
> The idea for this patch came from an informal conversation with my
> friend Gavin Shan, based on his past experience with similar issues.
> I'd like to thank him for the great suggestion!
>
> As a test metric, I've performed 1000 kexecs with this patch, whereas
> without this one, the 3rd kexec failed with initrd corruption. Also,
> one test that I've done before writing the patch was just to rmmod
> the driver before the kexecs, and it worked fine too.
>
> I suggest we add this patch in stable releases as well.
> Thanks in advance for reviews,
This patch fixes the repetitive kexec reboot issues that I was facing
for some time on the aws nitro (t3) machines. Normally the kexec reboots
would not runs more than ~ 3 to 5 times on the machine.
Now with this patch, I can runs hundreds of repetitive nested kexec
reboots on the aws nitro machines without any failure.
So, I think this is a really good patch and should be applied to stable
trees as well.
Please feel free to add:
Tested-and-Reviewed-by: Bhupesh Sharma <bhsharma@...hat.com>
Thanks,
Bhupesh
> Guilherme
>
>
> drivers/net/ethernet/amazon/ena/ena_netdev.c | 51 ++++++++++++++++----
> 1 file changed, 41 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> index 0b2fd96b93d7..7a5c01ff2ee8 100644
> --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
> +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> @@ -4325,13 +4325,15 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>
> /*****************************************************************************/
>
> -/* ena_remove - Device Removal Routine
> +/* __ena_shutoff - Helper used in both PCI remove/shutdown routines
> * @pdev: PCI device information struct
> + * @shutdown: Is it a shutdown operation? If false, means it is a removal
> *
> - * ena_remove is called by the PCI subsystem to alert the driver
> - * that it should release a PCI device.
> + * __ena_shutoff is a helper routine that does the real work on shutdown and
> + * removal paths; the difference between those paths is with regards to whether
> + * dettach or unregister the netdevice.
> */
> -static void ena_remove(struct pci_dev *pdev)
> +static void __ena_shutoff(struct pci_dev *pdev, bool shutdown)
> {
> struct ena_adapter *adapter = pci_get_drvdata(pdev);
> struct ena_com_dev *ena_dev;
> @@ -4350,13 +4352,17 @@ static void ena_remove(struct pci_dev *pdev)
>
> cancel_work_sync(&adapter->reset_task);
>
> - rtnl_lock();
> + rtnl_lock(); /* lock released inside the below if-else block */
> ena_destroy_device(adapter, true);
> - rtnl_unlock();
> -
> - unregister_netdev(netdev);
> -
> - free_netdev(netdev);
> + if (shutdown) {
> + netif_device_detach(netdev);
> + dev_close(netdev);
> + rtnl_unlock();
> + } else {
> + rtnl_unlock();
> + unregister_netdev(netdev);
> + free_netdev(netdev);
> + }
>
> ena_com_rss_destroy(ena_dev);
>
> @@ -4371,6 +4377,30 @@ static void ena_remove(struct pci_dev *pdev)
> vfree(ena_dev);
> }
>
> +/* ena_remove - Device Removal Routine
> + * @pdev: PCI device information struct
> + *
> + * ena_remove is called by the PCI subsystem to alert the driver
> + * that it should release a PCI device.
> + */
> +
> +static void ena_remove(struct pci_dev *pdev)
> +{
> + __ena_shutoff(pdev, false);
> +}
> +
> +/* ena_shutdown - Device Shutdown Routine
> + * @pdev: PCI device information struct
> + *
> + * ena_shutdown is called by the PCI subsystem to alert the driver that
> + * a shutdown/reboot (or kexec) is happening and device must be disabled.
> + */
> +
> +static void ena_shutdown(struct pci_dev *pdev)
> +{
> + __ena_shutoff(pdev, true);
> +}
> +
> #ifdef CONFIG_PM
> /* ena_suspend - PM suspend callback
> * @pdev: PCI device information struct
> @@ -4420,6 +4450,7 @@ static struct pci_driver ena_pci_driver = {
> .id_table = ena_pci_tbl,
> .probe = ena_probe,
> .remove = ena_remove,
> + .shutdown = ena_shutdown,
> #ifdef CONFIG_PM
> .suspend = ena_suspend,
> .resume = ena_resume,
>
Powered by blists - more mailing lists