[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <08c51309-0bd1-9696-4f4b-4f7425762268@linux.intel.com>
Date: Fri, 6 Mar 2020 10:32:43 -0800
From: Kuppuswamy Sathyanarayanan
<sathyanarayanan.kuppuswamy@...ux.intel.com>
To: Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>
Cc: Marc Zyngier <maz@...nel.org>, x86@...nel.org,
Bjorn Helgaas <bhelgaas@...gle.com>, linux-pci@...r.kernel.org,
Keith Busch <kbusch@...nel.org>
Subject: Re: [patch 7/7] PCI/AER: Fix the broken interrupt injection
On 3/6/20 5:03 AM, Thomas Gleixner wrote:
> The AER error injection mechanism just blindly abuses generic_handle_irq()
> which is really not meant for consumption by random drivers. The include of
> linux/irq.h should have been a red flag in the first place. Driver code,
> unless implementing interrupt chips or low level hypervisor functionality
> has absolutely no business with that.
>
> Invoking generic_handle_irq() from non interrupt handling context can have
> nasty side effects at least on x86 due to the hardware trainwreck which
> makes interrupt affinity changes a fragile beast. Sathyanarayanan triggered
> a NULL pointer dereference in the low level APIC code that way. While the
> particular pointer could be checked this would only paper over the issue
> because there are other ways to trigger warnings or silently corrupt state.
>
> Invoke the new irq_inject_interrupt() mechanism, which has the necessary
> sanity checks in place and injects the interrupt via the irq_retrigger()
> mechanism, which is at least halfways safe vs. the fragile x86 affinity
> change mechanics.
>
> It's safe on x86 as it does not corrupt state, but it still can cause a
> premature completion of an interrupt affinity change causing the interrupt
> line to become stale. Very unlikely, but possible.
>
> For regular operations this is a non issue as AER error injection is meant
> for debugging and testing and not for usage on production systems. People
> using this should better know what they are doing.
It looks good to me.
Reviewed-by: Kuppuswamy Sathyanarayanan
<sathyanarayanan.kuppuswamy@...ux.intel.com>
Tested-by: Kuppuswamy Sathyanarayanan
<sathyanarayanan.kuppuswamy@...ux.intel.com>
>
> Fixes: 390e2db82480 ("PCI/AER: Abstract AER interrupt handling")
> Reported-by: sathyanarayanan.kuppuswamy@...ux.intel.com
> Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
> ---
> drivers/pci/pcie/Kconfig | 1 +
> drivers/pci/pcie/aer_inject.c | 6 ++----
> 2 files changed, 3 insertions(+), 4 deletions(-)
>
> --- a/drivers/pci/pcie/Kconfig
> +++ b/drivers/pci/pcie/Kconfig
> @@ -34,6 +34,7 @@ config PCIEAER
> config PCIEAER_INJECT
> tristate "PCI Express error injection support"
> depends on PCIEAER
> + select GENERIC_IRQ_INJECTION
> help
> This enables PCI Express Root Port Advanced Error Reporting
> (AER) software error injector.
> --- a/drivers/pci/pcie/aer_inject.c
> +++ b/drivers/pci/pcie/aer_inject.c
> @@ -16,7 +16,7 @@
>
> #include <linux/module.h>
> #include <linux/init.h>
> -#include <linux/irq.h>
> +#include <linux/interrupt.h>
> #include <linux/miscdevice.h>
> #include <linux/pci.h>
> #include <linux/slab.h>
> @@ -468,9 +468,7 @@ static int aer_inject(struct aer_error_i
> }
> pci_info(edev->port, "Injecting errors %08x/%08x into device %s\n",
> einj->cor_status, einj->uncor_status, pci_name(dev));
> - local_irq_disable();
> - generic_handle_irq(edev->irq);
> - local_irq_enable();
> + ret = irq_inject_interrupt(edev->irq);
> } else {
> pci_err(rpdev, "AER device not found\n");
> ret = -ENODEV;
>
--
Sathyanarayanan Kuppuswamy
Linux kernel developer
Powered by blists - more mailing lists