Message-ID: <20101206092749.7f89f3fd@jbarnes-desktop>
Date: Mon, 6 Dec 2010 09:27:49 -0800
From: Jesse Barnes <jbarnes@...tuousgeek.org>
To: Suresh Siddha <suresh.b.siddha@...el.com>
Cc: tglx@...utronix.de, mingo@...hat.com, hpa@...or.com,
linux-kernel@...r.kernel.org,
Kenji Kaneshige <kaneshige.kenji@...fujitsu.com>,
Chris Wright <chrisw@...s-sol.org>,
Max Asbock <masbock@...ux.vnet.ibm.com>,
indou.takao@...fujitsu.com, Bjorn Helgaas <bjorn.helgaas@...com>,
David Woodhouse <dwmw2@...radead.org>, stable@...nel.org
Subject: Re: [patch 1/4] vt-d: quirk for masking vtd spec errors to platform error handling logic

On Tue, 30 Nov 2010 22:22:26 -0800
Suresh Siddha <suresh.b.siddha@...el.com> wrote:
> On platforms with Intel 7500 chipset, there were some reports of system
> hangs/NMIs during kexec/kdump with interrupt-remapping enabled.
>
> During kdump, there is a window where devices might still be using the old
> kernel's interrupt information while the kdump kernel is coming up. This can
> cause vt-d faults, as the interrupt configuration from the old kernel maps to
> null IRTE entries in the new kernel. (Without interrupt-remapping enabled,
> we still have the same issue, but in that case we only see benign spurious
> interrupts hit the new kernel.)
>
> Based on platform config settings, these platforms seem to generate NMI/SMI
> when a vt-d fault happens and there were reports that the resulting SMI causes
> the system to hang.
>
> Fix it by masking vt-d spec defined errors to platform error reporting logic.
> VT-d spec related errors are already handled by the VT-d OS code, so no need to
> report the same erorr through other channels.
>
> Signed-off-by: Suresh Siddha <suresh.b.siddha@...el.com>
> Cc: stable@...nel.org [v2.6.32+]
> ---
> drivers/pci/quirks.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> Index: tip/drivers/pci/quirks.c
> ===================================================================
> --- tip.orig/drivers/pci/quirks.c
> +++ tip/drivers/pci/quirks.c
> @@ -2764,6 +2764,26 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_RI
> DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_RICOH, PCI_DEVICE_ID_RICOH_R5C832, ricoh_mmc_fixup_r5c832);
> #endif /*CONFIG_MMC_RICOH_MMC*/
>
> +#if defined(CONFIG_DMAR) || defined(CONFIG_INTR_REMAP)
> +/*
> + * This is a quirk for masking vt-d spec defined errors to platform error
> + * handling logic. With out this, platforms seem to generate NMI/SMI (based
> + * on the RAS config settings of the platform) when a vt-d fault happens and
> + * there were reports that the resulting SMI causes system to hang.
> + *
> + * VT-d spec related errors are already handled by the VT-d OS code, so no
> + * need to report the same erorr through other channels.
> + */
> +static void vtd_mask_spec_errors(struct pci_dev *dev)
> +{
> +	u32 word;
> +
> +	pci_read_config_dword(dev, 0x1AC, &word);
> +	pci_write_config_dword(dev, 0x1AC, word | (1 << 31));
> +}
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x342e, vtd_mask_spec_errors);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x3c28, vtd_mask_spec_errors);
> +#endif
>
> static void pci_do_fixups(struct pci_dev *dev, struct pci_fixup *f,
> struct pci_fixup *end)
Can we make these registers and bits a bit more self-documenting (i.e.
#defines for both, maybe along with other useful bit definitions for
this reg)? Also, "error" is misspelled as "erorr" above. :)
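
For illustration, something along these lines is what I have in mind.
This is only a rough sketch: the macro and function names are made up
here (the patch gives just offset 0x1AC and bit 31), so the real names
should follow the chipset documentation. The mock config space is an
assumption too, there only so the read-modify-write can be shown outside
the kernel.

```c
#include <stdint.h>

/*
 * Hypothetical names: the patch only gives the raw offset (0x1AC) and
 * bit 31; the real names should come from the chipset documentation.
 */
#define VTD_ERR_MASK_REG	0x1AC		/* config space offset used by the quirk */
#define VTD_SPEC_ERR_MASK	(1u << 31)	/* bit the quirk sets */

/* Stand-in for a device's 4 KiB config space, for illustration only. */
struct mock_cfg_space {
	uint32_t dword[0x1000 / 4];
};

static uint32_t mock_read_dword(const struct mock_cfg_space *cfg,
				unsigned int off)
{
	return cfg->dword[off / 4];
}

static void mock_write_dword(struct mock_cfg_space *cfg, unsigned int off,
			     uint32_t val)
{
	cfg->dword[off / 4] = val;
}

/* Same read-modify-write as the quirk, with the magic numbers named. */
static void mock_vtd_mask_spec_errors(struct mock_cfg_space *cfg)
{
	uint32_t word = mock_read_dword(cfg, VTD_ERR_MASK_REG);

	/* Set only the mask bit; all other bits are preserved. */
	mock_write_dword(cfg, VTD_ERR_MASK_REG, word | VTD_SPEC_ERR_MASK);
}
```

In the quirk itself the two accessors would stay pci_read_config_dword()
and pci_write_config_dword(), just with the named constants in place of
0x1AC and (1 << 31).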
--
Jesse Barnes, Intel Open Source Technology Center