Message-Id: <1412057394-7186-1-git-send-email-zhen-hual@hp.com>
Date:	Tue, 30 Sep 2014 14:09:54 +0800
From:	"Li, Zhen-Hua" <zhen-hual@...com>
To:	<linux-kernel@...r.kernel.org>,
	Bjorn Helgaas <bhelgaas@...gle.com>,
	<linux-pci@...r.kernel.org>
Cc:	Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
	Jesse Brandeburg <jesse.brandeburg@...el.com>,
	Bruce Allan <bruce.w.allan@...el.com>,
	Carolyn Wyborny <carolyn.wyborny@...el.com>,
	Don Skidmore <donald.c.skidmore@...el.com>,
	Greg Rose <gregory.v.rose@...el.com>,
	Alex Duyck <alexander.h.duyck@...el.com>,
	John Ronciak <john.ronciak@...el.com>,
	Mitch Williams <mitch.a.williams@...el.com>,
	Linux NICS <linux.nics@...el.com>,
	<e1000-devel@...ts.sourceforge.net>, <linda.knippers@...com>,
	"Li, Zhen-Hua" <zhen-hual@...com>
Subject: [PATCH 1/1] pci/quirks: fix a dmar fault for intel 82599 card

On an HP system with an Intel Corporation 82599 Ethernet adapter, when the
kernel crashes and the kdump kernel boots with intel_iommu=on, the adapter
may issue unexpected DMA requests, which cause DMA Remapping faults like:
    dmar: DRHD: handling fault status reg 102
    dmar: DMAR:[DMA Read] Request device [41:00.0] fault addr fff81000
    DMAR:[fault reason 01] Present bit in root entry is clear

Analysis for this bug:

The present bit in the root entry is set in this function:

static struct context_entry * device_to_context_entry(
		struct intel_iommu *iommu, u8 bus, u8 devfn)
{
    ......
                set_root_present(root);
    ......
}

Calling tree:
ixgbe_open
    ixgbe_setup_tx_resources
        intel_alloc_coherent
            __intel_map_single
                domain_context_mapping
                    domain_context_mapping_one
                        device_to_context_entry

This means the present bit in the root entry is not set until the device
driver is loaded.
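
To illustrate the point, here is a minimal userspace sketch of that
behaviour. It is not the intel-iommu code: the model_* names and the
256-entry table indexed by bus number are assumptions made for this example
only. It shows that a DMA request from a bus is only accepted after the
mapping path has marked the root entry present.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model; NOT the kernel's struct root_entry. */
struct model_root_entry {
	bool present;
};

/* Hypothetical model of a root table: one entry per PCI bus number. */
struct model_iommu {
	struct model_root_entry root[256];
};

/*
 * Models the side effect of the mapping path (the call chain that ends in
 * device_to_context_entry() and set_root_present()): the root entry for
 * the bus becomes present the first time a driver maps DMA memory.
 */
static void model_map_device(struct model_iommu *iommu, uint8_t bus)
{
	iommu->root[bus].present = true;
}

/*
 * Returns true if a DMA request from this bus would be accepted, false if
 * it would be reported as "Present bit in root entry is clear".
 */
static bool model_dma_request(const struct model_iommu *iommu, uint8_t bus)
{
	return iommu->root[bus].present;
}
```

In this model, a request from bus 0x41 made before model_map_device() is
called for that bus is rejected, which corresponds to the fault shown above.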

But in the kdump kernel, some hardware devices do not know that the OS is
now the second kernel and that their drivers have to be loaded again. Such a
device keeps issuing DMA requests before it has been reinitialized, and
these requests cause the DMA Remapping faults.

To fix this DMAR fault, we need to reset the bus that the device is on.
Resetting the device itself does not work.
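
As a rough sketch of when the quirk in this patch takes effect, the
following userspace model reproduces the match condition of the fixup entry
(vendor, device, and class shifted right by 8) combined with the kdump
check. The model_* names and the struct are hypothetical and exist only for
this illustration; the numeric IDs are the ones used in the patch and in
the standard PCI ID definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_VENDOR_INTEL	0x8086u	/* PCI_VENDOR_ID_INTEL */
#define MODEL_DEVICE_ID		0x10f8u	/* device ID used in this patch */
#define MODEL_CLASS_ETHERNET	0x0200u	/* PCI_CLASS_NETWORK_ETHERNET */
#define MODEL_CLASS_SHIFT	8	/* drops the programming interface byte */

/* Hypothetical stand-in for the few struct pci_dev fields that matter here. */
struct model_pci_dev {
	uint16_t vendor;
	uint16_t device;
	uint32_t class_code;	/* 24 bits: class, subclass, prog-if */
};

/* True if the fixup entry would be applied to this device. */
static bool model_quirk_matches(const struct model_pci_dev *dev)
{
	return dev->vendor == MODEL_VENDOR_INTEL &&
	       dev->device == MODEL_DEVICE_ID &&
	       (dev->class_code >> MODEL_CLASS_SHIFT) == MODEL_CLASS_ETHERNET;
}

/* The bus is only reset when the quirk matches and we run as kdump kernel. */
static bool model_should_reset_bus(const struct model_pci_dev *dev,
				   bool in_kdump_kernel)
{
	return in_kdump_kernel && model_quirk_matches(dev);
}
```

With these assumptions, a normal boot never resets the bus, and a kdump boot
resets it only for the matching adapter.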

There was also an earlier discussion:
https://lkml.org/lkml/2013/5/14/9

Signed-off-by: Li, Zhen-Hua <zhen-hual@...com>
---
 drivers/pci/quirks.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 80c2d01..5198af3 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -25,6 +25,7 @@
 #include <linux/sched.h>
 #include <linux/ktime.h>
 #include <asm/dma.h>	/* isa_dma_bridge_buggy */
+#include <linux/crash_dump.h>
 #include "pci.h"
 
 /*
@@ -3832,3 +3833,13 @@ void pci_dev_specific_enable_acs(struct pci_dev *dev)
 		}
 	}
 }
+
+#ifdef CONFIG_CRASH_DUMP
+static void quirk_reset_buggy_devices(struct pci_dev *dev)
+{
+	if (unlikely(is_kdump_kernel()))
+		pci_try_reset_bus(dev->bus);
+}
+DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_VENDOR_ID_INTEL, 0x10f8,
+		PCI_CLASS_NETWORK_ETHERNET, 8, quirk_reset_buggy_devices);
+#endif
-- 
2.0.0-rc0

