Message-ID: <51007EF2.3090208@jp.fujitsu.com>
Date: Thu, 24 Jan 2013 09:23:14 +0900
From: Takao Indoh <indou.takao@...fujitsu.com>
To: trenn@...e.de
CC: yinghai@...nel.org, muneda.takahiro@...fujitsu.com,
linux-pci@...r.kernel.org, x86@...nel.org,
linux-kernel@...r.kernel.org, andi@...stfloor.org,
tokunaga.keiich@...fujitsu.com, kexec@...ts.infradead.org,
hbabu@...ibm.com, mingo@...hat.com, ddutile@...hat.com,
vgoyal@...hat.com, ishii.hironobu@...fujitsu.com, hpa@...or.com,
bhelgaas@...gle.com, tglx@...utronix.de, khalid@...ehiking.org
Subject: Re: [PATCH v7 0/5] Reset PCIe devices to address DMA problem on kdump with iommu
(2013/01/23 9:47), Thomas Renninger wrote:
> On Monday, January 21, 2013 10:11:04 AM Takao Indoh wrote:
>> (2013/01/08 4:09), Thomas Renninger wrote:
> ...
>>> I tried the provided patches first on 2.6.32, then I verified with 3.8-rc2
>>> and in both cases the disk is not detected anymore in the
>>> reset_devices (kexec'ed/kdump) case (but things work fine without these
>>> patches).
>>
>> So the problem that the disk is not detected was caused by exactmap
>> problem you guys are discussing? Or still not detected even if exactmap
>> problem is fixed?
> This problem is related to the 5 PCI resetting patches.
> Dumping worked with a 2.6.32 and a 3.8-rc2 kernel, adding the PCI resetting
> patches broke both. I first tried 2.6.32 and verified with 3.8-rc2 to make sure
> I didn't mess up the backport adjustments of the patches to 2.6.32.
If you have a chance, please try the patches again with the latest
firmware. I hit another problem with a megaraid_sas disk when I tested
the patches, and it no longer occurred after I updated the firmware to
the latest version.
> Unfortunately this Dell platform takes really long to boot.
> I can give it one test or another, but please do not bomb me with patches.
>
> For info:
> About the interrupt remapping error interrupt storm in kdump case I tried to
> reproduce on this machine, but never could: The guys who saw that also cannot
> reproduce this anymore.
>
> Two ideas I had about this:
> - As said already, (also) try to catch the error case and try to reset
> the device when the AER/specific interrupt remapping error interrupt is caught.
> - Have a look at coreboot, these guys should know how to initialize the PCI
> subsystem from scratch and might have some well tested PCI resetting
> code in place already (no idea, just a thought).
OK, first I'll take a look at the AER code to check how it resets devices
on a PCIe error.
Thanks,
Takao Indoh
>
> Thomas
>
>