Date:	Mon, 04 Mar 2013 17:00:12 -0500
From:	Don Dutile <ddutile@...hat.com>
To:	Takao Indoh <indou.takao@...fujitsu.com>
CC:	trenn@...e.de, yinghai@...nel.org, muneda.takahiro@...fujitsu.com,
	linux-pci@...r.kernel.org, x86@...nel.org,
	linux-kernel@...r.kernel.org, andi@...stfloor.org,
	tokunaga.keiich@...fujitsu.com, kexec@...ts.infradead.org,
	hbabu@...ibm.com, mingo@...hat.com, vgoyal@...hat.com,
	ishii.hironobu@...fujitsu.com, hpa@...or.com, bhelgaas@...gle.com,
	tglx@...utronix.de, khalid@...ehiking.org
Subject: Re: [PATCH v7 0/5] Reset PCIe devices to address DMA problem on kdump
 with iommu

On 03/03/2013 07:56 PM, Takao Indoh wrote:
> (2013/01/23 9:47), Thomas Renninger wrote:
>> On Monday, January 21, 2013 10:11:04 AM Takao Indoh wrote:
>>> (2013/01/08 4:09), Thomas Renninger wrote:
>> ...
>>>> I tried the provided patches first on 2.6.32, then I verified with 3.8-rc2
>>>> and in both cases the disk is not detected anymore in
>>>> reset_devices (kexec'ed/kdump) case (but things work fine without these
>>>> patches).
>>>
>>> So the problem that the disk is not detected was caused by exactmap
>>> problem you guys are discussing? Or still not detected even if exactmap
>>> problem is fixed?
>> This problem is related to the 5 PCI resetting patches.
>> Dumping worked with a 2.6.32 and a 3.8-rc2 kernel, adding the PCI resetting
>> patches broke both. I first tried 2.6.32 and verified with 3.8-rc2 to make sure
>> I didn't mess up the backport adjustments of the patches to 2.6.32.
>>
>> Unfortunately this Dell platform takes really long to boot.
>> I can give it one test or another, but please do not bomb me with patches.
>>
>> For info:
>> I tried to reproduce the interrupt remapping error interrupt storm in the
>> kdump case on this machine, but never could. The guys who saw it also cannot
>> reproduce it anymore.
>>
>> Two ideas I had about this:
>>     - As said already, (also) try to catch the error case and reset the
>>       device when an AER or specific interrupt remapping error interrupt is caught.
>
> I tried this idea but it did not work on megaraid_sas.
>
> I made an experimental patch so that a device is reset when a DMAR error is
> detected on it. What happened is:
> 1) megaraid_sas module is loaded.
> 2) DMAR error is detected during the driver initialization.
This driver does something bad that the IOMMU code isn't designed for and
doesn't handle correctly: it starts with one dma-mask, does an IOMMU mapping,
changes its dma-mask, which moves it into another domain that's not
valid for the first mask, and then does occasional accesses with the original mask.
I have it on my to-do list to dig into the driver more to see if that
sequence can be changed/fixed.
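
The sequence described above can be pictured with a small toy model. This is only a sketch of one reading of the description, not kernel or driver code: the classes `ToyIommu` and `ToyDevice`, the addresses, and the assumption that mappings are not carried over when the mask changes are all hypothetical and chosen for illustration.

```python
# Toy model only: none of these names exist in the kernel or in megaraid_sas.
DMA_BIT_MASK_32 = (1 << 32) - 1
DMA_BIT_MASK_64 = (1 << 64) - 1


class ToyIommu:
    """Keeps one translation table (a "domain") per DMA mask."""

    def __init__(self):
        self.domains = {}  # mask -> set of mapped bus addresses

    def map(self, mask, addr):
        if addr & ~mask:
            raise ValueError("address does not fit the DMA mask")
        self.domains.setdefault(mask, set()).add(addr)

    def translate(self, mask, addr):
        """Return True if the access is translated, False if it would fault."""
        return addr in self.domains.get(mask, set())


class ToyDevice:
    def __init__(self, iommu, mask):
        self.iommu = iommu
        self.mask = mask  # identifies the domain the device is attached to

    def set_dma_mask(self, mask):
        # Assumption: changing the mask moves the device to another domain
        # and the mappings made under the old mask are not carried over.
        self.mask = mask

    def dma_map(self, addr):
        self.iommu.map(self.mask, addr)
        return addr

    def dma_access(self, addr):
        return self.iommu.translate(self.mask, addr)


def replay_sequence():
    iommu = ToyIommu()
    dev = ToyDevice(iommu, DMA_BIT_MASK_32)
    early = dev.dma_map(0x1000)            # mapping made under the first mask
    ok_before = dev.dma_access(early)      # translated while still in that domain
    dev.set_dma_mask(DMA_BIT_MASK_64)      # mask change moves the device
    late = dev.dma_map(0x1_0000_2000)      # mapping made under the new mask
    ok_late = dev.dma_access(late)         # translated in the new domain
    ok_stale = dev.dma_access(early)       # old mapping is unknown here: fault
    return ok_before, ok_late, ok_stale
```

In this model the last access is the one that would show up as a DMAR error: the address was mapped before the mask change, but the device is now looked up in a domain that never saw that mapping.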

> 3) Reset device
> 4) kdump fails because the disk is not found.
>
> When I tested patches which reset all devices at early boot time, the
> disk was recognized correctly, so it seems that resetting a device while its
> driver is loading does something wrong. I think we need to reset the device at
driver reset, or master-enable turned off?

> least before its driver is loaded.
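
For reference, "master-enable" refers to the Bus Master Enable bit in the PCI Command register; with it cleared, a device is not allowed to initiate DMA. The following is a minimal sketch of the bit manipulation only, operating on a plain integer rather than real config space. The register offset and bit value are the standard PCI ones; the helper names are hypothetical.

```python
PCI_COMMAND = 0x04         # offset of the Command register in PCI config space
PCI_COMMAND_MASTER = 0x4   # Bus Master Enable bit within that 16-bit register


def bus_master_enabled(command):
    """Return True if the Bus Master Enable bit is set in a Command value."""
    return bool(command & PCI_COMMAND_MASTER)


def clear_bus_master(command):
    """Return the Command value with Bus Master Enable cleared (16-bit)."""
    return command & ~PCI_COMMAND_MASTER & 0xFFFF
```

For example, `clear_bus_master(0x0007)` leaves the I/O and memory decode bits set and yields `0x0003`. Whether clearing this bit would be enough in the kdump case is exactly the open question in the thread; the sketch does not answer it.
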
>
> Thanks,
> Takao Indoh
>
>
>>     - Have a look at coreboot, these guys should know how to initialize the PCI
>>       subsystem from scratch and might have some well tested PCI resetting
>>       code in place already (no idea, just a thought).
>>
>>       Thomas
>>
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

