linux-kernel - Re: [PATCH v2] PCI: Reset PCIe devices to stop ongoing DMA

Message-ID: <51F5B545.5050300@jp.fujitsu.com>
Date:	Mon, 29 Jul 2013 09:20:21 +0900
From:	Takao Indoh <indou.takao@...fujitsu.com>
To:	vgoyal@...hat.com
CC:	bhelgaas@...gle.com, linux-kernel@...r.kernel.org,
	linux-pci@...r.kernel.org, iommu@...ts.linux-foundation.org,
	kexec@...ts.infradead.org, ishii.hironobu@...fujitsu.com,
	ddutile@...hat.com, bill.sumner@...com, alex.williamson@...hat.com,
	hbabu@...ibm.com
Subject: Re: [PATCH v2] PCI: Reset PCIe devices to stop ongoing DMA

(2013/07/25 23:24), Vivek Goyal wrote:
> On Wed, Jul 24, 2013 at 03:29:58PM +0900, Takao Indoh wrote:
>> Sorry for letting this discussion slide, I was busy on other works:-(
>> Anyway, the summary of previous discussion is:
>> - My patch adds new initcall(fs_initcall) to reset all PCIe endpoints on
>>    boot. This expects PCI enumeration is done before IOMMU
>>    initialization as follows.
>>      (1) PCI enumeration
>>      (2) fs_initcall ---> device reset
>>      (3) IOMMU initialization
>> - This works on x86, but does not work on other architecture because
>>    IOMMU is initialized before PCI enumeration on some architectures. So,
>>    device reset should be done where IOMMU is initialized instead of
>>    initcall.
>> - Or, as another idea, we can reset devices in first kernel(panic kernel)
>>
>> Resetting devices in panic kernel is against kdump policy and seems not to
>> be good idea. So I think adding reset code into iommu initialization is
>> better. I'll post patches for that.
> 
> I don't understand all the details but I agree that idea of trying to
> reset IOMMU in crashed kernel might not fly.
> 
>>
>> Another discussion point is how to handle buggy devices. Resetting buggy
>> devices makes system more unstable. One of ideas is using boot parameter
>> so that user can choose to reset devices or not.
> 
> So who would decide which device is buggy and don't reset it. Give
> some details here.

I found a case where kdump does not work after the devices are reset
but works once the reset patch is removed. The cause of the problem is
a bug in a PCIe switch chip. If there were a boot parameter to disable
the device reset, the user could use it as a workaround.
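
As a rough userspace illustration of such an opt-out (not code from the
patch), the sketch below scans a kernel command line string for a
parameter and defaults to resetting when it is absent. The parameter
name pcie_boot_reset= and the helper want_pcie_reset() are hypothetical;
the thread does not name either.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical parameter name; the thread does not specify one. */
#define RESET_PARAM "pcie_boot_reset="

/*
 * Return true if PCIe endpoints should be reset on boot.
 * The default is to reset; only "pcie_boot_reset=off" disables it.
 */
static bool want_pcie_reset(const char *cmdline)
{
    const char *p = cmdline;
    size_t len = strlen(RESET_PARAM);

    while ((p = strstr(p, RESET_PARAM)) != NULL) {
        /* Accept a match only at the start of a word. */
        if (p == cmdline || p[-1] == ' ') {
            const char *val = p + len;
            size_t n = strcspn(val, " ");

            return !(n == 3 && strncmp(val, "off", 3) == 0);
        }
        p += len;
    }
    return true;
}
```

With this sketch, want_pcie_reset("root=/dev/sda1 pcie_boot_reset=off")
is false, while a command line without the parameter keeps the reset
enabled.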

I think in this case we should add a PCI quirk to skip this buggy
hardware, but we would need to wait for an erratum from the vendor,
and that usually takes a long time.
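
For illustration only, a quirk of this kind amounts to a table of
vendor/device IDs that the reset code consults before touching a
device. The userspace sketch below shows the lookup; the struct, the
function name, and the table entry are all hypothetical (0xffff is not
a valid PCI vendor ID and is used here purely as a placeholder). In the
kernel, this would more likely be wired up through the PCI fixup
mechanism (the DECLARE_PCI_FIXUP_* macros) than through a standalone
table.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical quirk entry: a device that must not be reset. */
struct no_reset_quirk {
    uint16_t vendor;
    uint16_t device;
};

/* Placeholder entry, not real hardware. */
static const struct no_reset_quirk no_reset_quirks[] = {
    { 0xffff, 0x0001 },
};

/* Return true if the device is listed and its reset should be skipped. */
static bool device_skips_reset(uint16_t vendor, uint16_t device)
{
    size_t count = sizeof(no_reset_quirks) / sizeof(no_reset_quirks[0]);

    for (size_t i = 0; i < count; i++) {
        if (no_reset_quirks[i].vendor == vendor &&
            no_reset_quirks[i].device == device)
            return true;
    }
    return false;
}
```

The drawback described above still applies: an entry can only be added
once the affected IDs are known, which in practice means waiting for
the vendor.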

> 
> Can't we simply blacklist associated module, so that it never loads
> and then it never tries to reset the devices?
> 

So you mean that the device reset should be done when its driver is loaded?

Thanks,
Takao Indoh
