linux-kernel - Re: [patch] Make MMCONFIG space (extended PCI config space) a driver opt-in issue


Message-ID: <476F5E22.3060007@garzik.org>
Date:	Mon, 24 Dec 2007 02:22:10 -0500
From:	Jeff Garzik <jeff@...zik.org>
To:	Ivan Kokshaysky <ink@...assic.park.msu.ru>
CC:	Loic Prylli <loic@...i.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Arjan van de Ven <arjan@...radead.org>,
	linux-kernel@...r.kernel.org, gregkh@...e.de,
	linux-pci@...ey.karlin.mff.cuni.cz,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>
Subject: Re: [patch] Make MMCONFIG space (extended PCI config space) a driver opt-in issue

Ivan Kokshaysky wrote:
> On Sun, Dec 23, 2007 at 12:44:30AM -0500, Jeff Garzik wrote:
>> Failures are more predictable and more consistent with an all-or-none 
>> scenario.  The all-or-none solutions are the least complex on the software 
>> side, and far more widely tested than any mixed config access scheme.
> 
> Nope. The vast majority of mmconfig problems that happen at boot time
> actually have nothing to do with "broken" or "working" mmconfig.
> Typically these are
> - mmconf area overlapped during BAR sizing;
> - BIOS (or kernel) placed some MMIO in the middle of mmconfig area,
>   which looks like some random device "doesn't like mmconfig".
> 
> This is a result of all-or-none approach, as mmconfig is fundamentally
> unsafe until after PCI init is done.

The phrase "all or none" specifically describes the current practice in 
arch/x86/pci/mmconfig_{32,64}.c whereby a PCI bus always has one, and 
only one, access method.

So the problems you describe are unrelated to "all or none" as I was 
describing it.

> The mixed access that Loic proposes allows to avoid these boot problems
> just for free.

False.  If you have overlapping areas, then clearly it is 
board-dependent (undefined) what happens -- you might trigger an 
mmconfig access by writel() to your PCI device's MMIO BAR space.
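
A minimal sketch of the overlap condition in question, as plain
userspace C rather than kernel code. The struct and function names are
hypothetical, and the ranges are assumed to be half-open [start, end):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical half-open physical address range [start, end). */
struct addr_range {
    uint64_t start;
    uint64_t end;
};

/*
 * True when the two ranges share at least one address, e.g. when a
 * device's MMIO BAR lands inside the mmconfig window.
 */
bool ranges_overlap(struct addr_range a, struct addr_range b)
{
    assert(a.start <= a.end && b.start <= b.end);
    return a.start < b.end && b.start < a.end;
}
```

If such a check reports an overlap, the sketch says nothing about what
the hardware then does with an access to the shared addresses.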

Consider, too, what the current code in arch/x86/pci/mmconfig_{32,64}.c 
does:  it uses the range the BIOS told it to use for mmconfig, no more, 
no less.

Clearly many early mmconfig BIOSes have buggy resource allocation... 
Thus if mmconfig is not known working on a system, turn it off 100% for 
all buses.  End of story.

> Systems that have both non-mmconf PCI(X) and mmconf PCI-E
> domains could be handled almost for free as well.

The existing code in arch/x86/pci/mmconfig_{32,64}.c today handles 
mixing of PCI-E and PCI-X just fine.  As noted above, it's done on a 
per-bus and per-segment basis today.

> And probably most important thing is that the x86 pci_conf implementation
> would be so much simpler and cleaner that I honestly don't understand
> why are we still discussing it ;-)

The proposed API adds code, so I don't see how it simplifies things.

The current approach is quite clean anyway:

1) "raw_pci_ops" points to a single set of PCI config accessors.
2) For mmconfig, if the BIOS did not tell us to use mmconfig with a 
given PCI bus, fall back to type1 accesses.
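
The two points above can be illustrated with a rough userspace sketch. 
This is not the code in mmconfig_{32,64}.c; the table layout and all 
names are hypothetical, standing in for whatever ranges the BIOS 
reported. The only hardware facts assumed are that type1 accesses reach 
the first 256 bytes of config space and mmconfig reaches 4096:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Each bus gets exactly one access method. */
enum cfg_method { CFG_TYPE1, CFG_MMCONFIG };

/* Hypothetical entry: buses the BIOS said are covered by mmconfig. */
struct mmcfg_region {
    uint16_t segment;
    uint8_t start_bus;
    uint8_t end_bus;   /* inclusive */
};

/* Use mmconfig only where the BIOS listed the bus; else fall back. */
enum cfg_method pick_method(const struct mmcfg_region *tbl, size_t n,
                            uint16_t segment, uint8_t bus)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].segment == segment &&
            bus >= tbl[i].start_bus && bus <= tbl[i].end_bus)
            return CFG_MMCONFIG;
    }
    return CFG_TYPE1;
}

/* Type1 reaches 256 bytes of config space; mmconfig reaches 4096. */
bool reg_reachable(enum cfg_method method, unsigned int reg)
{
    return reg < (method == CFG_MMCONFIG ? 4096u : 256u);
}
```

With this shape, the choice is made once per bus, and every access to 
that bus goes through the same method.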

> At the same time, making drivers to request extended config space
> still makes sense. I think of pci_request_ext_conf(dev, drv_name) which
> doesn't set any per-device flag, but
> - returns success or failure depending on mmconf availability;
> - can be logged by kernel to make it easier to debug if something
>   goes wrong.

I agree with the function as noted in response to Linus's "Sound ok with 
this plan?" email.  But note -- users and developers also need to access 
extended config space, even if the driver did not request it.  Even if 
there is no driver at all.

Otherwise the value of "lspci -vvvxxx" debugging reports from users is 
diminished.
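
For reference, the request function discussed above might look roughly 
like the following userspace mock. Nothing here is an existing kernel 
API: the struct, the function name and the log format are all 
hypothetical, and the availability flag stands in for whatever the 
platform code knows about mmconfig:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a PCI device; not struct pci_dev. */
struct sketch_pci_dev {
    const char *name;
    bool mmconf_available;   /* assumed to be filled in by platform code */
};

/*
 * Report whether extended config space is usable and log who asked.
 * Sets no per-device flag: it only answers and records the request.
 */
int sketch_request_ext_conf(const struct sketch_pci_dev *dev,
                            const char *drv_name)
{
    assert(dev != NULL && drv_name != NULL);
    if (!dev->mmconf_available) {
        fprintf(stderr, "%s: %s: extended config space unavailable\n",
                dev->name, drv_name);
        return -ENODEV;
    }
    fprintf(stderr, "%s: %s: extended config space requested\n",
            dev->name, drv_name);
    return 0;
}
```

Since the mock sets no per-device state, it places no restriction on 
other users of extended config space, such as lspci.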

	Jeff
