lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Wed, 12 Nov 2008 14:11:21 -0800
From:	Gary Hade <garyhade@...ibm.com>
To:	"GARCIA DE SORIA LUCENA, JUAN JESUS" <juanj.g_soria@...pobbva.com>
Cc:	Gary Hade <garyhade@...ibm.com>, Jiri Slaby <jirislaby@...il.com>,
	linux-kernel@...r.kernel.org
Subject: Re: Regression: Boot hang sizing transparent PCI-to-PCI
	bridgesince after 2.6.25-r7.

On Wed, Nov 12, 2008 at 09:50:14AM +0100, GARCIA DE SORIA LUCENA, JUAN JESUS wrote:
> > -----Original Message-----
> > From: Gary Hade [mailto:garyhade@...ibm.com] 
> > 
> > Correction.  We were not trying to address boot-time hangs 
> > but I believe we may have been trying to address expansion 
> > ROM related PCI resource allocation failures that we were 
> > seeing during boot with certain PCI cards.  However, I doubt 
> > that this is relevent to your problem.  The transparent 
> > bridge sizing removal change is probably a red herring.
> 
> After reading a little bit on how PCI and PCI-to-PCI bridges work, and
> how they're handled in linux (http://tldp.org/LDP/tlk/dd/pci.html), I
> now know that ranges in the bridge (either I/O, mmio or prefetchable
> mem) are simply disabled when start < end, and that's the original
> configuration that the BIOS enforces when the bridge is not sized by
> Linux.
> 
> After inserting kprintf()'s, I see that the hang happens actually while
> the positive decoding of the I/O range in the bridge is being activated
> in pci_setup_bridge(), sometime in between the writes to the I/O
> base/limit registers of the bridge; I don't remember exactly which was
> the last pci_Write_config_dword() that allowed the next kprintf() to
> succeed. I'll look at it tonight again, but I suspect that the final
> enabling write (the one that updates the PCI_IO_BASE_UPPER16 register
> with its final value) was the one hanging the machine.
> 
> The I/O range being activated was the one in the range 0x1000-0x1fff,
> apparently correctly sized to accomodate the two I/O ranges
> (0x1000-0x10ff, 0x1400-0x14ff) assigned to the CardBus bridge on the
> secondary bus.
> 
> One theory is that my system has something actually mapped to that I/O
> range in the root PCI bus. When only subtractive decoding is in place
> (the I/O range isn't activated), access to the secondary bus behind the
> PCI-to-PCI bridge is done when the transaction isn't claimed by any
> device in the root bus, after what the PCI docs describe as a 4-cycle
> timeout. When the I/O range is activated, that range is positively
> decoded by the bridge, which tries to claim the transaction before the
> timeout. Perhaps two devices (the bridge and the unknown device on the
> root bus) conflict when claiming the same transaction?
> 
> Another possibility could be that activating the I/O range disables the
> negative decoding in the secondary-to-primary sense of the bridge for
> that I/O range. Perhaps some device behind the bridge depends on being
> able to forward transactions to the primary bus on that I/O range, but
> it's disallowed after the range is configured. For me this seems rather
> unlikely, because of the nature of the devices behind the bridge.
> 
> I'll look at it more closely again, and I will test whether commenting
> out the I/O range sizing (leaving the other ranges to be sized) is
> enough to allow the system to run. If so, is there any way to use a
> system-specific quirk in order to remove the PCI-to-PCI bridge I/O range
> from being sized/activated?

Yes, there are already examples of this sort of thing in the PCI core.
See drivers/pci/quirks.c.  Of course, you will likely have to present
a very good argument as to why such a change is really necessary.

BTW, I just noticed that you have not been copying linux-pci@...r.kernel.org
or the PCI maintainer Jesse Barnes <jbarnes@...tuousgeek.org> where you
will probably get more seasoned advise.

Gary

-- 
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503  IBM T/L: 775-4503
garyhade@...ibm.com
http://www.ibm.com/linux/ltc

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ