linux-kernel - RE: Regression: Boot hang sizing transparent PCI-to-PCI bridgesince after 2.6.25-r7.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <D913A682D10DBA4097B5D068D281B09B0266FB4D@S1128EXE.bbva.igrupobbva>
Date:	Wed, 12 Nov 2008 09:50:14 +0100
From:	"GARCIA DE SORIA LUCENA, JUAN JESUS" <juanj.g_soria@...pobbva.com>
To:	"Gary Hade" <garyhade@...ibm.com>
Cc:	"Jiri Slaby" <jirislaby@...il.com>, <linux-kernel@...r.kernel.org>
Subject: RE: Regression: Boot hang sizing transparent PCI-to-PCI bridgesince after 2.6.25-r7.

> -----Original Message-----
> From: Gary Hade [mailto:garyhade@...ibm.com] 
> 
> Correction.  We were not trying to address boot-time hangs 
> but I believe we may have been trying to address expansion 
> ROM related PCI resource allocation failures that we were 
> seeing during boot with certain PCI cards.  However, I doubt 
> that this is relevent to your problem.  The transparent 
> bridge sizing removal change is probably a red herring.

After reading a little bit on how PCI and PCI-to-PCI bridges work, and
how they're handled in linux (http://tldp.org/LDP/tlk/dd/pci.html), I
now know that ranges in the bridge (either I/O, mmio or prefetchable
mem) are simply disabled when start < end, and that's the original
configuration that the BIOS enforces when the bridge is not sized by
Linux.

After inserting kprintf()'s, I see that the hang happens actually while
the positive decoding of the I/O range in the bridge is being activated
in pci_setup_bridge(), sometime in between the writes to the I/O
base/limit registers of the bridge; I don't remember exactly which was
the last pci_Write_config_dword() that allowed the next kprintf() to
succeed. I'll look at it tonight again, but I suspect that the final
enabling write (the one that updates the PCI_IO_BASE_UPPER16 register
with its final value) was the one hanging the machine.

The I/O range being activated was the one in the range 0x1000-0x1fff,
apparently correctly sized to accomodate the two I/O ranges
(0x1000-0x10ff, 0x1400-0x14ff) assigned to the CardBus bridge on the
secondary bus.

One theory is that my system has something actually mapped to that I/O
range in the root PCI bus. When only subtractive decoding is in place
(the I/O range isn't activated), access to the secondary bus behind the
PCI-to-PCI bridge is done when the transaction isn't claimed by any
device in the root bus, after what the PCI docs describe as a 4-cycle
timeout. When the I/O range is activated, that range is positively
decoded by the bridge, which tries to claim the transaction before the
timeout. Perhaps two devices (the bridge and the unknown device on the
root bus) conflict when claiming the same transaction?

Another possibility could be that activating the I/O range disables the
negative decoding in the secondary-to-primary sense of the bridge for
that I/O range. Perhaps some device behind the bridge depends on being
able to forward transactions to the primary bus on that I/O range, but
it's disallowed after the range is configured. For me this seems rather
unlikely, because of the nature of the devices behind the bridge.

I'll look at it more closely again, and I will test whether commenting
out the I/O range sizing (leaving the other ranges to be sized) is
enough to allow the system to run. If so, is there any way to use a
system-specific quirk in order to remove the PCI-to-PCI bridge I/O range
from being sized/activated?

Best regards,

   Juan Jesus.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/