[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAErSpo72_JnUkWXrzbNdbX9wiWZtQdX2tuNjEpYrSd+sGzxfzA@mail.gmail.com>
Date: Tue, 28 Feb 2012 16:31:04 -0700
From: Bjorn Helgaas <bhelgaas@...gle.com>
To: Yinghai Lu <yinghai@...nel.org>
Cc: Jesse Barnes <jbarnes@...tuousgeek.org>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Tony Luck <tony.luck@...el.com>,
Dominik Brodowski <linux@...inikbrodowski.net>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-arch@...r.kernel.org, Paul Mackerras <paulus@...ba.org>,
linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH 08/18] PCI, powerpc: Register busn_res for root buses
On Mon, Feb 27, 2012 at 10:36 PM, Bjorn Helgaas <bhelgaas@...gle.com> wrote:
> On Mon, Feb 27, 2012 at 7:09 PM, Yinghai Lu <yinghai@...nel.org> wrote:
>> Signed-off-by: Yinghai Lu <yinghai@...nel.org>
>> Cc: Benjamin Herrenschmidt <benh@...nel.crashing.org>
>> Cc: Paul Mackerras <paulus@...ba.org>
>> Cc: linuxppc-dev@...ts.ozlabs.org
>> ---
>> arch/powerpc/kernel/pci-common.c | 7 ++++++-
>> 1 files changed, 6 insertions(+), 1 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
>> index 910b9de..ae5ae5f 100644
>> --- a/arch/powerpc/kernel/pci-common.c
>> +++ b/arch/powerpc/kernel/pci-common.c
>> @@ -1660,6 +1660,8 @@ void __devinit pcibios_scan_phb(struct pci_controller *hose)
>> bus->secondary = hose->first_busno;
>> hose->bus = bus;
>>
>> + pci_bus_insert_busn_res(bus, hose->first_busno, hose->last_busno);
>> +
>> /* Get probe mode and perform scan */
>> mode = PCI_PROBE_NORMAL;
>> if (node && ppc_md.pci_probe_mode)
>> @@ -1670,8 +1672,11 @@ void __devinit pcibios_scan_phb(struct pci_controller *hose)
>> of_scan_bus(node, bus);
>> }
>>
>> - if (mode == PCI_PROBE_NORMAL)
>> + if (mode == PCI_PROBE_NORMAL) {
>> + pci_bus_update_busn_res_end(bus, 255);
>> hose->last_busno = bus->subordinate = pci_scan_child_bus(bus);
>> + pci_bus_update_busn_res_end(bus, bus->subordinate);
>> + }
>
> There's a lot of powerpc code that does this:
>
> bus_range = of_get_property(pcictrl, "bus-range", &len);
> hose->first_busno = bus_range[0];
> hose->last_busno = bus_range[1];
>
> That *looks* like it is discovering the bus number aperture. Is it?
> If it is, why are we using the largest bus number found by
> pci_scan_child_bus() rather than "last_busno"?
Sorry, I missed the earlier hunk of the patch where you *do* use last_busno:
>> + pci_bus_insert_busn_res(bus, hose->first_busno, hose->last_busno);
I still think this part is wrong:
+ if (mode == PCI_PROBE_NORMAL) {
+ pci_bus_update_busn_res_end(bus, 255);
hose->last_busno = bus->subordinate = pci_scan_child_bus(bus);
+ pci_bus_update_busn_res_end(bus, bus->subordinate);
I think there are two problems:
1) We can enumerate devices under the wrong PHB. Assume this:
PCI host bridge A to [bus 00]
pci 0000:00:01.0: PCI bridge
PCI host bridge B to [bus 01]
pci 0000:01:01.0: PCI endpoint
The P2P bridge at 00:01.0 has no devices below it, but of course we
can't tell that until we look behind it. To do that, we'll have to
assign a bus number, and since we forced the bus number aperture to
[bus 00-ff] instead of the correct [bus 00], we'll probably allocate
bus number 01 as the secondary bus. Then we'll generate a config
cycle for 01:01.0, which discovers a device. But we can't tell that
the cycle was actually claimed by host bridge B, not A. So now we
wrongly think that 01:01.0 is under A, so we can't handle its
resources correctly.
I think we should have failed when allocating a secondary bus number
for 00:01.0 and just skipped looking behind it.
2) We preclude hot-add in some cases. For example, if we scan this topology:
PCI host bridge C to [bus 00-7f]
pci 0000:00:01.0: PCI bridge to [bus 01]
pci 0000:01:01.0: PCI endpoint
we set the root bus's subordinate bus number to 01 (the highest bus
number we discovered), so we now think host bridge C leads only to
[bus 00-01]. Now let's remove 01:01.0 and plug in a card with a
bridge on it, e.g.,
pci 0000:01:01.0: PCI bridge to ...
pci 0000:xx:01.0: PCI endpoint
We can't allocate a bus number for 01:01.0's secondary bus because we
think we're out of space. But we're really not; the true bus number
aperture for C is [bus 00-7f], not [bus 00-01].
We may need mechanism to say "don't trust this info from the
firmware," but we should be able to figure out a way that doesn't
penalize platforms that do everything correctly. The current patch
breaks these scenarios even when the platform firmware is 100%
correct.
Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists