Message-ID: <4F95B17B.3030401@redhat.com>
Date:	Mon, 23 Apr 2012 15:46:03 -0400
From:	Don Dutile <ddutile@...hat.com>
To:	Richard Yang <weiyang@...ux.vnet.ibm.com>
CC:	linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: One problem in reassign pci bus number?

On 04/22/2012 11:52 AM, Richard Yang wrote:
> All,
>
> I am reading pci_scan_bridge() and am not sure what will happen in the
> following situation.
>
> Suppose the kernel is not passed pci=assign-busses.
>
> Below is a picture about the pci system.
>
>                     +-------+
>                     |       | root bridge(0,255)
>                     +---+---+
>                         |          Bus 0
>        -----+-----------+------------------------------+--
>             |                                          |
>             |                                          |
>             |                                          |
>        +----+----+                               +-----+-----+
>        |         |  B1(1,15)                     |           |B2(16,28)
>        +----+----+                               +-----+-----+
>             |  Bus 1                                   |    Bus 16
>        -----+-----------------------         ----------+----------------
>                              |
>                         +----+----+
>                         |         | B3
>                         +---------+
>
> Suppose B1 and B2 work fine with the BIOS, and get the right bus
> numbers and ranges.
>
> B3 does not work fine with the BIOS, and doesn't get a bus number.
>
> So in pci_scan_bridge(), B3 will be met in the second pass and get bus
> number 16?

Unfortunately, today, the answer is yes.
I ran into a similar problem recently when trying to use pci=assign-busses
with an SR-IOV device behind a non-ARI-capable PCIe switch.
In that scenario, the assign-busses code assigned the next bus number,
which conflicted with an existing one on the system, and hung the
system -- two bridges responding to the same PCI bus number evidently
confuses the hw! ;-)

The PCI code is supposed to do two bus scans -- pass=0 to see what the BIOS
has set up, and then pass=1 to assign the devices the BIOS did not set up.
But what I'm finding is that when pci=assign-busses is set, the
pass=0 scan does not do a full PCI tree scan and register all
the BIOS-setup busses first; instead it tries to do extended bus assignment in pass=0,
not pass=1. In the above configuration, it expands B1's bus number range from (1,15)
to (1,16), then tries to scan behind it. That creates an overlap between
B1's & B2's sec/sub bus-number ranges, and they both respond to a Type 1 config cycle
with a bus number of 16 (typically when trying to read the VID register of 16:0.0
in this case).... boom! ... or more like silence due to a system hang...
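
To make the overlap concrete, here is a small Python sketch of which
bridges would claim a config cycle for a given bus number. It is a toy
model, not kernel code: the Bridge tuple and the claimants() helper are
names made up for this illustration, and the only rule modelled is that a
bridge claims a bus number lying in its [secondary, subordinate] range.

```python
from collections import namedtuple

# Toy model of a PCI-to-PCI bridge: just its secondary/subordinate bus numbers.
Bridge = namedtuple("Bridge", "name secondary subordinate")

def claimants(bridges, bus):
    """Return the names of bridges whose [secondary, subordinate] range holds bus."""
    return [b.name for b in bridges if b.secondary <= bus <= b.subordinate]

# Ranges as the BIOS set them up in the picture above.
bios = [Bridge("B1", 1, 15), Bridge("B2", 16, 28)]
# After B1's range has been extended by one bus, as described above.
extended = [Bridge("B1", 1, 16), Bridge("B2", 16, 28)]

print(claimants(bios, 16))      # only B2 claims bus 16
print(claimants(extended, 16))  # both B1 and B2 claim bus 16
```

With the BIOS ranges exactly one bridge answers for bus 16; after the
extension two do, which is the conflict that wedges the system.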

*If* the system spaces bus ranges apart, e.g., in your config above,
if the BIOS set up B1(1,15) and B2(24,32), then pci=assign-busses will
work because bus number 16 is free, so no two bridges will both think they
should respond to a Type 1 PCI config cycle (with bus-number=16 lying in their
sec/sub bus-number range), and all will (luckily) work.
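
As a rough illustration of "spaced apart" versus "packed" ranges, the
following Python sketch lists the bus numbers left unclaimed between
sibling bridges. The free_gaps() helper and the 1..255 limits are
assumptions made for this example; nothing like it exists in the kernel
under that name.

```python
def free_gaps(ranges, lo=1, hi=255):
    """Return (first, last) spans of bus numbers not covered by any range.

    ranges is a list of (secondary, subordinate) pairs for sibling bridges.
    """
    gaps = []
    cursor = lo
    for sec, sub in sorted(ranges):
        if sec > cursor:
            gaps.append((cursor, sec - 1))   # hole before this bridge
        cursor = max(cursor, sub + 1)        # skip past this bridge's range
    if cursor <= hi:
        gaps.append((cursor, hi))            # whatever is left at the top
    return gaps

packed = [(1, 15), (16, 28)]   # B1, B2 as in the original picture
spaced = [(1, 15), (24, 32)]   # B1, B2 with room left between them

print(free_gaps(packed))
print(free_gaps(spaced))
```

In the packed layout the first free bus number is 29, so growing B1 to
include bus 16 must collide with B2; in the spaced layout busses 16-23
are free and the extension happens to be harmless.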

Unfortunately, I'm in & out of work due to at-home time requirements,
so I haven't had a chance to work out a proper patch.
What should happen in the above case is that the kernel prints a warning saying
it couldn't do the needed assign-busses operations due to configuration constraints,
continues with PCI (pass=1) bridge scanning, and does not wedge the system
as it does now.
The base problem is that
(a) pass=0 is doing bus assignment, which shouldn't be done
     until pass=1, after all BIOS-setup busses are known
(b) the code doesn't have a nice warning and continuation when this
     conflict occurs.
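
The behaviour asked for in (a) and (b) can be sketched in Python roughly
as follows. This is only a toy model under stated assumptions, not a
patch: scan_two_pass(), its arguments, and the warning text are all
hypothetical, and bridges are treated as flat siblings that each want to
grow their subordinate bus number by one.

```python
def scan_two_pass(bios_ranges, needs_extra_bus):
    """Toy two-pass scan; returns (final_ranges, warnings).

    bios_ranges: dict of bridge name -> (secondary, subordinate) from the BIOS.
    needs_extra_bus: names of bridges that want one more subordinate bus.
    """
    # pass=0: only record what the BIOS set up; assign nothing yet.
    known = dict(bios_ranges)
    warnings = []
    # pass=1: try each extension, now that every BIOS range is known.
    for name in needs_extra_bus:
        sec, sub = known[name]
        wanted = sub + 1
        clash = sorted(n for n, (s, e) in known.items()
                       if n != name and s <= wanted <= e)
        if wanted > 255 or clash:
            # Warn and keep scanning instead of creating an overlap.
            warnings.append(f"{name}: cannot extend to bus {wanted}"
                            f" (conflicts with {clash}); continuing scan")
            continue
        known[name] = (sec, wanted)
    return known, warnings

packed = {"B1": (1, 15), "B2": (16, 28)}
spaced = {"B1": (1, 15), "B2": (24, 32)}
print(scan_two_pass(packed, ["B1"]))
print(scan_two_pass(spaced, ["B1"]))
```

With the packed ranges the extension is refused with a warning and the
ranges stay untouched; with the spaced ranges B1 becomes (1,16) and no
warning is produced.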

> Would this be a conflict?
>
Summary: yes.

