Message-ID: <49152C35.3040501@uniscape.net>
Date: Sat, 08 Nov 2008 14:05:41 +0800
From: Yu Zhao <yu.zhao@...scape.net>
To: Greg KH <greg@...ah.com>
CC: "Zhao, Yu" <yu.zhao@...el.com>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"achiang@...com" <achiang@...com>,
"grundler@...isc-linux.org" <grundler@...isc-linux.org>,
"mingo@...e.hu" <mingo@...e.hu>,
"jbarnes@...tuousgeek.org" <jbarnes@...tuousgeek.org>,
"matthew@....cx" <matthew@....cx>,
"randy.dunlap@...cle.com" <randy.dunlap@...cle.com>,
"rdreier@...co.com" <rdreier@...co.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"virtualization@...ts.linux-foundation.org"
<virtualization@...ts.linux-foundation.org>
Subject: Re: [PATCH 16/16 v6] PCI: document the new PCI boot parameters
Greg KH wrote:
> On Sat, Nov 08, 2008 at 01:00:29PM +0800, Yu Zhao wrote:
>> Greg KH wrote:
>>> On Fri, Nov 07, 2008 at 04:35:47PM +0800, Zhao, Yu wrote:
>>>> Greg KH wrote:
>>>>> On Fri, Nov 07, 2008 at 04:17:02PM +0800, Zhao, Yu wrote:
>>>>>>> Well, to do it "correctly" you are going to have to tell the driver to
>>>>>>> shut itself down, and reinitialize itself.
>>>>>>> Turns out, that doesn't really work for disk and network devices
>>>>>>> without dropping the connection (well, network devices should be
>>>>>>> fine probably).
>>>>>>> So you just can't do this, sorry. That's why the BIOS handles all of
>>>>>>> these issues in a PCI hotplug system.
>>>>>>> How do the hardware people think we are going to handle this in the
>>>>>>> OS? It's not something that any operating system can do, is it part
>>>>>>> of the IOV PCI spec somewhere?
>>>>>> No, it's not part of the PCI IOV spec.
>>>>>>
>>>>>> I just want IOV (and the whole PCI subsystem) to have more flexibility
>>>>>> across various BIOSes. So can we reconsider resource rebalancing as a
>>>>>> boot option, or should we forget about this idea?
>>>>> As you have proposed it, the boot option will not work at all, so I
>>>>> think we need to forget about it. Especially if it is not really
>>>>> needed.
>>>> I guess at least one thing would work if people don't want to boot twice:
>>>> give bus number 0 as the rebalance starting point, then all system
>>>> resources would be reshuffled :-)
>>> Hm, but don't we do that today with our basic resource reservation logic
>>> at boot time? What would be different about this kind of proposal?
>> The generic PCI core can do this, but the feature is effectively disabled
>> by the low-level PCI code on x86. The low-level code tries to reserve
>> resources according to the configuration from the BIOS. If the BIOS is
>> wrong, the allocation fails and the generic PCI core can't repair it,
>> because the bridge resources may already have been allocated by the
>> low-level code and the PCI core can't expand them to find enough
>> resources for the subordinates.
>
> Yes, we do this on purpose.
>
>> The proposal is to stop the x86 low-level PCI code from allocating
>> resources according to the BIOS, so the PCI core can fully control
>> resource allocation. The PCI core takes all resources from the BARs it
>> knows about into account and configures the resource windows on the
>> bridges according to its own calculation.
>
> Ah, so you mean we should revert back to the way we used to do x86 PCI
> resource allocation from about a year and a half ago to about 8 years
> ago?
>
> Hint, there was a reason why we switched over to using the BIOS instead
> of doing it ourselves. Turns out we have to trust the BIOS here, as
> that is exactly what other operating systems do. Trying to do it on our
> own was too fragile and resulted in too many problems over time.
>
> Go look at the archives for when this all was switched, you'll see the
> reasons why.
>
> So no, we will not be going back to the way we used to do things, we
> changed for a reason :)

So it's really a long story, and I'm glad to see the reason.

Actually there was no such thing in the early SR-IOV patches, but months ago
I heard some complaints that pushed me to attempt this kind of reversal.
Looks like I have to redirect these complaints to the BIOS people from now
on :-)

Regards,
Yu
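
The window calculation described in the quoted proposal (the PCI core sizing
each bridge window from the BARs behind it) can be illustrated with a rough
sketch. This is not the kernel's code and not anything from the SR-IOV
patches; it is a simplified model that only assumes BARs are power-of-two
sized and naturally aligned, and that PCI-to-PCI bridge memory windows have
1 MiB granularity. The function names, the device list, and the BIOS window
value are all hypothetical.

```python
MIB = 1 << 20  # PCI-to-PCI bridge memory windows are 1 MiB granular


def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def bridge_window_size(bar_sizes):
    """Return the memory window size a bridge needs for the given BARs.

    Each BAR is naturally aligned to its own size, so placing the largest
    BARs first avoids padding between them. (Simplified illustration only.)
    """
    offset = 0
    for size in sorted(bar_sizes, reverse=True):
        if size <= 0 or size & (size - 1):
            raise ValueError("BAR sizes must be positive powers of two")
        offset = align_up(offset, size) + size
    # The bridge can only decode whole 1 MiB units.
    return align_up(offset, MIB)


# Hypothetical devices behind one bridge: 4 KiB, 16 MiB and 64 KiB BARs.
needed = bridge_window_size([4 << 10, 16 * MIB, 64 << 10])
bios_window = 16 * MIB  # hypothetical window programmed by the BIOS
print(f"needed {needed // MIB} MiB, BIOS gave {bios_window // MIB} MiB")
```

In this made-up case the BIOS window is 1 MiB short, which is the kind of
situation the thread is about: the low-level code keeps the BIOS window, so
the core cannot grow it to fit everything behind the bridge.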