Message-ID: <545AA576-42A5-47A7-A08A-062582B1569A@cisco.com>
Date: Thu, 15 Jul 2021 18:12:25 +0000
From: "Billie Alsup (balsup)" <balsup@...co.com>
To: Bjorn Helgaas <helgaas@...nel.org>
CC: Paul Menzel <pmenzel@...gen.mpg.de>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Guohan Lu <lguohan@...il.com>,
"Madhava Reddy Siddareddygari (msiddare)" <msiddare@...co.com>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Sergey Miroshnichenko <s.miroshnichenko@...ro.com>
Subject: Re: [RFC][PATCH] PCI: Reserve address space for powered-off devices
behind PCIe bridges
It took me a while to figure out that the "New Outlook" option doesn't actually allow sending plain text, so I have to switch to "Old Outlook" mode.
It is not clear what criteria Linux uses to consider a window broken. But if the kernel preserves an existing bridge window assignment, then it seems feasible for our BIOS to run this same algorithm (reading the PLX persistent scratch registers to determine window sizes). I will raise this possibility with our own kernel team to discuss with the BIOS team. We can also look more closely at the resource_alignment option to see if that might suffice. Thanks for the information!
From: Bjorn Helgaas <helgaas@...nel.org>
Date: Thursday, July 15, 2021 at 10:14 AM
To: "Billie Alsup (balsup)" <balsup@...co.com>
Cc: Paul Menzel <pmenzel@...gen.mpg.de>, Bjorn Helgaas <bhelgaas@...gle.com>, Guohan Lu <lguohan@...il.com>, "Madhava Reddy Siddareddygari (msiddare)" <msiddare@...co.com>, "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>, "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "David S. Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, "netdev@...r.kernel.org" <netdev@...r.kernel.org>, Sergey Miroshnichenko <s.miroshnichenko@...ro.com>
Subject: Re: [RFC][PATCH] PCI: Reserve address space for powered-off devices behind PCIe bridges
On Thu, Jul 15, 2021 at 04:52:26PM +0000, Billie Alsup (balsup) wrote:
We are aware of how Cisco device specific this code is, and hadn't
intended to upstream it. This code was originally written for an
older kernel version (4.8.28-WR9.0.0.26_cgl). I am not the original
author; I just ported it into various SONiC Linux kernels. We use
ACPI with SONiC (although not on our non-SONiC products), so I
thought I might be able to define such windows within the ACPI tree
and have some generic code to read such configuration information
from the ACPI tables. However, initial attempts failed, so I went
with the existing approach. I believe we did look at the hpmmiosize
parameter, but iirc it applied to each bridge, rather than being a
pool of address space to dynamically parcel out as necessary.
Right. I mentioned "pci=resource_alignment=" because it claims to be
able to specify window sizes for specific bridges. But I haven't
exercised that myself.
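For illustration only, here is a minimal Python sketch of how such a parameter string could be assembled. It assumes the documented `[<order of align>@]<pci_dev>` entry form, where the order is the base-2 logarithm of the alignment in bytes. The bridge address `0000:03:00.0` is hypothetical, and whether this option actually sizes the windows as needed here is untested, as noted above.

```python
MIB = 1024 * 1024


def resource_alignment_param(entries):
    """Build a pci=resource_alignment= string.

    entries: list of (alignment_bytes, pci_dev) pairs, where pci_dev is a
    string such as "0000:03:00.0" (hypothetical address, not from the thread).
    """
    parts = []
    for alignment, dev in entries:
        # The kernel expects an order (log2), so only powers of two make sense.
        if alignment <= 0 or alignment & (alignment - 1):
            raise ValueError(f"alignment must be a power of two: {alignment}")
        order = alignment.bit_length() - 1
        parts.append(f"{order}@{dev}")
    # Multiple devices are separated by semicolons.
    return "pci=resource_alignment=" + ";".join(parts)


# 512M is 2**29 bytes, so the order is 29.
example = resource_alignment_param([(512 * MIB, "0000:03:00.0")])
print(example)
```

The resulting string would go on the kernel command line. The device address has to be replaced with the real bridge address from `lspci` on the target system.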
There are multiple bridges involved in the hardware (there are 8
hot-plug fabric cards, each with multiple PCI devices). Devices on
the card are in multiple power zones, so not all devices are
immediately visible to the PCI scanning code. The top level bridge
reserves close to 5G. The 2nd level (towards the fabric cards)
reserves 4.5G. The 3rd level has 9 bridges each reserving 512M. The
4th level reserves 384M (with a 512M alignment restriction iirc).
The 5th level reserves 384M (again with an alignment restriction).
That defines the bridge hierarchy visible at boot. Everything behind
that 5th level is hot-plugged: two more bridge levels and 5 devices
(1 requiring 2x64M blocks and 4 requiring 1x64M each).
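As a rough check of the figures above, the following Python sketch redoes the arithmetic. The sizes come from the description. Treating the 512M at the 3rd level as the 384M requirement rounded up to the 512M alignment is an interpretation, not something stated in the thread.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def align_up(size, alignment):
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment


# Hot-plugged devices: one needs 2 x 64M blocks, four need 1 x 64M each.
blocks_per_device = [2, 1, 1, 1, 1]
hotplug_need = sum(blocks_per_device) * 64 * MIB

# Assumption: a 384M window with a 512M alignment restriction occupies
# a 512M slot one level up.
level3_window = align_up(hotplug_need, 512 * MIB)

# 3rd level: 9 bridges, each reserving 512M.
level2_total = 9 * level3_window

print(hotplug_need // MIB, "MiB needed behind each 5th-level bridge")
print(level3_window // MIB, "MiB per 3rd-level bridge")
print(level2_total / GIB, "GiB at the 2nd level")
```

The 384M and 4.5G results match the 4th/5th-level and 2nd-level reservations quoted above. The top-level figure of close to 5G is not derived here.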
I'm not sure if the Cisco kernel team has revisited the hpmmiosize
and resource_alignment parameters since this initial implementation.
Reading the description of Sergey's patches, he seems to be
approaching the same problem from a different direction. It is
unclear if such an approach is practical for our environment. It
would require updates to all of our SONiC drivers to support
stopping/remapping/restarting, and it is unclear if that is
acceptable. It is certainly less preferable than pre-reserving the
required space. For our embedded product, we know exactly what
devices will be plugged in, and allowing that to be pre-programmed
into the PLX EEPROM gives us the flexibility we need.
If you know up front what devices are possible and how much space they
need, possibly your firmware could assign the bridge windows you need.
Linux generally does not change window assignments unless they are
broken.
Bjorn