[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241024214753.GA1005373@bhelgaas>
Date: Thu, 24 Oct 2024 16:47:53 -0500
From: Bjorn Helgaas <helgaas@...nel.org>
To: Mario Limonciello <mario.limonciello@....com>
Cc: Yazen Ghannam <yazen.ghannam@....com>, linux-edac@...r.kernel.org,
linux-kernel@...r.kernel.org, tony.luck@...el.com, x86@...nel.org,
avadhut.naik@....com, john.allen@....com, bhelgaas@...gle.com,
Shyam-sundar.S-k@....com, richard.gong@....com, jdelvare@...e.com,
linux@...ck-us.net, clemens@...isch.de, hdegoede@...hat.com,
ilpo.jarvinen@...ux.intel.com, linux-pci@...r.kernel.org,
linux-hwmon@...r.kernel.org, platform-driver-x86@...r.kernel.org,
naveenkrishna.chatradhi@....com, carlos.bilbao.osdev@...il.com
Subject: Re: [PATCH 00/16] AMD NB and SMN rework
On Thu, Oct 24, 2024 at 04:20:35PM -0500, Mario Limonciello wrote:
> On 10/24/2024 16:06, Bjorn Helgaas wrote:
> > On Thu, Oct 24, 2024 at 03:08:41PM -0500, Mario Limonciello wrote:
> > > On 10/24/2024 12:46, Bjorn Helgaas wrote:
> > > > On Thu, Oct 24, 2024 at 12:01:59PM -0400, Yazen Ghannam wrote:
> > > > > On Wed, Oct 23, 2024 at 12:59:28PM -0500, Bjorn Helgaas wrote:
> > > > > > On Wed, Oct 23, 2024 at 05:21:34PM +0000, Yazen Ghannam wrote:
> > > ...
> >
> > > > > > The use of pci_get_slot() and pci_get_domain_bus_and_slot() is not
> > > > > > ideal since all those pci_get_*() interfaces are kind of ugly in my
> > > > > > opinion, and using them means we have to encode topology details in
> > > > > > the kernel. But this still seems like a big improvement.
> > > > >
> > > > > Thanks for the feedback. Hopefully, we'll come to some improved
> > > > > solution. :)
> > > > >
> > > > > Can you please elaborate on your concern? Is it about saying "thing X is
> > > > > always at SBDF A:B:C.D" or something else?
> > > >
> > > > "Thing X is always at SBDF A:B:C.D" is one big reason. "A:B:C.D" says
> > > > nothing about the actual functionality of the device. A PCI
> > > > Vendor/Device ID or a PNP ID identifies the device programming model
> > > > independent of its geographical location. Inferring the functionality
> > > > and programming model from the location is a maintenance issue because
> > > > hardware may change the address.
> > > >
> > > > PCI bus numbers are under software control, so in general it's not
> > > > safe to rely on them, although in this case these devices are probably
> > > > on root buses where the bus number is either fixed or determined by
> > > > BIOS configuration of the host bridge.
> > > >
> > > > I don't like the pci_get_*() functions because they break the driver
> > > > model. The usual .probe() model binds a device to a driver, which
> > > > essentially means the driver owns the device and its resources, and
> > > > the driver and doesn't have to worry about other code interfering.
> > >
> > > Are you suggesting that perhaps we should be introducing amd_smn (patch 10)
> > > as a PCI driver that binds "to the root device" instead?
> >
> > I don't know any of the specifics, so I can't really opine on that.
> >
> > The PCI specs envision that a Vendor/Device ID defines the programming
> > model of the device, and you would only use a new Device ID when that
> > programming model changes.
> >
> > Of course, vendors like to define a new set of Device IDs for every
> > new chipset even when no driver changes are required, so even if a new
> > SMN works exactly the same as in previous chipsets, you're probably
> > back to having to add a new Device ID for every new chipset.
>
> Yeah; this I believe is why we're here today and trying to find something
> more manageable (IE this series).
Another alternative would be an ACPI device where you can use the same
_HID (or at least a _CID) for all the chipsets.
> > The Subsystem Vendor ID and Subsystem ID exist to solve a similar
> > problem (sort of in reverse). If AMD could allocate a Subsystem ID
> > for this SMN programming model and use that same ID in every chipset,
> > you could make a pci_driver.id_table entry that would match them all,
> > e.g.,
> >
> > .vendor = PCI_VENDOR_ID_AMD,
> > .device = PCI_ANY_ID,
> > .subvendor = PCI_VENDOR_ID_AMD,
> > .subdevice = PCI_SUBSYSTEM_AMD_SMN,
> >
> > (pci_device_id.subdevice is misnamed; the spec calls it "Subsystem ID")
>
> Isn't the subsystem ID based typically upon the platform it's
> running on? For example I seem to recall on Dell systems it's used
> the value that was in the SBMIOS ProductSKU field here (IoW not
> something AMD would control).
Right, it is typically based on the platform; that's why I said "in
reverse." I think all these devices are integrated into the chipset,
so I'm speculating that platform vendors would have no need (maybe
even no way) to use the Subsystem ID. But maybe that's not the case.
> I mean I guess maybe we could do a:
>
> .vendor = PCI_VENDOR_ID_AMD,
> .device = PCI_ANY_ID,
> .class = PCI_CLASS_BRIDGE_HOST << 8
>
> And then in probe() figure out if it's the right one, but that's still
> pretty ugly, eh?
I think there are some drivers that do this, and it's not completely
terrible. The probe() can just return failure if it doesn't want the
device.
Bjorn
Powered by blists - more mailing lists