[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0018d26a-ffd5-42fc-8cc4-9e689834a808@amd.com>
Date: Thu, 24 Oct 2024 16:20:35 -0500
From: Mario Limonciello <mario.limonciello@....com>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: Yazen Ghannam <yazen.ghannam@....com>, linux-edac@...r.kernel.org,
linux-kernel@...r.kernel.org, tony.luck@...el.com, x86@...nel.org,
avadhut.naik@....com, john.allen@....com, bhelgaas@...gle.com,
Shyam-sundar.S-k@....com, richard.gong@....com, jdelvare@...e.com,
linux@...ck-us.net, clemens@...isch.de, hdegoede@...hat.com,
ilpo.jarvinen@...ux.intel.com, linux-pci@...r.kernel.org,
linux-hwmon@...r.kernel.org, platform-driver-x86@...r.kernel.org,
naveenkrishna.chatradhi@....com, carlos.bilbao.osdev@...il.com
Subject: Re: [PATCH 00/16] AMD NB and SMN rework
On 10/24/2024 16:06, Bjorn Helgaas wrote:
> On Thu, Oct 24, 2024 at 03:08:41PM -0500, Mario Limonciello wrote:
>> On 10/24/2024 12:46, Bjorn Helgaas wrote:
>>> On Thu, Oct 24, 2024 at 12:01:59PM -0400, Yazen Ghannam wrote:
>>>> On Wed, Oct 23, 2024 at 12:59:28PM -0500, Bjorn Helgaas wrote:
>>>>> On Wed, Oct 23, 2024 at 05:21:34PM +0000, Yazen Ghannam wrote:
>> ...
>
>>>>> The use of pci_get_slot() and pci_get_domain_bus_and_slot() is not
>>>>> ideal since all those pci_get_*() interfaces are kind of ugly in my
>>>>> opinion, and using them means we have to encode topology details in
>>>>> the kernel. But this still seems like a big improvement.
>>>>
>>>> Thanks for the feedback. Hopefully, we'll come to some improved
>>>> solution. :)
>>>>
>>>> Can you please elaborate on your concern? Is it about saying "thing X is
>>>> always at SBDF A:B:C.D" or something else?
>>>
>>> "Thing X is always at SBDF A:B:C.D" is one big reason. "A:B:C.D" says
>>> nothing about the actual functionality of the device. A PCI
>>> Vendor/Device ID or a PNP ID identifies the device programming model
>>> independent of its geographical location. Inferring the functionality
>>> and programming model from the location is a maintenance issue because
>>> hardware may change the address.
>>>
>>> PCI bus numbers are under software control, so in general it's not
>>> safe to rely on them, although in this case these devices are probably
>>> on root buses where the bus number is either fixed or determined by
>>> BIOS configuration of the host bridge.
>>>
>>> I don't like the pci_get_*() functions because they break the driver
>>> model. The usual .probe() model binds a device to a driver, which
>>> essentially means the driver owns the device and its resources, and
>>> the driver and doesn't have to worry about other code interfering.
>>
>> Are you suggesting that perhaps we should be introducing amd_smn (patch 10)
>> as a PCI driver that binds "to the root device" instead?
>
> I don't know any of the specifics, so I can't really opine on that.
>
> The PCI specs envision that a Vendor/Device ID defines the programming
> model of the device, and you would only use a new Device ID when that
> programming model changes.
>
> Of course, vendors like to define a new set of Device IDs for every
> new chipset even when no driver changes are required, so even if a new
> SMN works exactly the same as in previous chipsets, you're probably
> back to having to add a new Device ID for every new chipset.
Yeah; this I believe is why we're here today and trying to find
something more manageable (IE this series).
>
> The Subsystem Vendor ID and Subsystem ID exist to solve a similar
> problem (sort of in reverse). If AMD could allocate a Subsystem ID
> for this SMN programming model and use that same ID in every chipset,
> you could make a pci_driver.id_table entry that would match them all,
> e.g.,
>
> .vendor = PCI_VENDOR_ID_AMD,
> .device = PCI_ANY_ID,
> .subvendor = PCI_VENDOR_ID_AMD,
> .subdevice = PCI_SUBSYSTEM_AMD_SMN,
>
> (pci_device_id.subdevice is misnamed; the spec calls it "Subsystem ID")
Isn't the subsystem ID based typically upon the platform it's running
on? For example I seem to recall on Dell systems it's used the value
that was in the SBMIOS ProductSKU field here (IoW not something AMD
would control).
I mean I guess maybe we could do a:
.vendor = PCI_VENDOR_ID_AMD,
.device = PCI_ANY_ID,
.class = PCI_CLASS_BRIDGE_HOST << 8
And then in probe() figure out if it's the right one, but that's still
pretty ugly, eh?
>
>> There are some areas that do discovery (for example amd_node_get_root() in
>> patch 6/16).
>
> Sort of. amd_node_get_root() and amd_node_get_func() both just grub
> through all the devices that the PCI core has enumerated and return
> the one that has the right geographical address.
>
> There's no binding to a driver, so another driver could come along and
> bind to the same device, and then you have a potential conflict.
>
> You also give up all the standard driver model infrastructure for
> hotplug, power management, etc. Granted, you probably don't care
> about those things here.
Right; I agree none of that matters here.
Powered by blists - more mailing lists