Message-ID: <20241024210652.GA1003184@bhelgaas>
Date: Thu, 24 Oct 2024 16:06:52 -0500
From: Bjorn Helgaas <helgaas@...nel.org>
To: Mario Limonciello <mario.limonciello@....com>
Cc: Yazen Ghannam <yazen.ghannam@....com>, linux-edac@...r.kernel.org,
linux-kernel@...r.kernel.org, tony.luck@...el.com, x86@...nel.org,
avadhut.naik@....com, john.allen@....com, bhelgaas@...gle.com,
Shyam-sundar.S-k@....com, richard.gong@....com, jdelvare@...e.com,
linux@...ck-us.net, clemens@...isch.de, hdegoede@...hat.com,
ilpo.jarvinen@...ux.intel.com, linux-pci@...r.kernel.org,
linux-hwmon@...r.kernel.org, platform-driver-x86@...r.kernel.org,
naveenkrishna.chatradhi@....com, carlos.bilbao.osdev@...il.com
Subject: Re: [PATCH 00/16] AMD NB and SMN rework

On Thu, Oct 24, 2024 at 03:08:41PM -0500, Mario Limonciello wrote:
> On 10/24/2024 12:46, Bjorn Helgaas wrote:
> > On Thu, Oct 24, 2024 at 12:01:59PM -0400, Yazen Ghannam wrote:
> > > On Wed, Oct 23, 2024 at 12:59:28PM -0500, Bjorn Helgaas wrote:
> > > > On Wed, Oct 23, 2024 at 05:21:34PM +0000, Yazen Ghannam wrote:
> ...
> > > > The use of pci_get_slot() and pci_get_domain_bus_and_slot() is not
> > > > ideal since all those pci_get_*() interfaces are kind of ugly in my
> > > > opinion, and using them means we have to encode topology details in
> > > > the kernel. But this still seems like a big improvement.
> > >
> > > Thanks for the feedback. Hopefully, we'll come to some improved
> > > solution. :)
> > >
> > > Can you please elaborate on your concern? Is it about saying "thing X is
> > > always at SBDF A:B:C.D" or something else?
> >
> > "Thing X is always at SBDF A:B:C.D" is one big reason. "A:B:C.D" says
> > nothing about the actual functionality of the device. A PCI
> > Vendor/Device ID or a PNP ID identifies the device programming model
> > independent of its geographical location. Inferring the functionality
> > and programming model from the location is a maintenance issue because
> > hardware may change the address.
> >
> > PCI bus numbers are under software control, so in general it's not
> > safe to rely on them, although in this case these devices are probably
> > on root buses where the bus number is either fixed or determined by
> > BIOS configuration of the host bridge.
> >
> > I don't like the pci_get_*() functions because they break the driver
> > model. The usual .probe() model binds a device to a driver, which
> > essentially means the driver owns the device and its resources, and
> > the driver doesn't have to worry about other code interfering.
>
> Are you suggesting that perhaps we should be introducing amd_smn (patch 10)
> as a PCI driver that binds "to the root device" instead?

I don't know any of the specifics, so I can't really opine on that.
The PCI specs envision that a Vendor/Device ID defines the programming
model of the device, and you would only use a new Device ID when that
programming model changes.

Of course, vendors like to define a new set of Device IDs for every
new chipset even when no driver changes are required, so even if a new
SMN works exactly the same as in previous chipsets, you're probably
back to having to add a new Device ID for every new chipset.

The Subsystem Vendor ID and Subsystem ID exist to solve a similar
problem (sort of in reverse). If AMD could allocate a Subsystem ID
for this SMN programming model and use that same ID in every chipset,
you could make a pci_driver.id_table entry that would match them all,
e.g.,

  .vendor = PCI_VENDOR_ID_AMD,
  .device = PCI_ANY_ID,
  .subvendor = PCI_VENDOR_ID_AMD,
  .subdevice = PCI_SUBSYSTEM_AMD_SMN,

(pci_device_id.subdevice is misnamed; the spec calls it "Subsystem ID")
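
To make the matching rule concrete, here is a minimal user-space sketch
of how a wildcard Device ID combined with a fixed Subsystem ID would
select devices. It only imitates the field-by-field comparison the PCI
core does for id_table entries; it is not kernel code. The struct names,
match_id(), and the numeric value given to PCI_SUBSYSTEM_AMD_SMN are
assumptions made up for illustration, since no such Subsystem ID has
been allocated.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_ANY_ID            (~0u)
#define PCI_VENDOR_ID_AMD     0x1022u
/* Hypothetical: no such Subsystem ID exists; the value is invented. */
#define PCI_SUBSYSTEM_AMD_SMN 0x5a5au

/* Illustrative stand-in for the ID fields of an id_table entry. */
struct id_entry {
	uint32_t vendor, device, subvendor, subdevice;
};

/* Illustrative stand-in for the IDs read from a device's config space. */
struct dev_ids {
	uint16_t vendor, device, subvendor, subdevice;
};

/* A table field matches if it is a wildcard or equals the device's value. */
static bool field_matches(uint32_t want, uint16_t have)
{
	return want == PCI_ANY_ID || want == have;
}

/* Return the first table entry whose four ID fields all match, or NULL. */
static const struct id_entry *match_id(const struct id_entry *table, size_t n,
				       const struct dev_ids *dev)
{
	for (size_t i = 0; i < n; i++) {
		if (field_matches(table[i].vendor, dev->vendor) &&
		    field_matches(table[i].device, dev->device) &&
		    field_matches(table[i].subvendor, dev->subvendor) &&
		    field_matches(table[i].subdevice, dev->subdevice))
			return &table[i];
	}
	return NULL;
}

/* One entry covers every chipset that carries the shared Subsystem ID. */
static const struct id_entry smn_ids[] = {
	{
		.vendor = PCI_VENDOR_ID_AMD,
		.device = PCI_ANY_ID,
		.subvendor = PCI_VENDOR_ID_AMD,
		.subdevice = PCI_SUBSYSTEM_AMD_SMN,
	},
};
```

With this table, two devices that differ only in Device ID both match
the single entry, while a device with a different Subsystem ID does not,
so a new chipset would need no new table entry as long as it kept the
same Subsystem ID.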

> There are some areas that do discovery (for example amd_node_get_root() in
> patch 6/16).

Sort of. amd_node_get_root() and amd_node_get_func() both just grub
through all the devices that the PCI core has enumerated and return
the one that has the right geographical address.

There's no binding to a driver, so another driver could come along and
bind to the same device, and then you have a potential conflict.

You also give up all the standard driver model infrastructure for
hotplug, power management, etc. Granted, you probably don't care
about those things here.

Bjorn