[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2b9f7f7b-d6a4-be59-14d4-7b4ffccfe373@linux.intel.com>
Date: Wed, 1 Oct 2025 13:55:51 +0300 (EEST)
From: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To: dan.j.williams@...el.com
cc: Xu Yilun <yilun.xu@...ux.intel.com>, linux-coco@...ts.linux.dev,
linux-pci@...r.kernel.org, yilun.xu@...el.com, baolu.lu@...ux.intel.com,
zhenzhong.duan@...el.com, aneesh.kumar@...nel.org, bhelgaas@...gle.com,
aik@....com, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/3] PCI/IDE: Add Address Association Register setup for
RP
On Tue, 30 Sep 2025, dan.j.williams@...el.com wrote:
> Ilpo Järvinen wrote:
> > On Sun, 28 Sep 2025, Xu Yilun wrote:
> >
> > > Add Address Association Register setup for Root Ports.
> > >
> > > The address ranges for RP side Address Association Registers should
> > > cover memory addresses for all PFs/VFs/downstream devices of the DSM
> > > device. A simple solution is to get the aggregated 32-bit and 64-bit
> > > address ranges from directly connected downstream port (either an RP or
> > > a switch port) and set into 2 Address Association Register blocks.
> > >
> > > There is a case the platform doesn't require Address Association
> > > Registers setup and provides no register block for RP (AMD). Will skip
> > > the setup in pci_ide_stream_setup().
> > >
> > > Also imaging another case where there is only one block for RP.
> > > Prioritize 64-bit address ranges setup for it. No strong reason for the
> > > preference until a real use case comes.
> > >
> > > The Address Association Register setup for Endpoint Side is still
> > > uncertain so isn't supported in this patch.
> > >
> > > Take the oppotunity to export some mini helpers for Address Association
> > > Registers setup. TDX Connect needs the provided aggregated address
> > > ranges but will use specific firmware calls for actual setup instead of
> > > pci_ide_stream_setup().
> > >
> > > Co-developed-by: Aneesh Kumar K.V <aneesh.kumar@...nel.org>
> > > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@...nel.org>
> > > Co-developed-by: Arto Merilainen <amerilainen@...dia.com>
> > > Signed-off-by: Arto Merilainen <amerilainen@...dia.com>
> > > Signed-off-by: Xu Yilun <yilun.xu@...ux.intel.com>
> > > ---
> > > include/linux/pci-ide.h | 11 +++++++
> > > drivers/pci/ide.c | 64 ++++++++++++++++++++++++++++++++++++++++-
> > > 2 files changed, 74 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/linux/pci-ide.h b/include/linux/pci-ide.h
> > > index 5adbd8b81f65..ac84fb611963 100644
> > > --- a/include/linux/pci-ide.h
> > > +++ b/include/linux/pci-ide.h
> > > @@ -6,6 +6,15 @@
> > > #ifndef __PCI_IDE_H__
> > > #define __PCI_IDE_H__
> > >
> > > +#define SEL_ADDR1_LOWER GENMASK(31, 20)
> > > +#define SEL_ADDR_UPPER GENMASK_ULL(63, 32)
> > > +#define PREP_PCI_IDE_SEL_ADDR1(base, limit) \
> > > + (FIELD_PREP(PCI_IDE_SEL_ADDR_1_VALID, 1) | \
> > > + FIELD_PREP(PCI_IDE_SEL_ADDR_1_BASE_LOW, \
> > > + FIELD_GET(SEL_ADDR1_LOWER, (base))) | \
> > > + FIELD_PREP(PCI_IDE_SEL_ADDR_1_LIMIT_LOW, \
> > > + FIELD_GET(SEL_ADDR1_LOWER, (limit))))
> > > +
> > > #define PREP_PCI_IDE_SEL_RID_2(base, domain) \
> > > (FIELD_PREP(PCI_IDE_SEL_RID_2_VALID, 1) | \
> > > FIELD_PREP(PCI_IDE_SEL_RID_2_BASE, (base)) | \
> > > @@ -42,6 +51,8 @@ struct pci_ide_partner {
> > > unsigned int default_stream:1;
> > > unsigned int setup:1;
> > > unsigned int enable:1;
> > > + struct range mem32;
> > > + struct range mem64;
> > > };
> > >
> > > /**
> > > diff --git a/drivers/pci/ide.c b/drivers/pci/ide.c
> > > index 7633b8e52399..8db1163737e5 100644
> > > --- a/drivers/pci/ide.c
> > > +++ b/drivers/pci/ide.c
> > > @@ -159,7 +159,11 @@ struct pci_ide *pci_ide_stream_alloc(struct pci_dev *pdev)
> > > struct stream_index __stream[PCI_IDE_HB + 1];
> > > struct pci_host_bridge *hb;
> > > struct pci_dev *rp;
> > > + struct pci_dev *br;
> >
> > Why not join with the previous line?
> >
> > > int num_vf, rid_end;
> > > + struct range mem32 = {}, mem64 = {};
> > > + struct pci_bus_region region;
> > > + struct resource *res;
> > >
> > > if (!pci_is_pcie(pdev))
> > > return NULL;
> > > @@ -206,6 +210,24 @@ struct pci_ide *pci_ide_stream_alloc(struct pci_dev *pdev)
> > > else
> > > rid_end = pci_dev_id(pdev);
> > >
> > > + br = pci_upstream_bridge(pdev);
> > > + if (!br)
> > > + return NULL;
> > > +
> > > + res = &br->resource[PCI_BRIDGE_MEM_WINDOW];
> >
> > pci_resource_n()
> >
> > > + if (res->flags & IORESOURCE_MEM) {
> > > + pcibios_resource_to_bus(br->bus, ®ion, res);
> > > + mem32.start = region.start;
> > > + mem32.end = region.end;
> > > + }
> > > +
> > > + res = &br->resource[PCI_BRIDGE_PREF_MEM_WINDOW];
> >
> > Ditto.
> >
> > > + if (res->flags & IORESOURCE_PREFETCH) {
> >
> > While I don't know much about what's going on here, is this assuming the
> > bridge window is not disabled solely based on this flag check?
>
> Indeed it does seem to be assumining that the flag is only set when the
> resource is valid and active.
>
> > Previously inactive bridge window flags were reset but that's no longer
> > the case after the commit 8278c6914306 ("PCI: Preserve bridge window
> > resource type flags") (currently in pci/resource)?
>
> Thanks for the heads up. It does seem odd that both IORESOURCE_UNSET and
> IORESOURCE_DISABLED are both being set and the check allows for either.
I'm a bit lost on what check you're referring to.
If you refer to the check in pci_bus_alloc_from_region() added by that
commit, now that I relook at it, it would probably be better written as
!r->parent (a TODO entry added to verify it).
> Is that assuming that other call paths not touched in that set may only
> set one of those flags?
Presence of either of those flags indicates the bridge window resource is
not usable "normally". There's also res->parent which directly tells if
the resource is assigned. Out of those three, res->parent is the preferred
way to know if the resource is usable normally (aka. "assigned"), however,
res->parent check can only be used if this code runs late enough.
To me IORESOURCE_UNSET looks unnecessary flag and would want to get rid of
it entirely as res->parent mostly tells the same information. But I don't
expect that to be an easy change, and there's also the init transient
where res->parent is not yet set which complicates things.
But until IORESOURCE_UNSET is gone, it alone can indicate the resource is
not in usable state. And so can IORESOURCE_DISABLED.
The resource fitting code clears DISABLED (while sizing bridge windows)
before UNSET (on assignment), so they have different meaning even if
there's overlap on the consumer side depending on use case. The resource
fitting/assignment code cares for this distinction, see e.g.
pdev_resource_assignable() which only checks for DISABLED because, well,
we're about to attempt to turn UNSET into !UNSET.
> Otherwise, the change to mark the resource as zero-sized feels a better
> / more explicit protocol than checking for flags. IDE setup only cares
> that any downstream MMIO get included in the stream.
If this particular code here runs after resources have been assigned by
the kernel, please check res->parent to know if the resource is assigned
or not.
I'm considering adding resource_assigned() helper for this purpose as
res->parent check looks too odd and may alienate developers from using it
if they don't know about the internals of the resource management.
If the bridge window resource is assigned, it should have the expected
flags and IMO it's useless to check for the flags (if flags are not right
for the bridge window resources that is assigned, we've a bug elsewhere in
the code).
As a sidenote, there are lots of !res->flags and !pci_resource_len(...),
etc. checks which are often custom implementations resource_assigned(),
they all are landmines that make my life harder as I'd want to make
further improvements to resource behavior.
--
i.
Powered by blists - more mailing lists