[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20260122110415.GA1241583@bhelgaas>
Date: Thu, 22 Jan 2026 05:04:15 -0600
From: Bjorn Helgaas <helgaas@...nel.org>
To: Haakon Bugge <haakon.bugge@...cle.com>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>,
Johannes Thumshirn <morbidrsa@...il.com>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
Alex Williamson <alex@...zbot.org>,
Niklas Schnelle <schnelle@...ux.ibm.com>
Subject: Re: [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device
On Thu, Jan 22, 2026 at 10:49:45AM +0000, Haakon Bugge wrote:
> > On 22 Jan 2026, at 11:36, Bjorn Helgaas <helgaas@...nel.org> wrote:
> > On Wed, Jan 21, 2026 at 04:40:10PM -0600, Bjorn Helgaas wrote:
> >> On Wed, Jan 21, 2026 at 12:35:40PM +0100, Håkon Bugge wrote:
> >>> Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff
> >>> Root Port supports it (_HPX)") fixed a bogus _HPX type 2 record, which
> >>> instructed program_hpx_type2() to set the RCB in an endpoint,
> >>> although it's RC did not have the RCB bit set.
> >>>
> >>> e42010d8207f fixed that by qualifying the setting of the RCB in the
> >>> endpoint with the RC supporting an 128 byte RCB.
> >>>
> >>> In retrospect, the program_hpx_type2() should only modify the AER
> >>> bits, and stay away from fiddling with the Link Control Register.
> >>>
> >>> Hence, we explicitly program the RCB from pci_configure_device(). RCB
> >>> is RO in Root Ports, and in VFs, the bit is RsvdP, so for these two
> >>> cases we skip programming it. Then, if the Root Port has RCB set and
> >>> it is not set in the EP, we set it.
> >>>
> >>> Fixes: Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)")
> >>> Signed-off-by: Håkon Bugge <haakon.bugge@...cle.com>
> >>>
> >>> ---
> >>>
> >>> Note, that the current duplication of pcie_root_rcb_set() will be
> >>> removed in the next commit.
> >>> ---
> >>> drivers/pci/probe.c | 36 ++++++++++++++++++++++++++++++++++++
> >>> 1 file changed, 36 insertions(+)
> >>>
> >>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> >>> index 41183aed8f5d9..347af29868124 100644
> >>> --- a/drivers/pci/probe.c
> >>> +++ b/drivers/pci/probe.c
> >>> @@ -2410,6 +2410,41 @@ static void pci_configure_serr(struct pci_dev *dev)
> >>> }
> >>> }
> >>>
> >>> +static bool pcie_root_rcb_set(struct pci_dev *dev)
> >>> +{
> >>> + struct pci_dev *rp = pcie_find_root_port(dev);
> >>> + u16 lnkctl;
> >>> +
> >>> + if (!rp)
> >>> + return false;
> >>> +
> >>> + pcie_capability_read_word(rp, PCI_EXP_LNKCTL, &lnkctl);
> >>> +
> >>> + return !!(lnkctl & PCI_EXP_LNKCTL_RCB);
> >>> +}
> >>> +
> >>> +static void pci_configure_rcb(struct pci_dev *dev)
> >>> +{
> >>> + /*
> >>> + * Obviously, we need a Link Control register. The RCB is RO
> >>> + * in Root Ports, so no need to attempt to set it for
> >>> + * them. For VFs, the RCB is RsvdP, so, no need to set it.
> >>> + * Then, if the Root Port has RCB set, then we set for the EP
> >>> + * unless already set.
> >>> + */
> >>> + if (pcie_cap_has_lnkctl(dev) &&
> >>> + (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) &&
> >>> + !dev->is_virtfn && pcie_root_rcb_set(dev)) {
> >>> + u16 lnkctl;
> >>> +
> >>> + pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl);
> >>> + if (lnkctl & PCI_EXP_LNKCTL_RCB)
> >>> + return;
> >>> +
> >>> + pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnkctl | PCI_EXP_LNKCTL_RCB);
> >>> + }
> >>
> >> RCB isn't meaningful for switches, so we'll read their LNKCTL
> >> unnecessarily. I propose something like this, which also clears RCB
> >> if it's set when it shouldn't be (I think this would indicate a
> >> firmware defect):
> >>
> >> /*
> >> * Per PCIe r7.0, sec 7.5.3.7, RCB is only meaningful in Root Ports
> >> * (where it is read-only), Endpoints, and Bridges. It may only be
> >> * set for Endpoints and Bridges if it is set in the Root Port.
> >> */
> >> if (!pci_is_pcie(dev) ||
> >> pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
> >> pci_pcie_type(dev) == PCI_EXP_TYPE_UPSTREAM ||
> >> pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM ||
> >> dev->is_virtfn)
> >> return;
> >>
> >> rcb = pcie_root_rcb_set(dev);
> >>
> >> pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl);
> >> if (rcb) {
> >> if (lnkctl & PCI_EXP_LNKCTL_RCB)
> >> return;
> >>
> >> lnkctl |= PCI_EXP_LNKCTL_RCB;
> >> } else {
> >> if (!(lnkctl & PCI_EXP_LNKCTL_RCB))
> >> return;
> >>
> >> pci_info(FW_INFO "clearing RCB (RCB not set in Root Port)\n");
> >> lnkctl &= ~PCI_EXP_LNKCTL_RCB;
> >
> > On second thought, I think this is too aggressive. I think VM
> > guests will often see endpoints but not the Root Port. In that
> > case, pcie_root_rcb_set() would return false because it couldn't
> > find the RP, but the RP might actually have RCB set. Then we
> > would clear the endpoint RCB unnecessarily, which should be safe
> > but would reduce performance and will result in annoying
> > misleading warnings.
>
> If the VM has a PF in pass-through and the RP is not there, you're
> right. If it is a VF, we do not program it anyway.
>
> > Could either ignore this case (as in your original patch) or bring
> > pcie_root_rcb_set() inline here and return early if we can't find
> > the RP.
>
> I think pcie_root_rcb_set() should return true when it was able to
> retrieve the RP's RCB value, and if not, we bail out.
Currently it returns:
- true if we found the RP and it had RCB set
- false if (a) we couldn't find the RP or (b) we found the RP and it
had RCB cleared
I don't think it's worth complicating the signature to make (a) and
(b) distinguishable.
Inlining pcie_root_rcb_set() would also remove any ambiguity about
whether pcie_root_rcb_set() actually *sets* RCB or just tests it. I
think any readability advantage of using a separate function is
minimal.
> >> }
> >>
> >> pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnkctl);
> >>
> >>> +}
> >>> +
> >>> static void pci_configure_device(struct pci_dev *dev)
> >>> {
> >>> pci_configure_mps(dev);
> >>> @@ -2419,6 +2454,7 @@ static void pci_configure_device(struct pci_dev *dev)
> >>> pci_configure_aspm_l1ss(dev);
> >>> pci_configure_eetlp_prefix(dev);
> >>> pci_configure_serr(dev);
> >>> + pci_configure_rcb(dev);
> >>>
> >>> pci_acpi_program_hp_params(dev);
> >>> }
> >>> --
> >>> 2.43.5
>
>
Powered by blists - more mailing lists