[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20260116211135.GA959225@bhelgaas>
Date: Fri, 16 Jan 2026 15:11:35 -0600
From: Bjorn Helgaas <helgaas@...nel.org>
To: Haakon Bugge <haakon.bugge@...cle.com>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, Len Brown <lenb@...nel.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Greg Kroah-Hartman <gregkh@...e.de>,
Kenji Kaneshige <kaneshige.kenji@...fujitsu.com>,
"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Johannes Thumshirn <jth@...nel.org>,
Myron Stowe <myron.stowe@...hat.com>
Subject: Re: [PATCH] PCI/ACPI: Confine program_hpx_type2 to the AER bits
[+cc Johannes (author of e42010d8207f ("PCI: Set Read Completion
Boundary to 128 iff Root Port supports it (_HPX)"), Myron; start of
thread:
https://lore.kernel.org/r/20260113171522.3446407-1-haakon.bugge@oracle.com]
On Fri, Jan 16, 2026 at 10:10:43AM +0000, Haakon Bugge wrote:
> > On Thu, Jan 15, 2026 at 03:39:21PM +0000, Haakon Bugge wrote:
> >> Thanks for the review, Bjørn!
> >> ...
I should have mentioned this earlier, but I think the commit log
should include something about the problem this change fixes. I
assume that the current code changes ExtTag and/or RO, and that causes
something bad. That's what is motivating this change.
> >>>> if (pcie_cap_has_lnkctl(dev)) {
> >>>> + u16 lnkctl;
> >>>>
> >>>> - /*
> >>>> - * If the Root Port supports Read Completion Boundary of
> >>>> - * 128, set RCB to 128. Otherwise, clear it.
> >>>> - */
> >>>> - hpx->pci_exp_lnkctl_and |= PCI_EXP_LNKCTL_RCB;
> >>>> - hpx->pci_exp_lnkctl_or &= ~PCI_EXP_LNKCTL_RCB;
> >>>> - if (pcie_root_rcb_set(dev))
> >>>> - hpx->pci_exp_lnkctl_or |= PCI_EXP_LNKCTL_RCB;
> >>>> -
> >>>> - pcie_capability_clear_and_set_word(dev, PCI_EXP_LNKCTL,
> >>>> - ~hpx->pci_exp_lnkctl_and, hpx->pci_exp_lnkctl_or);
> >>>> + pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl);
> >>>> + if (lnkctl)
> >>>> + pci_warn(dev, "Some bits in PCIe Link Control are set: 0x%04x\n",
> >>>> + lnkctl);
> >>>>
> >>> Sorry, I wasn't clear about this. I meant that we could log the
> >>> LNKCTL AND/OR values from _HPX, not the values from
> >>> PCI_EXP_LNKCTL itself. There will definitely be bits set in
> >>> PCI_EXP_LNKCTL in normal operation, which is perfectly fine.
> >>>
> >>> But if pci_exp_lnkctl_and or pci_exp_lnkctl_or are non-zero, the
> >>> platform is telling us to do something, and we're ignoring it.
> >>> *That's* what I think we might want to know about. pci_info()
> >>> is probably sufficient; the user doesn't need to *do* anything
> >>> with it, I just want it in case we need to debug an issue.
> >>
> >> My bad, Yes, that makes more sense to me. And, you're OK with
> >> removing the RCB tweaking as well?
> >
> > Good question. My hope is that the code here is just to make sure
> > that we don't *clear* PCI_EXP_LNKCTL_RCB when we want it set but a
> > type 2 record might clear it by mistake.
>
> Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff
> Root Port supports it (_HPX)") fixes the "opposite" case, where _HPX
> sets the RCB even though the RC does not support it. That commit
> removes any RCB setting from the type 2 record from the equation,
> and sets RCB if the RC has the bit set. And to me, that seems to be
> the correct behaviour.
Thanks for digging into that. You're right that it looks like
e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff Root Port
supports it (_HPX)") was motivated by a machine with a Root Port with
PCI_EXP_LNKCTL_RCB cleared, but an _HPX record telling us to set
PCI_EXP_LNKCTL_RCB.
If we had realized at the time that _HPX Type 2 records are only for
the specific case of an OS that owns PCIe native hotplug but not AER,
we likely could have dropped the PCI_EXP_LNKCTL update completely.
The current behavior is that *if* there is an _HPX Type 2 record, we
set RCB to match the Root Port's RCB, regardless of what the Type 2
record is telling us. Removing that PCI_EXP_LNKCTL update could
conceivably break something if:
- Platform has an _HPX Type 2 record, and
- Root Port has PCI_EXP_LNKCTL_RCB cleared (only supports 64-byte
Completions), and
- Endpoint has PCI_EXP_LNKCTL_RCB set (may return 128-byte
Completions), which would probably be a configuration error by
BIOS
Removing it would also give up a little performance if there's a Type
2 record, the Root Port has PCI_EXP_LNKCTL_RCB set (supports 128-byte
Completions), but the Endpoint has PCI_EXP_LNKCTL_RCB cleared (may
only return 64-byte Completions).
> > We should audit PCI_EXP_LNKCTL_RCB usage to be sure that if we
> > remove this code, PCI_EXP_LNKCTL_RCB will still be set whenever it
> > needs to be set. If we rely on the existence of an _HPX type 2
> > record for it to be set, that would be completely wrong.
>
> I'll keep the RCB tweaking as is and add the pci_info() logging if
> the type 2 record attempts to change any bits in the Link Control
> register.
I think the fact that we only tweak RCB when an _HPX Type 2 record
exists is bogus. As far as I can tell, RCB has nothing to do with AER
or hotplug.
I think we probably should add a pci_configure_rcb() called from
pci_configure_device() to always configure RCB to match the Root Port
RCB. I would do this in a separate patch before removing the update
from program_hpx_type2().
Bjorn
Powered by blists - more mailing lists