lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7D64C761-1928-44CA-9548-106F7DCA2DDE@oracle.com>
Date: Thu, 22 Jan 2026 11:12:57 +0000
From: Haakon Bugge <haakon.bugge@...cle.com>
To: Bjorn Helgaas <helgaas@...nel.org>
CC: Bjorn Helgaas <bhelgaas@...gle.com>,
        Johannes Thumshirn
	<morbidrsa@...il.com>,
        "linux-pci@...r.kernel.org"
	<linux-pci@...r.kernel.org>,
        "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>,
        "linux-acpi@...r.kernel.org"
	<linux-acpi@...r.kernel.org>,
        Alex Williamson <alex@...zbot.org>,
        Niklas
 Schnelle <schnelle@...ux.ibm.com>
Subject: Re: [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device



> On 22 Jan 2026, at 12:04, Bjorn Helgaas <helgaas@...nel.org> wrote:
> 
> On Thu, Jan 22, 2026 at 10:49:45AM +0000, Haakon Bugge wrote:
>>> On 22 Jan 2026, at 11:36, Bjorn Helgaas <helgaas@...nel.org> wrote:
>>> On Wed, Jan 21, 2026 at 04:40:10PM -0600, Bjorn Helgaas wrote:
>>>> On Wed, Jan 21, 2026 at 12:35:40PM +0100, Håkon Bugge wrote:
>>>>> Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff
>>>>> Root Port supports it (_HPX)") fixed a bogus _HPX type 2 record, which
>>>>> instructed program_hpx_type2() to set the RCB in an endpoint,
>>>>> although it's RC did not have the RCB bit set.
>>>>> 
>>>>> e42010d8207f fixed that by qualifying the setting of the RCB in the
>>>>> endpoint with the RC supporting an 128 byte RCB.
>>>>> 
>>>>> In retrospect, the program_hpx_type2() should only modify the AER
>>>>> bits, and stay away from fiddling with the Link Control Register.
>>>>> 
>>>>> Hence, we explicitly program the RCB from pci_configure_device(). RCB
>>>>> is RO in Root Ports, and in VFs, the bit is RsvdP, so for these two
>>>>> cases we skip programming it. Then, if the Root Port has RCB set and
>>>>> it is not set in the EP, we set it.
>>>>> 
>>>>> Fixes: Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)")
>>>>> Signed-off-by: Håkon Bugge <haakon.bugge@...cle.com>
>>>>> 
>>>>> ---
>>>>> 
>>>>> Note, that the current duplication of pcie_root_rcb_set() will be
>>>>> removed in the next commit.
>>>>> ---
>>>>> drivers/pci/probe.c | 36 ++++++++++++++++++++++++++++++++++++
>>>>> 1 file changed, 36 insertions(+)
>>>>> 
>>>>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>>>>> index 41183aed8f5d9..347af29868124 100644
>>>>> --- a/drivers/pci/probe.c
>>>>> +++ b/drivers/pci/probe.c
>>>>> @@ -2410,6 +2410,41 @@ static void pci_configure_serr(struct pci_dev *dev)
>>>>> }
>>>>> }
>>>>> 
>>>>> +static bool pcie_root_rcb_set(struct pci_dev *dev)
>>>>> +{
>>>>> + struct pci_dev *rp = pcie_find_root_port(dev);
>>>>> + u16 lnkctl;
>>>>> +
>>>>> + if (!rp)
>>>>> + return false;
>>>>> +
>>>>> + pcie_capability_read_word(rp, PCI_EXP_LNKCTL, &lnkctl);
>>>>> +
>>>>> + return !!(lnkctl & PCI_EXP_LNKCTL_RCB);
>>>>> +}
>>>>> +
>>>>> +static void pci_configure_rcb(struct pci_dev *dev)
>>>>> +{
>>>>> + /*
>>>>> +  * Obviously, we need a Link Control register. The RCB is RO
>>>>> +  * in Root Ports, so no need to attempt to set it for
>>>>> +  * them. For VFs, the RCB is RsvdP, so, no need to set it.
>>>>> +  * Then, if the Root Port has RCB set, then we set for the EP
>>>>> +  * unless already set.
>>>>> +  */
>>>>> + if (pcie_cap_has_lnkctl(dev) &&
>>>>> +     (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) &&
>>>>> +     !dev->is_virtfn && pcie_root_rcb_set(dev)) {
>>>>> + u16 lnkctl;
>>>>> +
>>>>> + pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl);
>>>>> + if (lnkctl & PCI_EXP_LNKCTL_RCB)
>>>>> + return;
>>>>> +
>>>>> + pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnkctl | PCI_EXP_LNKCTL_RCB);
>>>>> + }
>>>> 
>>>> RCB isn't meaningful for switches, so we'll read their LNKCTL
>>>> unnecessarily.  I propose something like this, which also clears RCB
>>>> if it's set when it shouldn't be (I think this would indicate a
>>>> firmware defect):
>>>> 
>>>>       /*
>>>>        * Per PCIe r7.0, sec 7.5.3.7, RCB is only meaningful in Root Ports
>>>>        * (where it is read-only), Endpoints, and Bridges.  It may only be
>>>>        * set for Endpoints and Bridges if it is set in the Root Port.
>>>>        */
>>>>       if (!pci_is_pcie(dev) ||
>>>>           pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
>>>>           pci_pcie_type(dev) == PCI_EXP_TYPE_UPSTREAM ||
>>>>           pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM ||
>>>>           dev->is_virtfn)
>>>>               return;
>>>> 
>>>>       rcb = pcie_root_rcb_set(dev);
>>>> 
>>>>       pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl);
>>>>       if (rcb) {
>>>>               if (lnkctl & PCI_EXP_LNKCTL_RCB)
>>>>                       return;
>>>> 
>>>>               lnkctl |= PCI_EXP_LNKCTL_RCB;
>>>>       } else {
>>>>               if (!(lnkctl & PCI_EXP_LNKCTL_RCB))
>>>>                       return;
>>>> 
>>>>               pci_info(FW_INFO "clearing RCB (RCB not set in Root Port)\n");
>>>>               lnkctl &= ~PCI_EXP_LNKCTL_RCB;
>>> 
>>> On second thought, I think this is too aggressive.  I think VM
>>> guests will often see endpoints but not the Root Port.  In that
>>> case, pcie_root_rcb_set() would return false because it couldn't
>>> find the RP, but the RP might actually have RCB set.  Then we
>>> would clear the endpoint RCB unnecessarily, which should be safe
>>> but would reduce performance and will result in annoying
>>> misleading warnings.
>> 
>> If the VM has a PF in pass-through and the RP is not there, you're
>> right. If it is a VF, we do not program it anyway.
>> 
>>> Could either ignore this case (as in your original patch) or bring
>>> pcie_root_rcb_set() inline here and return early if we can't find
>>> the RP.
>> 
>> I think pcie_root_rcb_set() should return true when it was able to
>> retrieve the RP's RCB value, and if not, we bail out.
> 
> Currently it returns:
> 
>  - true if we found the RP and it had RCB set
> 
>  - false if (a) we couldn't find the RP or (b) we found the RP and it
>    had RCB cleared
> 
> I don't think it's worth complicating the signature to make (a) and
> (b) distinguishable.

Case (b) will not trustfully detect the case where the EP's RCB is set and the RP is (b.1) not found or (b.2) RP is found but it's RCB is not set. In case (b.2), we _shall_ reset the EP's RCB, IIUC.


Thxs, Håkon



> 
> Inlining pcie_root_rcb_set() would also remove any ambiguity about
> whether pcie_root_rcb_set() actually *sets* RCB or just tests it.  I
> think any readability advantage of using a separate function is
> minimal.
> 
>>>>       }
>>>> 
>>>>       pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnkctl);
>>>> 
>>>>> +}
>>>>> +
>>>>> static void pci_configure_device(struct pci_dev *dev)
>>>>> {
>>>>> pci_configure_mps(dev);
>>>>> @@ -2419,6 +2454,7 @@ static void pci_configure_device(struct pci_dev *dev)
>>>>> pci_configure_aspm_l1ss(dev);
>>>>> pci_configure_eetlp_prefix(dev);
>>>>> pci_configure_serr(dev);
>>>>> + pci_configure_rcb(dev);
>>>>> 
>>>>> pci_acpi_program_hp_params(dev);
>>>>> }
>>>>> -- 
>>>>> 2.43.5
>> 
>> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ