lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <45809733-1e02-0109-a929-3cdd6c960646@linux.intel.com>
Date: Mon, 2 Jun 2025 08:10:42 +0300 (EEST)
From: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
cc: Dan Williams <dan.j.williams@...el.com>, Lukas Wunner <lukas@...ner.de>, 
    Bjorn Helgaas <bhelgaas@...gle.com>, Thomas Gleixner <tglx@...utronix.de>, 
    Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, 
    Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org, 
    "H. Peter Anvin" <hpa@...or.com>, linux-pci@...r.kernel.org, 
    LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 1/1] PCI: Add Extended Tag + MRRS quirk for Xeon 6

On Sun, 1 Jun 2025, Manivannan Sadhasivam wrote:

> On Tue, Apr 22, 2025 at 04:02:07PM +0300, Ilpo Järvinen wrote:
> > When bifurcated to x2, Xeon 6 Root Port performance is sensitive to the
> > configuration of Extended Tags, Max Read Request Size (MRRS), and 10-Bit
> > Tag Requester (note: there is currently no 10-Bit Tag support in the
> > kernel). While those can be configured to the recommended values by FW,
> > kernel may decide to overwrite the initial values.
> > 
> > Unfortunately, there is no mechanism for FW to indicate OS which parts
> > of PCIe configuration should not be altered. Thus, the only option is
> > to add such logic into the kernel as quirks.
> > 
> > There is a pre-existing quirk flag to disable Extended Tags. Depending
> > on CONFIG_PCIE_BUS_* setting, MRRS may be overwritten by what the
> > kernel thinks is the best for performance (the largest supported
> > value), resulting in performance degradation instead with these Root
> > Ports. (There would have been a pre-existing quirk to disallow
> > increasing MRRS but it is not identical to rejecting >128B MRRS.)
> > 
> > Add a quirk that disallows enabling Extended Tags and setting MRRS
> > larger than 128B for devices under Xeon 6 Root Ports if the Root Port is
> > bifurcated to x2. Reject >128B MRRS only when it is going to be written
> > by the kernel (this assumes FW configured a good initial value for MRRS
> > in case the kernel is not touching MRRS at all).
> > 
> > It was first attempted to always write MRRS when the quirk is needed
> > (always overwrite the initial value). That turned out to be quite
> > invasive change, however, given the complexity of the initial setup
> > callchain and various stages returning early when they decide no changes
> > are necessary, requiring override each. As such, the initial value for
> > MRRS is now left into the hands of FW.
> > 
> > Link: https://cdrdv2.intel.com/v1/dl/getContent/837176
> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
> > ---
> > 
> > v2:
> > - Explain in changelog why FW cannot solve this on its own
> > - Moved the quirk under arch/x86/pci/
> > - Don't NULL check value from pci_find_host_bridge()
> > - Added comment above the quirk about the performance degradation
> > - Removed all setup chain 128B quirk overrides expect for MRRS write
> >   itself (assumes a sane initial value is set by FW)
> > 
> >  arch/x86/pci/fixup.c | 30 ++++++++++++++++++++++++++++++
> >  drivers/pci/pci.c    | 15 ++++++++-------
> >  include/linux/pci.h  |  1 +
> >  3 files changed, 39 insertions(+), 7 deletions(-)
> > 
> > diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
> > index efefeb82ab61..aa9617bc4b55 100644
> > --- a/arch/x86/pci/fixup.c
> > +++ b/arch/x86/pci/fixup.c
> > @@ -294,6 +294,36 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL,	PCI_DEVICE_ID_INTEL_MCH_PB1,	pcie_r
> >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL,	PCI_DEVICE_ID_INTEL_MCH_PC,	pcie_rootport_aspm_quirk);
> >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL,	PCI_DEVICE_ID_INTEL_MCH_PC1,	pcie_rootport_aspm_quirk);
> >  
> > +/*
> > + * PCIe devices underneath Xeon6 PCIe Root Port bifurcated to 2x have slower
> > + * performance with Extended Tags and MRRS > 128B. Workaround the performance
> > + * problems by disabling Extended Tags and limiting MRRS to 128B.
> > + *
> > + * https://cdrdv2.intel.com/v1/dl/getContent/837176
> > + */
> > +static void quirk_pcie2x_no_tags_no_mrrs(struct pci_dev *pdev)
> > +{
> > +	struct pci_host_bridge *bridge = pci_find_host_bridge(pdev->bus);
> > +	u32 linkcap;
> > +
> > +	pcie_capability_read_dword(pdev, PCI_EXP_LNKCAP, &linkcap);
> > +	if (FIELD_GET(PCI_EXP_LNKCAP_MLW, linkcap) != 0x2)
> > +		return;
> > +
> > +	bridge->no_ext_tags = 1;
> > +	bridge->only_128b_mrrs = 1;
> 
> My 2 cents here. Wouldn't it work if you hardcode MRRS to 128 in PCI_EXP_DEVCTL
> here and then set pci_host_bridge::no_inc_mrrs to 1? This would avoid
> introducing an extra flag and also serve the same purpose.

Hi Mani,

Thanks for the suggestion but it won't work because this is the Root Port. 
The devices underneath it need this setting so we cannot set them to 128B 
reliable here (is there anything that guarantees those devices have been 
enumerated at this point?).

I've v3 already prepared which uses the enable device hook as suggested by 
Lukas. I'll send it soon.

-- 
 i.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ