Message-ID: <380aa860-89a4-08dc-7d39-b7b212546415@linux.intel.com>
Date: Tue, 22 Jul 2025 15:56:16 +0300 (EEST)
From: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To: Bjorn Helgaas <helgaas@...nel.org>
cc: Icenowy Zheng <uwu@...nowy.me>, Bjorn Helgaas <bhelgaas@...gle.com>, 
    Lucas De Marchi <lucas.demarchi@...el.com>, 
    Thomas Hellström <thomas.hellstrom@...ux.intel.com>, 
    Rodrigo Vivi <rodrigo.vivi@...el.com>, linux-pci@...r.kernel.org, 
    intel-xe@...ts.freedesktop.org, LKML <linux-kernel@...r.kernel.org>, 
    Han Gao <rabenda.cn@...il.com>, Vivian Wang <wangruikang@...as.ac.cn>
Subject: Re: [PATCH] PCI: hide mysterious 8MB 64-bit pref BAR on Intel Arc
 PCIe Switch

On Mon, 21 Jul 2025, Bjorn Helgaas wrote:

> [+cc Ilpo]
> 
> On Tue, Jul 22, 2025 at 01:30:57AM +0800, Icenowy Zheng wrote:
> > The upstream port device of Intel Arc series dGPUs' internal PCIe switch
> > contains a mysterious 8MB 64-bit prefetchable BAR. All reads to memory
> > mapped to that BAR return 0xFFFFFFFF and writes have no effect.
> > 
> > When the PCI bus isn't configured by any firmware (e.g. a PCIe
> > controller solely initialized by the Linux kernel), the PCI space
> > allocation algorithm of Linux will allocate the main VRAM BAR of the Arc
> > dGPU device at base+0, and then the 8MB BAR at base+256M, which prevents
> > the main VRAM BAR from being resized.

__resource_resize_store() tries to release all resources with the same 
flags as the resource to be resized. But it seems the release doesn't work 
across devices.

I don't like that flags check anyway, I'd want to replace all such black 
magic with a function that consistently determines the bridge window a 
resource is assigned to. I have a series to that effect, but it doesn't 
cover resize cases yet, and it requires more testing anyway to confirm it 
doesn't change which parent windows resources get assigned to.

So IMO, the correct logic on resize would be to:

1) Get the relevant upstream bridge window
2) Release all child resources of that bridge window. But that will 
require further checks on whether all those resources (from foreign PCI 
devs) can be released, which might run afoul of dev lock ordering.

...So it might turn out hard to implement in practice.
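
To make the two steps above concrete, here is a rough userspace sketch. It
is a toy model only, not kernel code: struct toy_res, the in_use flag and
all toy_* helpers are made up for illustration, and it ignores locking
entirely, which is exactly the part I expect to be hard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a resource; not the kernel's struct resource. */
struct toy_res {
	uint64_t start, end;	/* inclusive address range */
	int dev;		/* id of the owning device */
	bool in_use;		/* assumption: e.g. a bound driver has it mapped */
};

static bool toy_contains(const struct toy_res *outer,
			 const struct toy_res *inner)
{
	return outer->start <= inner->start && inner->end <= outer->end;
}

/* Step 1: pick the smallest (innermost) window that contains the resource. */
const struct toy_res *toy_find_window(const struct toy_res *windows, size_t n,
				      const struct toy_res *res)
{
	const struct toy_res *best = NULL;

	for (size_t i = 0; i < n; i++) {
		const struct toy_res *w = &windows[i];

		if (!toy_contains(w, res))
			continue;
		if (!best || (w->end - w->start) < (best->end - best->start))
			best = w;
	}
	return best;
}

/*
 * Step 2 precondition: every child inside the window, including those of
 * foreign devices, must be releasable before anything is released.
 */
bool toy_can_release_children(const struct toy_res *window,
			      const struct toy_res *children, size_t n)
{
	assert(window != NULL);

	for (size_t i = 0; i < n; i++) {
		if (toy_contains(window, &children[i]) && children[i].in_use)
			return false;
	}
	return true;
}
```

The point of checking everything up front is that the release is then
all-or-nothing: a single busy resource from another device under the same
window vetoes the whole resize.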

> > As the functionality and performance of Arc dGPU will
> > get severely restricted with small BAR, this is a problem.
> > 
> > Hide the mysterious 8MB BAR from the Linux PCI subsystem, to allow
> > resizing the VRAM BAR to VRAM size with the Linux PCI space allocation
> > algorithm.
> 
> There's no reason a switch upstream port should not have a BAR.  I
> suspect this BAR probably does have a legitimate purpose, and it's
> only "mysterious" because we don't know how to use it.
> 
> This sounds like it may be a deficiency in the Linux BAR assignment
> code.  Any other device could have a similar problem.  

I'm also still working on the resource fitting logic to make it consider 
resizable BARs when sizing the resource, which would address this problem 
another way.
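
As a toy illustration of why sizing matters (userspace C, not the kernel's
fitting code): the sketch below places BARs largest-first from a base
address, either by current size or by the largest size each BAR could be
resized to. struct toy_bar and the toy_* functions are hypothetical, and
the 8G maximum in the usage note is an assumed value, not taken from the
hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_BARS 8

/* Hypothetical model of one BAR; sizes are nonzero powers of two. */
struct toy_bar {
	uint64_t start;		/* assigned address */
	uint64_t size;		/* current size */
	uint64_t max_size;	/* largest size the BAR could be resized to */
};

static bool toy_overlaps(uint64_t a_start, uint64_t a_size,
			 uint64_t b_start, uint64_t b_size)
{
	return a_start < b_start + b_size && b_start < a_start + a_size;
}

/* Place BARs back to back from base, largest placement size first. */
void toy_assign(struct toy_bar *bars, size_t n, uint64_t base, bool use_max)
{
	bool placed[TOY_MAX_BARS] = { false };
	uint64_t next = base;

	assert(n <= TOY_MAX_BARS);
	for (size_t round = 0; round < n; round++) {
		size_t best = n;
		uint64_t best_size = 0;

		for (size_t i = 0; i < n; i++) {
			uint64_t sz = use_max ? bars[i].max_size : bars[i].size;

			if (!placed[i] && sz > best_size) {
				best = i;
				best_size = sz;
			}
		}
		assert(best < n);	/* holds as long as sizes are nonzero */
		bars[best].start = next;
		placed[best] = true;
		next += best_size;
	}
}

/* Can bars[idx] grow to new_size at its current address? */
bool toy_can_grow_in_place(const struct toy_bar *bars, size_t n, size_t idx,
			   uint64_t new_size)
{
	for (size_t i = 0; i < n; i++) {
		if (i != idx && toy_overlaps(bars[idx].start, new_size,
					     bars[i].start, bars[i].size))
			return false;
	}
	return true;
}
```

With a 256M BAR (assumed resizable to 8G) and an 8M BAR, placing by current
size puts the 8M BAR at base+256M, as in the report, and
toy_can_grow_in_place() rejects growing the first BAR. Placing by max_size
puts the 8M BAR at base+8G instead and the in-place grow succeeds.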

> > Signed-off-by: Icenowy Zheng <uwu@...nowy.me>
> > ---
> >  drivers/pci/quirks.c | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> > 
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index d7f4ee634263..df304bfec6e9 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -3650,6 +3650,22 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x37d0, quirk_broken_intx_masking);
> >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x37d1, quirk_broken_intx_masking);
> >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x37d2, quirk_broken_intx_masking);
> >  
> > +/*
> > + * Intel Arc dGPUs' internal switch upstream port contains a mysterious 8MB
> > + * 64-bit prefetchable BAR that blocks resize of main dGPU VRAM BAR with
> > + * Linux's PCI space allocation algorithm.
> > + */
> > +static void quirk_intel_xe_upstream(struct pci_dev *pdev)
> > +{
> > +	memset(&pdev->resource[0], 0, sizeof(pdev->resource[0]));
> 
> This doesn't touch the BAR itself, so we may be leaving the BAR
> decoding accesses, which could lead to an address conflict.  It also
> prevents a driver for the upstream port from using the BAR.
> 
> > +}
> > +/* Intel Arc A380 PCI Express Switch Upstream Port */
> > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4fa1, quirk_intel_xe_upstream);
> > +/* Intel Arc A770 PCI Express Switch Upstream Port */
> > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4fa0, quirk_intel_xe_upstream);
> > +/* Intel Arc B580 PCI Express Switch Upstream Port */
> > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0xe2ff, quirk_intel_xe_upstream);
> > +
> >  static u16 mellanox_broken_intx_devs[] = {
> >  	PCI_DEVICE_ID_MELLANOX_HERMON_SDR,
> >  	PCI_DEVICE_ID_MELLANOX_HERMON_DDR,
> > -- 
> > 2.50.1
> > 
> 

-- 
 i.

