Message-ID: <20250207220036.GA1018004@bhelgaas>
Date: Fri, 7 Feb 2025 16:00:36 -0600
From: Bjorn Helgaas <helgaas@...nel.org>
To: Marek Marczykowski-Górecki <marmarek@...isiblethingslab.com>
Cc: Jan Beulich <jbeulich@...e.com>, Bjorn Helgaas <bhelgaas@...gle.com>,
	Jürgen Groß <jgross@...e.com>,
	Roger Pau Monné <roger.pau@...rix.com>,
	Boris Ostrovsky <boris.ostrovsky@...cle.com>,
	xen-devel <xen-devel@...ts.xenproject.org>,
	linux-kernel@...r.kernel.org, regressions@...ts.linux.dev,
	Felix Fietkau <nbd@....name>, Lorenzo Bianconi <lorenzo@...nel.org>,
	Ryder Lee <ryder.lee@...iatek.com>
Subject: Re: Config space access to Mediatek MT7922 doesn't work after device
 reset in Xen PV dom0 (regression, Linux 6.12)

On Wed, Feb 05, 2025 at 11:14:17PM +0100, Marek Marczykowski-Górecki wrote:
> On Thu, Jan 30, 2025 at 03:31:23PM -0600, Bjorn Helgaas wrote:
> > On Thu, Jan 30, 2025 at 10:30:33AM +0100, Jan Beulich wrote:
> > > On 30.01.2025 05:55, Marek Marczykowski-Górecki wrote:
> > > > I've added logging of all config read/write to this device. Full log at
> > > > [1].
> > > ...

> ... Generally it looks like this device has broken FLR, and the
> reset works due to the fallback to the secondary bus reset on
> timeout. I repeated the test with my additional "&&
> !PCI_POSSIBLE_ERROR(id)" and I got this:
> [2] https://gist.github.com/marmarek/db0808702131b69ea2f66f339a55d71b
> 
> The first log is with xen, and the second with native linux (and
> added PCI config space logging in drivers/pci/access.c).

This is just to annotate these logs.  Correct me if you see something
wrong.

Both logs include this patch:

  @@ -1297,7 +1297,8 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
                  if (root && root->config_rrs_sv) {
                          pci_read_config_dword(dev, PCI_VENDOR_ID, &id);
  -                     if (!pci_bus_rrs_vendor_id(id))
  +                     if (!pci_bus_rrs_vendor_id(id) &&
  +                         !PCI_POSSIBLE_ERROR(id))
                                  break;

I think both logs show this sequence:

  - Initiate FLR on 01:00.0

  - In pci_dev_wait(), poll PCI_VENDOR_ID, looking for something other
    than 0x0001 (which would indicate an RRS response) or 0xffff (an
    error response, now also rejected by the patch above).

  - Time out after ~70 seconds and return -ENOTTY.

  - Attempt Secondary Bus Reset using 00:02.2, the Root Port leading
    to 01:00.0.

  - Successfully read PCI_VENDOR_ID.

  - The sequence looks the same whether Linux is running natively or
    on top of Xen.
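
The readiness test in the polling step can be modeled outside the
kernel.  What follows is a minimal Python sketch, not the kernel code:
the function names are made up for illustration, and it assumes an RRS
completion shows up as 0x0001 in the low 16 bits and an error response
as an all-ones dword.

```python
# Hypothetical model of the readiness test used while polling PCI_VENDOR_ID.
# The names below are illustrative only; they are not kernel symbols.

RRS_VENDOR_ID = 0x0001          # low 16 bits seen while the device reports RRS
ERROR_RESPONSE = 0xFFFFFFFF     # all-ones dword, e.g. the device did not respond


def is_rrs_vendor_id(value):
    """True if the low 16 bits carry the reserved RRS vendor ID."""
    return (value & 0xFFFF) == RRS_VENDOR_ID


def is_possible_error(value):
    """True if the dword is all ones."""
    return value == ERROR_RESPONSE


def ready_unpatched(value):
    """Readiness test before the patch: only RRS is rejected."""
    return not is_rrs_vendor_id(value)


def ready_patched(value):
    """Readiness test with the patch: RRS and all-ones are both rejected."""
    return not is_rrs_vendor_id(value) and not is_possible_error(value)


def first_ready_index(reads, ready=ready_patched):
    """Index of the first read that passes the test, or None (a timeout)."""
    for index, value in enumerate(reads):
        if ready(value):
            return index
    return None
```

In this model an all-ones read passes the unpatched test, because its
low 16 bits are 0xffff rather than 0x0001, while the patched test keeps
polling until a real Vendor ID appears or the list of reads runs out.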

Relevant devices (from mediatek-debug-6.12-patch2+bridgelog.log):

  00:02.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Phoenix GPP Bridge
    Bus: primary=00, secondary=01, subordinate=01, sec-latency=0

  01:00.0 Network controller: MEDIATEK Corp. MT7922 802.11ax PCI Express Wireless Network Adapter
    Capabilities: [80] Express (v2) Endpoint, IntMsgNum 0

From mediatek-debug-6.12-patch2+bridgelog.log (from [2] above):

  [anaconda root@...t-12 /]# time echo 1 > /sys/bus/pci/devices/0000:01:00.0/reset
  (XEN) d0v3 conf write cf8 0x80010088 bytes 2 offset 0 data 0xa910      <-- set 01:00.0 FLR
  (XEN) d0v3 conf read cf8 0x80010000 bytes 4 offset 0 data 0xffffffff
  ...
  (XEN) d0v4 conf read cf8 0x80010000 bytes 4 offset 0 data 0xffffffff
  ...
  (XEN) d0v4 conf read cf8 0x8000123c bytes 2 offset 2 data 0x2          (0x3c + offset 2 = 0x3e)
  (XEN) d0v4 conf write cf8 0x8000123c bytes 2 offset 2 data 0x42        <-- set 00:02.2 SBR
  (XEN) d0v4 conf write cf8 0x8000123c bytes 2 offset 2 data 0x2
  ...
  (XEN) d0v4 conf read cf8 0x80010000 bytes 4 offset 0 data 0x61614c3    <-- 01:00.0 VID/DID
  ...
  real    1m10.825s
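
The cf8 values in the Xen log can be decoded by hand, as in the
annotations above, or with a small helper.  This is a sketch that
assumes the conventional 0xcf8 address layout (enable bit 31, bus in
bits 23:16, device in 15:11, function in 10:8, dword-aligned register
in 7:2) and that "offset" is the byte offset within the data port.
decode_cf8 is a hypothetical name, and extended register bits are
ignored.

```python
def decode_cf8(cf8, offset=0):
    """Split a 0xcf8 address into (enabled, "bb:dd.f", register).

    Assumes the conventional layout described in the lead-in; the byte
    offset within the data port is added to the dword-aligned register.
    """
    enabled = bool(cf8 & 0x80000000)
    bus = (cf8 >> 16) & 0xFF
    device = (cf8 >> 11) & 0x1F
    function = (cf8 >> 8) & 0x7
    register = (cf8 & 0xFC) + offset
    return enabled, f"{bus:02x}:{device:02x}.{function}", register


if __name__ == "__main__":
    # Addresses taken from the log excerpt above.
    for cf8, offset in ((0x80010088, 0), (0x80010000, 0), (0x8000123C, 2)):
        enabled, bdf, register = decode_cf8(cf8, offset)
        print(f"cf8 {cf8:#010x} offset {offset}: {bdf} reg {register:#04x}")
```

Under those assumptions, 0x80010088 maps to 01:00.0 register 0x88 and
0x8000123c with offset 2 maps to 00:02.2 register 0x3e, matching the
annotations.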

From mediatek-debug-native-6.12-patch2+bridgelog.log (also from [2]
above):

  [anaconda root@...t-12 ~]# time echo 1 > /sys/bus/pci/devices/0000:01:00.0/reset
  [  240.449215] pciback 0000:01:00.0: resetting
  [  240.450709] PCI: write bus 0x1 devfn 0x0 pos 0x88 size 2 value 0xa910   <-- set 01:00.0 FLR
  [  240.553264] PCI: read bus 0x1 devfn 0x0 pos 0x0 size 4 value 0xffffffff
  ...
  [  309.481728] PCI: read bus 0x1 devfn 0x0 pos 0x0 size 4 value 0xffffffff
  [  309.481747] pciback 0000:01:00.0: not ready 65535ms after FLR; giving up
  ...
  [  309.482667] PCI: read bus 0x0 devfn 0x12 pos 0x3e size 2 value 0x2      PCI_BRIDGE_CONTROL
  [  309.482670] PCI: write bus 0x0 devfn 0x12 pos 0x3e size 2 value 0x42    <-- set 00:02.2 SBR
  [  309.485184] PCI: write bus 0x0 devfn 0x12 pos 0x3e size 2 value 0x2

  ...
  [  309.617782] PCI: read bus 0x1 devfn 0x0 pos 0x0 size 4 value 0x61614c3  <-- 01:00.0 VID/DID
  [  309.629234] pciback 0000:01:00.0: reset done
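
The values written and read in both logs can be checked against the
annotations with a few bit tests.  This is a sketch under stated
assumptions: the Initiate FLR bit is taken to be bit 15 (0x8000) of
Device Control, the Secondary Bus Reset bit to be bit 6 (0x40) of
Bridge Control, and devfn to pack the device in bits 7:3 and the
function in bits 2:0.  The helper names are hypothetical.

```python
DEVCTL_INITIATE_FLR = 0x8000    # assumed: bit 15 of PCIe Device Control
BRIDGE_CTL_BUS_RESET = 0x0040   # assumed: bit 6 of Bridge Control


def flr_requested(devctl):
    """True if a Device Control write sets Initiate FLR."""
    return bool(devctl & DEVCTL_INITIATE_FLR)


def sbr_asserted(bridge_ctl):
    """True if a Bridge Control value has Secondary Bus Reset set."""
    return bool(bridge_ctl & BRIDGE_CTL_BUS_RESET)


def split_ids(dword):
    """Split the dword at config offset 0 into (vendor_id, device_id)."""
    return dword & 0xFFFF, (dword >> 16) & 0xFFFF


def split_devfn(devfn):
    """Split a devfn byte into (device, function)."""
    return (devfn >> 3) & 0x1F, devfn & 0x7


if __name__ == "__main__":
    # Values taken from the log excerpts above.
    print("FLR set in 0xa910:", flr_requested(0xA910))
    print("SBR set in 0x42:", sbr_asserted(0x42))
    print("SBR set in 0x2:", sbr_asserted(0x2))
    vendor, device = split_ids(0x61614C3)
    print(f"VID/DID: {vendor:04x}:{device:04x}")
    print("devfn 0x12 -> device %d function %d" % split_devfn(0x12))
```

With these bit positions, 0xa910 has Initiate FLR set, the 0x42 then
0x2 writes assert and deassert SBR, 0x61614c3 splits into vendor 0x14c3
and device 0x0616, and devfn 0x12 is device 2 function 2, i.e. 00:02.2.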
