Message-ID: <20250325223752.f5tjazbpbblgppyz@amd.com>
Date: Tue, 25 Mar 2025 17:37:52 -0500
From: Michael Roth <michael.roth@....com>
To: "Aithal, Srikanth" <sraithal@....com>
CC: <linux-pci@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<bhelgaas@...gle.com>, <sfr@...b.auug.org.au>,
	<syzkaller-bugs@...glegroups.com>, <linux-next@...r.kernel.org>, "Roger Pau
 Monne" <roger.pau@...rix.com>, Juergen Gross <jgross@...e.com>
Subject: Re: [syzbot] [pci?] linux-next test error: general protection fault
 in msix_capability_init

I'm also able to reproduce this trace on every boot with a basic KVM guest on an
EPYC Milan system, using next-20250325 for both host and guest.

A bisect of commits to drivers/pci/msi seems to indicate the following commit
is the source of the regression:

  commit d9f2164238d814d119e8c979a3579d1199e271bb
  Author: Roger Pau Monne <roger.pau@...rix.com>
  Date:   Wed Feb 19 10:20:57 2025 +0100
  
      PCI/MSI: Convert pci_msi_ignore_mask to per MSI domain flag
      
      Setting pci_msi_ignore_mask inhibits the toggling of the mask bit for both
      MSI and MSI-X entries globally, regardless of the IRQ chip they are using.
      Only Xen sets the pci_msi_ignore_mask when routing physical interrupts over
      event channels, to prevent PCI code from attempting to toggle the maskbit,
      as it's Xen that controls the bit.
      
      However, the pci_msi_ignore_mask being global will affect devices that use
      MSI interrupts but are not routing those interrupts over event channels
      (not using the Xen pIRQ chip).  One example is devices behind a VMD PCI
      bridge.  In that scenario the VMD bridge configures MSI(-X) using the
      normal IRQ chip (the pIRQ one in the Xen case), and devices behind the
      bridge configure the MSI entries using indexes into the VMD bridge MSI
      table.  The VMD bridge then demultiplexes such interrupts and delivers to
      the destination device(s).  Having pci_msi_ignore_mask set in that scenario
      prevents (un)masking of MSI entries for devices behind the VMD bridge.
      
      Move the signaling of no entry masking into the MSI domain flags, as that
      allows setting it on a per-domain basis.  Set it for the Xen MSI domain
      that uses the pIRQ chip, while leaving it unset for the rest of the
      cases.
      
      Remove pci_msi_ignore_mask at once, since it was only used by Xen code, and
      with Xen dropping usage the variable is unneeded.
      
      This fixes using devices behind a VMD bridge on Xen PV hardware domains.
      
      Albeit Devices behind a VMD bridge are not known to Xen, that doesn't mean
      Linux cannot use them.  By inhibiting the usage of
      VMD_FEAT_CAN_BYPASS_MSI_REMAP and the removal of the pci_msi_ignore_mask
      bodge devices behind a VMD bridge do work fine when use from a Linux Xen
      hardware domain.  That's the whole point of the series.
      
      Signed-off-by: Roger Pau Monné <roger.pau@...rix.com>
      Reviewed-by: Thomas Gleixner <tglx@...utronix.de>
      Acked-by: Juergen Gross <jgross@...e.com>
      Acked-by: Bjorn Helgaas <bhelgaas@...gle.com>
      Message-ID: <20250219092059.90850-4-roger.pau@...rix.com>
      Signed-off-by: Juergen Gross <jgross@...e.com>
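
For anyone following the thread, here is a minimal userspace sketch of the
idea in the commit message above: a single global "ignore mask" switch is
replaced by a flag carried by each MSI domain. This is not the kernel code.
Every name here (sketch_msi_domain, SKETCH_MSI_FLAG_NO_MASK, and so on) is
hypothetical and only models the behaviour the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit; not the kernel's actual definition. */
#define SKETCH_MSI_FLAG_NO_MASK (1u << 0)

/* Hypothetical stand-in for an MSI domain carrying per-domain flags. */
struct sketch_msi_domain {
	unsigned int flags;
};

/* Hypothetical device that belongs to exactly one MSI domain. */
struct sketch_device {
	const struct sketch_msi_domain *domain;
	bool masked;
};

/* Old model: one global switch inhibits mask toggling for every device. */
bool sketch_global_ignore_mask;

/* Returns true if the mask bit was actually updated. */
bool sketch_old_set_mask(struct sketch_device *dev, bool mask)
{
	if (sketch_global_ignore_mask)
		return false;	/* inhibited for all devices alike */
	dev->masked = mask;
	return true;
}

/* New model: only devices whose domain sets the flag are inhibited. */
bool sketch_new_set_mask(struct sketch_device *dev, bool mask)
{
	if (dev->domain->flags & SKETCH_MSI_FLAG_NO_MASK)
		return false;	/* e.g. the Xen pIRQ domain in the commit text */
	dev->masked = mask;
	return true;
}

/* Small demonstration of the difference between the two models. */
void sketch_demo(void)
{
	struct sketch_msi_domain xen_like = { SKETCH_MSI_FLAG_NO_MASK };
	struct sketch_msi_domain vmd_like = { 0 };
	struct sketch_device behind_vmd = { &vmd_like, false };
	struct sketch_device on_pirq = { &xen_like, false };

	sketch_global_ignore_mask = true;
	/* Old model: the VMD-like device cannot be masked either. */
	assert(!sketch_old_set_mask(&behind_vmd, true));
	/* New model: only the device in the flagged domain is inhibited. */
	assert(!sketch_new_set_mask(&on_pirq, true));
	assert(sketch_new_set_mask(&behind_vmd, true));
}
```

In the sketch, the device in the unflagged domain can be masked again under
the per-domain model, which is the VMD case the commit message is about.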

Thanks,

Mike
