Message-ID: <87mtzmmzk6.fsf@nanos.tec.linutronix.de>
Date:   Thu, 12 Nov 2020 15:15:21 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Jason Gunthorpe <jgg@...dia.com>,
        Ziyad Atiyyeh <ziyadat@...dia.com>,
        Itay Aveksis <itayav@...dia.com>,
        Moshe Shemesh <moshe@...dia.com>
Cc:     LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
        Joerg Roedel <joro@...tes.org>,
        iommu@...ts.linux-foundation.org, linux-pci@...r.kernel.org,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Marc Zyngier <maz@...nel.org>,
        David Woodhouse <dwmw2@...radead.org>
Subject: Re: REGRESSION: Re: [patch V2 00/46] x86, PCI, XEN, genirq ...: Prepare for device MSI

Jason,

(trimmed CC list a bit)

On Thu, Nov 12 2020 at 08:55, Jason Gunthorpe wrote:
> On Wed, Aug 26, 2020 at 01:16:28PM +0200, Thomas Gleixner wrote:
> They were unable to bisect further into the series because some of the
> interior commits don't boot :(
>
> When we try to load the mlx5 driver on a bare metal VF it gets this:
>
> [Thu Oct 22 08:54:51 2020] DMAR: DRHD: handling fault status reg 2
> [Thu Oct 22 08:54:51 2020] DMAR: [INTR-REMAP] Request device [42:00.2] fault index 1600 [fault reason 37] Blocked a compatibility format interrupt request
> [Thu Oct 22 08:55:04 2020] mlx5_core 0000:42:00.1 eth4: Link down
> [Thu Oct 22 08:55:11 2020] mlx5_core 0000:42:00.1 eth4: Link up
> [Thu Oct 22 08:55:54 2020] mlx5_core 0000:42:00.2: mlx5_cmd_eq_recover:264:(pid 3390): Recovered 1 EQEs on cmd_eq
> [Thu Oct 22 08:55:54 2020] mlx5_core 0000:42:00.2: wait_func_handle_exec_timeout:1051:(pid 3390): cmd0: CREATE_EQ(0x301) recovered after timeout
> [Thu Oct 22 08:55:54 2020] DMAR: DRHD: handling fault status reg 102
> [Thu Oct 22 08:55:54 2020] DMAR: [INTR-REMAP] Request device [42:00.2] fault index 1600 [fault reason 37] Blocked a compatibility format interrupt request
>
> If you have any idea Ziyad and Itay can run any debugging you like.
>
> I suppose it is because this series is handing out compatibility
> addr/data pairs while the IOMMU is set up to only accept remap ones
> from SRIOV VFs?

So the issue seems to be that the VF device has the default irq domain
assigned and not the remapping domain. Let me stare into the code to see
how these VF devices are set up and registered with the IOMMU/remap
unit.

Thanks,

        tglx
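
Editorial illustration, not part of the original message: the fault in the
log ("Blocked a compatibility format interrupt request") turns on whether the
MSI address a device writes to is in compatibility or remappable format. The
sketch below shows one way to tell the two apart from the low 32 bits of an
MSI address. The bit positions are an assumption based on the commonly
documented Intel VT-d layout (bit 4 = interrupt format, bit 3 = SubHandle
Valid, handle in bits 19:5 plus bit 2); the function names are hypothetical
and this is not the kernel's code.

```python
# Sketch: classify an x86 MSI address as compatibility or remappable format.
# Bit layout is ASSUMED from the Intel VT-d interrupt remapping description;
# encode_remappable() and decode() are hypothetical helpers for illustration.

MSI_ADDR_BASE = 0xFEE00000       # bits 31:20 == 0xFEE mark an interrupt request
MSI_ADDR_BASE_MASK = 0xFFF00000
IR_FORMAT_BIT = 1 << 4           # 1 = remappable format, 0 = compatibility format
IR_SHV_BIT = 1 << 3              # SubHandle Valid


def encode_remappable(index: int, shv: bool = True) -> int:
    """Build a remappable-format address for a 16-bit remap table index."""
    if not 0 <= index <= 0xFFFF:
        raise ValueError("index must fit in 16 bits")
    addr = MSI_ADDR_BASE | IR_FORMAT_BIT
    addr |= (index & 0x7FFF) << 5    # handle[14:0] goes to address bits 19:5
    addr |= (index & 0x8000) >> 13   # handle[15] goes to address bit 2
    if shv:
        addr |= IR_SHV_BIT
    return addr


def decode(addr_lo: int) -> dict:
    """Report the format of an MSI address and the fields it carries."""
    if (addr_lo & MSI_ADDR_BASE_MASK) != MSI_ADDR_BASE:
        raise ValueError("not an x86 MSI address")
    if not addr_lo & IR_FORMAT_BIT:
        # Compatibility format: destination APIC ID sits in bits 19:12.
        return {"format": "compatibility", "dest_id": (addr_lo >> 12) & 0xFF}
    index = ((addr_lo >> 5) & 0x7FFF) | (((addr_lo >> 2) & 1) << 15)
    return {
        "format": "remappable",
        "index": index,
        "shv": bool(addr_lo & IR_SHV_BIT),
    }
```

Under these assumptions, a device whose MSI message was composed by the
default irq domain would carry an address that decode() reports as
"compatibility", which a remapping unit configured to accept only remappable
requests would block.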
