Message-ID: <87mtzmmzk6.fsf@nanos.tec.linutronix.de>
Date:   Thu, 12 Nov 2020 15:15:21 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Jason Gunthorpe <jgg@...dia.com>,
        Ziyad Atiyyeh <ziyadat@...dia.com>,
        Itay Aveksis <itayav@...dia.com>,
        Moshe Shemesh <moshe@...dia.com>
Cc:     LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
        Joerg Roedel <joro@...tes.org>,
        iommu@...ts.linux-foundation.org, linux-pci@...r.kernel.org,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Marc Zyngier <maz@...nel.org>,
        David Woodhouse <dwmw2@...radead.org>
Subject: Re: REGRESSION: Re: [patch V2 00/46] x86, PCI, XEN, genirq ...: Prepare for device MSI

Jason,

(trimmed CC list a bit)

On Thu, Nov 12 2020 at 08:55, Jason Gunthorpe wrote:
> On Wed, Aug 26, 2020 at 01:16:28PM +0200, Thomas Gleixner wrote:
> They were unable to bisect further into the series because some of the
> interior commits don't boot :(
>
> When we try to load the mlx5 driver on a bare metal VF it gets this:
>
> [Thu Oct 22 08:54:51 2020] DMAR: DRHD: handling fault status reg 2
> [Thu Oct 22 08:54:51 2020] DMAR: [INTR-REMAP] Request device [42:00.2] fault index 1600 [fault reason 37] Blocked a compatibility format interrupt request
> [Thu Oct 22 08:55:04 2020] mlx5_core 0000:42:00.1 eth4: Link down
> [Thu Oct 22 08:55:11 2020] mlx5_core 0000:42:00.1 eth4: Link up
> [Thu Oct 22 08:55:54 2020] mlx5_core 0000:42:00.2: mlx5_cmd_eq_recover:264:(pid 3390): Recovered 1 EQEs on cmd_eq
> [Thu Oct 22 08:55:54 2020] mlx5_core 0000:42:00.2: wait_func_handle_exec_timeout:1051:(pid 3390): cmd0: CREATE_EQ(0x301) recovered after timeout
> [Thu Oct 22 08:55:54 2020] DMAR: DRHD: handling fault status reg 102
> [Thu Oct 22 08:55:54 2020] DMAR: [INTR-REMAP] Request device [42:00.2] fault index 1600 [fault reason 37] Blocked a compatibility format interrupt request
>
> If you have any idea Ziyad and Itay can run any debugging you like.
>
> I suppose it is because this series is handing out compatibility
> addr/data pairs while the IOMMU is set up to only accept remap ones
> from SRIOV VFs?

So the issue seems to be that the VF device has the default irq domain
assigned and not the remapping domain. Let me stare into the code to see
how these VF devices are set up and registered with the IOMMU/remap
unit.

Thanks,
        tglx
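
For readers following the thread: the distinction between "compatibility" and "remappable" MSI messages discussed above comes down to the layout of the MSI address the device is programmed with. The following is only a minimal userspace sketch, assuming the address layout described in the Intel VT-d specification (bit 4 selects the interrupt format, bits 19:5 and bit 2 carry the remapping table index). The macro and function names are hypothetical and are not taken from the kernel sources or from the patch series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the low 32 bits of an x86 MSI address (Intel VT-d). */
#define MSI_ADDR_BASE       0xFEE00000u /* bits 31:20 are always 0xFEE */
#define MSI_ADDR_BASE_MASK  0xFFF00000u
#define MSI_ADDR_FMT_REMAP  (1u << 4)   /* 0 = compatibility, 1 = remappable */

/* True if the address targets the interrupt address range at all. */
static bool msi_addr_is_msi(uint32_t addr_lo)
{
	return (addr_lo & MSI_ADDR_BASE_MASK) == MSI_ADDR_BASE;
}

/* True if the message would be treated as a remappable format request. */
static bool msi_addr_is_remappable(uint32_t addr_lo)
{
	return msi_addr_is_msi(addr_lo) && (addr_lo & MSI_ADDR_FMT_REMAP);
}

/* Extract the 16 bit index: index[14:0] in bits 19:5, index[15] in bit 2. */
static uint32_t msi_addr_remap_index(uint32_t addr_lo)
{
	return ((addr_lo >> 5) & 0x7FFFu) | (((addr_lo >> 2) & 1u) << 15);
}

/* Build a remappable format address for a given table index (hypothetical helper). */
static uint32_t msi_addr_compose_remappable(uint32_t index)
{
	assert(index <= 0xFFFFu);
	return MSI_ADDR_BASE | MSI_ADDR_FMT_REMAP |
	       ((index & 0x7FFFu) << 5) | (((index >> 15) & 1u) << 2);
}
```

An address with bit 4 clear, such as the plain 0xFEE00000 base, is a compatibility format message in this sketch. A remapping unit that is set up to block such requests would report it the way the DMAR log above does, which fits a device that got its message from the default irq domain instead of the remapping domain.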