Message-ID: <2fadf33d-8487-94c2-4460-2a20fdb2ea12@canonical.com>
Date: Tue, 5 Oct 2021 18:02:24 +1300
From: Matthew Ruffell <matthew.ruffell@...onical.com>
To: Alex Williamson <alex.williamson@...hat.com>
Cc: linux-pci@...r.kernel.org, lkml <linux-kernel@...r.kernel.org>,
kvm@...r.kernel.org, nathan.langford@...lesunifiedtechnologies.com
Subject: Re: [PROBLEM] Frequently get "irq 31: nobody cared" when passing
through 2x GPUs that share same pci switch via vfio
Hi Alex,
Have you had an opportunity to have a look at this a bit deeper?
On 16/09/21 4:32 am, Alex Williamson wrote:
>
> Adding debugging to the vfio-pci interrupt handler, it's correctly
> deferring the interrupt as the GPU device is not identifying itself as
> the source of the interrupt via the status register. In fact, setting
> the disable INTx bit in the GPU command register while the interrupt
> storm occurs does not stop the interrupts.
>
> The interrupt storm does seem to be related to the bus resets, but I
> can't figure out yet how multiple devices per switch factors into the
> issue. Serializing all bus resets via a mutex doesn't seem to change
> the behavior.
>
> I'm still investigating, but if anyone knows how to get access to the
> Broadcom datasheet or errata for this switch, please let me know.
We have managed to obtain a recent errata for this switch, and it doesn't
mention any interrupt storms with nested switches. What would I be looking
for in the errata? I cannot share our copy, sorry.
Is there anything that we can do to help?
Thanks,
Matthew