[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87tv2dd17z.fsf@nanos.tec.linutronix.de>
Date: Tue, 24 Mar 2020 20:03:44 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Evan Green <evgreen@...omium.org>
Cc: Mathias Nyman <mathias.nyman@...ux.intel.com>, x86@...nel.org,
linux-pci <linux-pci@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
"Ghorai\, Sukumar" <sukumar.ghorai@...el.com>,
"Amara\, Madhusudanarao" <madhusudanarao.amara@...el.com>,
"Nandamuri\, Srikanth" <srikanth.nandamuri@...el.com>
Subject: Re: MSI interrupt for xhci still lost on 5.6-rc6 after cpu hotplug
Evan Green <evgreen@...omium.org> writes:
> On Mon, Mar 23, 2020 at 5:24 PM Thomas Gleixner <tglx@...utronix.de> wrote:
>> And of course all of this is so well documented that all of us can
>> clearly figure out what's going on...
>
> I won't pretend to know what's going on, so I'll preface this by
> labeling it all as "flailing", but:
>
> I wonder if there's some way the interrupt can get delayed between
> XHCI snapping the torn value and it finding its way into the IRR. For
> instance, if xhci read this value at the start of their interrupt
> moderation timer period, that would be awful (I hope they don't do
> this). One test patch would be to carve out 8 vectors reserved for
> xhci on all cpus. Whenever you change the affinity, the assigned
> vector is always reserved_base + cpu_number. That lets you exercise
> the affinity switching code, but in a controlled manner where torn
> interrupts could be easily seen (ie hey I got an interrupt on cpu 4's
> vector but I'm cpu 2). I might struggle to write such a change, but in
> theory it's doable.
Well, the point is that we don't see a spurious interrupt on any
CPU. We added a traceprintk into do_IRQ() and that would immediately
tell us where the thing goes off into lala land. Which it didn't.
> I was alternately trying to build a theory in my head about the write
> somehow being posted and getting out of order, but I don't think that
> can happen.
If that happens then the lost XHCI interrupt is the least of your
worries.
Thanks,
tglx
Powered by blists - more mailing lists