Message-ID: <87czzcdnup.fsf@nanos.tec.linutronix.de>
Date: Mon, 14 Dec 2020 23:28:14 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Shuah Khan <skhan@...uxfoundation.org>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"H. Peter Anvin" <hpa@...or.com>
Cc: "x86\@kernel.org" <x86@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Shuah Khan <skhan@...uxfoundation.org>
Subject: Re: common_interrupt: No irq handler for vector

Shuah,

On Mon, Dec 14 2020 at 13:57, Shuah Khan wrote:
> On 12/14/20 1:41 PM, Thomas Gleixner wrote:
> Here is the processor and BIOS info:
> AMD Ryzen 7 4700G with Radeon Graphics
> LENOVO ThinkCentre Embedded Controller -[O4ZCT12A-1.12]-
> LENOVO ThinkCentre BIOS Boot Block Revision 1.1C
>
>>
>>> I am bisecting to isolate. Same issue on all stables 5.4, 4.19 and
>>> so on. If it is BIOS problem I would expect to see it on 5.10-rc7
>>> and wouldn't have expected to start seeing it 5.9.9.
>>
>> Can you provide some more details, e.g. dmesg please?
>>
>
> __common_interrupt: 1.55 No irq handler for vector
> __common_interrupt: 2.55 No irq handler for vector
> __common_interrupt: 3.55 No irq handler for vector
> __common_interrupt: 4.55 No irq handler for vector
> __common_interrupt: 5.55 No irq handler for vector
> __common_interrupt: 6.55 No irq handler for vector
> __common_interrupt: 7.55 No irq handler for vector
> __common_interrupt: 8.55 No irq handler for vector
> __common_interrupt: 9.55 No irq handler for vector
> __common_interrupt: 10.55 No irq handler for vector

This _IS_ the AGESA BIOS bug.

>>>> No. It's perfectly correct in the MSI code. See further down.
>>>>
>>>> if (IS_ERR_OR_NULL(this_cpu_read(vector_irq[cfg->vector])))
>>>> this_cpu_write(vector_irq[cfg->vector], VECTOR_RETRIGGERED);
>>>>
>>>
>>> I am asking about inconsistent comments and the actual message as the
>>> comment implies if vector is VECTOR_UNUSED state, this message won't
>>> be triggered in common_interrupt. Based on that my read is the comment
>>> might be wrong if the code is correct as you are saying.
>>
>> The comment says:
>>
>> >> * anyway. If the vector is unused, then it is marked so it won't
>> >> * trigger the 'No irq handler for vector' warning in
>> >> * common_interrupt().
>>
>> If the vector is unused, then it is _marked_ so ....
>
> See the messages above.

This code has absolutely nothing to do with these messages. It marks the
vector RETRIGGERED so that the warning cannot happen when the MSI
migration causes this spurious vector to be emitted. That marking is
there _because_ the migration occasionally triggered the warning, which
is unavoidable due to the silliness of the hardware.
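
To illustrate the marking, here is a rough userspace model of that logic.
It is a sketch and not the kernel source: the table is a plain array for a
single CPU, the sentinel values and the model_*() helper names are
assumptions made up for this example, and only the IS_ERR_OR_NULL() check
quoted above is taken from the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified userspace model for illustration, NOT the kernel code. */
#define NR_VECTORS 256

struct irq_desc { int irq; };

/* Sentinels modeled after the kernel's markers (values are assumptions). */
#define VECTOR_UNUSED      ((struct irq_desc *)0)
#define VECTOR_RETRIGGERED ((struct irq_desc *)(intptr_t)-2)

/* One CPU's vector table; the kernel keeps one such table per CPU. */
static struct irq_desc *vector_irq[NR_VECTORS];

/* True for NULL and for error-encoded pointers in the top 4095 addresses. */
static int is_err_or_null(const struct irq_desc *p)
{
	return !p || (uintptr_t)p >= (uintptr_t)-4095;
}

/* MSI migration side: mark a vector that has no handler as RETRIGGERED. */
static void model_mark_retriggered(unsigned int vector)
{
	assert(vector < NR_VECTORS);
	if (is_err_or_null(vector_irq[vector]))
		vector_irq[vector] = VECTOR_RETRIGGERED;
}

/* Interrupt entry side: returns 1 if the warning would be printed. */
static int model_common_interrupt(unsigned int cpu, unsigned int vector)
{
	struct irq_desc *desc;

	assert(vector < NR_VECTORS);
	desc = vector_irq[vector];

	if (!is_err_or_null(desc))
		return 0;	/* a real handler would run here */

	if (desc == VECTOR_UNUSED) {
		printf("__common_interrupt: %u.%u No irq handler for vector\n",
		       cpu, vector);
		return 1;
	}

	/* Marked vector: swallow it silently and reset the slot to unused. */
	vector_irq[vector] = VECTOR_UNUSED;
	return 0;
}
```

In this model an unmarked, unused vector warns, while a vector marked via
model_mark_retriggered() is swallowed once without a message. Nothing marks
vector 55 on the freshly booted CPUs, which is why the warning shows up
there.
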

The problem is that the buggy BIOS causes vector 55, which is the legacy
x86 interrupt 7, to be sent to the secondary CPUs 1-10 when they come up
for the first time during boot. This has been reported to death already.
AMD confirmed that it is an AGESA BIOS bug and that it is fixed with
AGESA BIOS version 1.1.8.0.
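
For reference, the "N.55" in the messages reads as CPU number, dot, vector
number, and vector 55 (0x37) maps back to legacy interrupt 7 by simple
arithmetic. The following is a minimal sketch of both steps. The two macros
mirror the x86 vector layout as I recall it from irq_vectors.h, so treat
their values as an assumption; the helper functions are hypothetical and
exist only for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed to mirror the x86 vector layout; verify against irq_vectors.h. */
#define FIRST_EXTERNAL_VECTOR	0x20
#define ISA_IRQ_VECTOR(irq)	(((FIRST_EXTERNAL_VECTOR + 16) & ~15) + (irq))
#define NR_LEGACY_IRQS		16

/* Hypothetical helper: legacy IRQ for a vector, or -1 if out of range. */
static int vector_to_legacy_irq(unsigned int vector)
{
	unsigned int base = ISA_IRQ_VECTOR(0);	/* 0x30 with the values above */

	if (vector < base || vector >= base + NR_LEGACY_IRQS)
		return -1;
	return (int)(vector - base);
}

/*
 * Hypothetical helper: split a line such as
 *   "__common_interrupt: 1.55 No irq handler for vector"
 * into CPU and vector. Returns 1 on success, 0 otherwise.
 */
static int parse_no_handler_line(const char *line,
				 unsigned int *cpu, unsigned int *vector)
{
	assert(line && cpu && vector);
	return sscanf(line, "__common_interrupt: %u.%u", cpu, vector) == 2;
}
```

With these definitions vector_to_legacy_irq(55) yields 7, because the
legacy range starts at vector 0x30 and 55 is 0x37.
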

The reason why it shows up now might be timing related, nothing else.

Thanks,
tglx