Message-ID: <87czzcdnup.fsf@nanos.tec.linutronix.de>
Date:   Mon, 14 Dec 2020 23:28:14 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Shuah Khan <skhan@...uxfoundation.org>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "H. Peter Anvin" <hpa@...or.com>
Cc:     "x86\@kernel.org" <x86@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Shuah Khan <skhan@...uxfoundation.org>
Subject: Re: common_interrupt: No irq handler for vector

Shuah,

On Mon, Dec 14 2020 at 13:57, Shuah Khan wrote:
> On 12/14/20 1:41 PM, Thomas Gleixner wrote:
> Here is the processor and BIOS info:
> AMD Ryzen 7 4700G with Radeon Graphics
> LENOVO ThinkCentre Embedded Controller -[O4ZCT12A-1.12]-
> LENOVO ThinkCentre BIOS Boot Block Revision 1.1C
>
>> 
>>> I am bisecting to isolate. Same issue on all stables 5.4, 4.19 and
>>> so on. If it is BIOS problem I would expect to see it on 5.10-rc7
>>> and wouldn't have expected to start seeing it 5.9.9.
>> 
>> Can you provide some more details, e.g. dmesg please?
>> 
>
> __common_interrupt: 1.55 No irq handler for vector
> __common_interrupt: 2.55 No irq handler for vector
> __common_interrupt: 3.55 No irq handler for vector
> __common_interrupt: 4.55 No irq handler for vector
> __common_interrupt: 5.55 No irq handler for vector
> __common_interrupt: 6.55 No irq handler for vector
> __common_interrupt: 7.55 No irq handler for vector
> __common_interrupt: 8.55 No irq handler for vector
> __common_interrupt: 9.55 No irq handler for vector
> __common_interrupt: 10.55 No irq handler for vector

This _IS_ the AGESA BIOS bug.

>>>> No. It's perfectly correct in the MSI code. See further down.
>>>>
>>>> 	if (IS_ERR_OR_NULL(this_cpu_read(vector_irq[cfg->vector])))
>>>> 		this_cpu_write(vector_irq[cfg->vector], VECTOR_RETRIGGERED);
>>>>
>>>
>>> I am asking about inconsistent comments and the actual message as the
>>> comment implies if vector is VECTOR_UNUSED state, this message won't
>>> be triggered in common_interrupt. Based on that my read is the comment
>>> might be wrong if the code is correct as you are saying.
>> 
>> The comment says:
>> 
>>    >>    * anyway. If the vector is unused, then it is marked so it won't
>>    >>    * trigger the 'No irq handler for vector' warning in
>>    >>    * common_interrupt().
>> 
>>    If the vector is unused, then it is _marked_ so ....
>
> See the messages above.

This code has absolutely nothing to do with these messages. It marks the
vector RETRIGGERED so that the warning cannot happen if the MSI migration
causes this spurious vector to be emitted. That marking is there
_because_ the migration occasionally triggered the warning, which is
unavoidable due to the silliness of hardware.
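
To make the distinction concrete, here is a simplified userspace model of
that logic. It is a sketch only, not the kernel code: the enum, the array
and both function names are made up for illustration, and the real kernel
keeps this state per CPU in vector_irq[].

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_VECTORS 256

/* Hypothetical stand-ins for the states a vector_irq[] slot can be in. */
enum vec_state {
	VEC_UNUSED,		/* nothing is expected on this vector */
	VEC_RETRIGGERED,	/* one stray interrupt is expected and tolerated */
	VEC_ASSIGNED		/* a handler is installed */
};

/* Zero-initialized, so every vector starts out as VEC_UNUSED. */
static enum vec_state vector_state[NR_VECTORS];

/* MSI migration path: mark a vector without handler so one stray is swallowed. */
static void mark_retriggered(unsigned int vector)
{
	assert(vector < NR_VECTORS);
	if (vector_state[vector] != VEC_ASSIGNED)
		vector_state[vector] = VEC_RETRIGGERED;
}

/* Interrupt entry: returns true if the warning would be printed. */
static bool model_common_interrupt(unsigned int cpu, unsigned int vector)
{
	assert(vector < NR_VECTORS);
	switch (vector_state[vector]) {
	case VEC_ASSIGNED:
		return false;		/* the handler would run here */
	case VEC_RETRIGGERED:
		vector_state[vector] = VEC_UNUSED;	/* swallow it silently, once */
		return false;
	default:
		printf("%u.%u No irq handler for vector\n", cpu, vector);
		return true;
	}
}
```

In this model a vector which went through mark_retriggered() stays silent
for exactly one stray interrupt. A vector which was never marked, like 55
in the messages above, warns right away.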

The problem is that the buggy BIOS causes vector 55, which is the legacy
X86 interrupt 7, to be sent to the secondary CPUs 1-10 when they come up
for the first time during boot. This has been reported to death already.
AMD confirmed that it is an AGESA BIOS bug and that it is fixed with
AGESA BIOS version 1.1.8.0.
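
The mapping between vector 55 and legacy interrupt 7 can be checked with
a few lines of C. The two macros below are reproduced from memory of
arch/x86/include/asm/irq_vectors.h, so treat their values as an assumption
and check them against the tree you are running. The helper function is
hypothetical.

```c
#include <assert.h>

/* Assumed to match arch/x86/include/asm/irq_vectors.h; verify in your tree. */
#define FIRST_EXTERNAL_VECTOR	0x20
#define ISA_IRQ_VECTOR(irq)	(((FIRST_EXTERNAL_VECTOR + 16) & ~15) + (irq))

/* Hypothetical helper: map a vector in the legacy range back to its IRQ. */
static int legacy_irq_from_vector(int vector)
{
	int base = ISA_IRQ_VECTOR(0);	/* 0x30 == 48 with the values above */

	assert(vector >= base && vector < base + 16);
	return vector - base;
}
```

With these values ISA_IRQ_VECTOR(7) is 48 + 7 = 55, and
legacy_irq_from_vector(55) gives 7 again.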

The reason why it shows up now might be timing related, nothing else.

Thanks,

        tglx
