Message-Id: <96085c8a-b144-4fd3-b1fb-45763b5b64a4@www.fastmail.com>
Date: Mon, 30 Nov 2020 19:22:12 +0200
From: Laurențiu Nicola <lnicola@...d.ro>
To: "Thomas Gleixner" <tglx@...utronix.de>
Cc: mingo@...nel.org, bp@...en8.de, x86@...nel.org, trivial@...nel.org,
LKML <linux-kernel@...r.kernel.org>,
"Tom Lendacky" <thomas.lendacky@....com>
Subject: Re: [PATCH] x86/irq: Lower unhandled irq error severity
On Mon, Nov 30, 2020, at 18:56, Thomas Gleixner wrote:
> Laurentiu,
>
> On Fri, Nov 27 2020 at 10:03, Laurențiu Nicola wrote:
> > On Fri, Nov 27, 2020, at 02:12, Thomas Gleixner wrote:
> >> On Thu, Nov 26 2020 at 09:47, Laurențiu Nicola wrote:
> >> > These messages are described as warnings in the MSI code.
> >>
> >> Where and what has MSI to do with these messages?
> >
> > There's a comment referring to it as a warning, but an error seemed a more appropriate severity:
> >
> > * If the vector is unused, then it is marked so it won't
> > * trigger the 'No irq handler for vector' warning in
> > * common_interrupt().
>
> That's a description for the logic in the MSI code which is required to
> _NOT_ trigger the 'No irq handler' message. If that message appears then
> something _is_ badly wrong. Either the kernel screwed up or something in
> the BIOS/firmware/hardware is bonkers.
Agreed, just pointing out that the MSI code refers to it as a warning (as opposed to a critical error).
>
> >> > Spotted because they break quiet boot on a Ryzen 5000 CPU.
> >>
> >> They don't break the boot.
> >>
> >> The machine boots fine, but having interrupts raised on a vector which
> >> is unused is really bad.
> >
> > That's right, sorry. It still boots, but it's no longer "quiet",
> > that's what I meant.
>
> Right, but suppressing that is not a solution.
I'm just downgrading it from "emergency" to "error". It will still be displayed for most users and for anyone looking in dmesg. But I'm unlikely to convince my motherboard manufacturer to fix this in the BIOS, and the errors are basically unactionable and uninformative (unlike, say, "can't set up page mappings" or "your CPU might be on fire", which would really imply a crash soon).
The messages themselves are only a cosmetic issue -- they replace the BIOS logo that would otherwise stay up until the display manager started.
But if you think this should really be an "emerg" message, I'm not going to insist anymore. I'm sure you have more important patches to review :-).
Thanks,
Laurențiu