[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <33001e60-cbfc-f114-55bf-f347f21fee9b@metux.net>
Date: Tue, 15 Dec 2020 21:12:26 +0100
From: "Enrico Weigelt, metux IT consult" <lkml@...ux.net>
To: Thomas Gleixner <tglx@...utronix.de>,
Michael Ellerman <mpe@...erman.id.au>,
"Enrico Weigelt, metux IT consult" <info@...ux.net>,
linux-kernel@...r.kernel.org
Cc: James.Bottomley@...senPartnership.com, deller@....de,
benh@...nel.crashing.org, paulus@...ba.org, jdike@...toit.com,
richard@....at, anton.ivanov@...bridgegreys.com, mingo@...hat.com,
bp@...en8.de, x86@...nel.org, hpa@...or.com,
linux-parisc@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
linux-s390@...r.kernel.org, linux-um@...ts.infradead.org
Subject: Re: [PATCH] arch: fix 'unexpected IRQ trap at vector' warnings
On 09.12.20 00:01, Thomas Gleixner wrote:
> There are a few situations why it is invoked or not:
>
> 1) The original x86 usage is not longer using it because it complains
> rightfully about a vector being raised which has no interrupt
> descriptor associated to it. So the original reason for naming it
> vector is gone long ago. It emits:
>
> pr_emerg_ratelimited("%s: %d.%u No irq handler for vector\n",
> __func__, smp_processor_id(), vector);
>
> Directly from the x86 C entry point without ever invoking that
> function. Pretty popular error message due to some AMD BIOS
> wreckage. :)
Of course, the term "vector" should be replaced by something like
"irqnr" or "virq", but I didn't have name changes within scope - just
wanted to fix the printing of that number, as i've stupled over it while
working on something different and wondered why the number differed from
what I had expected, until I seen that it prints hex instead of decimal.
But if you prefer a more complete cleanup, I'll be happy to do it.
> 3) It's invoked from __handle_domain_irq() when the 'hwirq' which is
> handed in by the caller does not resolve to a mapped Linux
> interrupt which is pretty much the same as the x86 situation above
> in #1, but it prints useless data.
>
> It prints 'irq' which is invalid but it does not print the really
> interesting 'hwirq' which was handed in by the caller and did
> not resolve.
I wouldn't say the irq-nr isn't interesting. In my particular case it
was quite what I've been looking for. But you're right, hwirq should
also be printed.
> In this case the Linux irq number is uninteresting as it is known
> to be invalid and simply is not mapped and therefore does not
> exist.
In my case it came in from generic_handle_irq(), and in this case this
irq number (IMHO) has been valid, but nobody handled it, so it went to
ack_bad_irq.
Of course, if this function is meant as a fallback to ack some not
otherwise handled IRQ on the hw, the linux irq number indeed isn't quite
helpful (unless we expect that code to do a lookup to the hw irq).
... rethinking this further ... shouldn't we also pass in even more data
(eg. irq_desc, irqchip, ...), so this function can check which hw to
actually talk to ?
> 4) It's invoked from the dummy irq chip which is installed for a
> couple of truly virtual interrupts where the invocation of
> dummy_irq_chip::irq_ack() is indicating wreckage.
>
> In that case the Linux irq number is the thing which is printed.
>
> So no. It's not just inconsistent it's in some places outright
> wrong. What we really want is:
>
> ack_bad_irq(int hwirq, int virq)
is 'int' correct here ?
BTW: I also wonder why the virq is unsigned int, while hwirq (eg. in
struct irq_data) is unsigned long. shouldn't the virtual number space
be at least as big (or even bigger) than the hw one ?
{
> if (hwirq >= 0)
> print_useful_info(hwirq);
> if (virq > 0)
> print_useful_info(virq);
> arch_try_to_ack(hwirq, virq);
> }
>
> for this to make sense. Just fixing the existing printk() to be less
> wrong is not really an improvement.
Okay, makes sense.
OTOH: since both callers (dummychip.c, handle.c) already dump out before
ack_bad_irq(), do we need to print out anything at all ?
I've also seen that many archs increase a counter (some use long, others
atomic_t) - should we also consolidate this in an arch-independent way
in handle.c (or does kstat_incr_irqs_this_cpu already do this) ?
--mtx
--
---
Hinweis: unverschlüsselte E-Mails können leicht abgehört und manipuliert
werden ! Für eine vertrauliche Kommunikation senden Sie bitte ihren
GPG/PGP-Schlüssel zu.
---
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@...ux.net -- +49-151-27565287
Powered by blists - more mailing lists