Message-ID: <CAGwJgaNcWA9bP4LjJRSefUhQ0eUM5xYWz8MMg7NjXgHB3+jMCQ@mail.gmail.com>
Date: Thu, 3 Feb 2022 06:29:48 +0000
From: Brent Spillner <spillner@....org>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
x86@...nel.org, hpa@...or.com, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] x86/PCI: Improve log message when IRQ cannot be identified

On Wed, Feb 2, 2022 at 10:42 PM Bjorn Helgaas <helgaas@...nel.org> wrote:
> If your system has ACPI, I think "pci=biosirq" and "acpi=noirq" are at
> best distractions from finding the real problem.

...except when the cause is indeed buggy ACPI firmware, which is
presumably the only reason these options exist in the first place.

> This path has to do with ancient x86 and BIOS history, which I know
> very little about. If I were going to do something about these
> messages, here's what I would do. Maybe it's too aggressive; I dunno.

Well, it wouldn't require future maintenance when supported command
line options change, which is how we got the stale warnings in the
current code. I'm just not sure who it helps: the vast majority of
users with no IRQ discovery problems never see these messages so they
have no reason to care how concise they are, and they won't be getting
recommendations to try risky kernel parameters for no good reason.
Those who do hit a problem get fewer hints about how to proceed; some
might file a bug report if they can't figure it out, but others will
probably prematurely assume that their hardware "isn't supported" and
give up. Those who dig through the code and kernel-parameters.txt to
find these alternatives even without the hints are probably the ones
who would have written the best bug reports, but they don't get any
specific encouragement to do so, and might never report anything when
they do find a workaround. And I have to wonder how many PCI IRQ bug
reports would get a first response like "Hmm, that should never
happen. Try booting with pci=biosirq and report whether that
changes anything." Even when it doesn't help, as it often won't, you
have both a data point and encouragement to file the bug report for
further investigation.

Again, this is a minor issue, and I'm not emotionally attached to any
particular solution. Your approach solves the immediate problem of
inappropriate recommendations, and if the code looked like that
already I wouldn't have proposed this patch in the first place. I just
think that if the goal is to get useful bug reports, then providing a
little bit of advice (similar to, but updated from, the current
biosirq suggestion) is more constructive than going for the tersest
possible printk, and I'm sure that log messages will be seen by more
of the potentially affected users than equivalent comments in the code
would be.