[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <570BC528.9050601@arm.com>
Date: Mon, 11 Apr 2016 16:39:20 +0100
From: Marc Zyngier <marc.zyngier@....com>
To: Jon Hunter <jonathanh@...dia.com>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Jason Cooper <jason@...edaemon.net>,
BenoƮt Cousson <bcousson@...libre.com>,
Tony Lindgren <tony@...mide.com>,
Rob Herring <robh+dt@...nel.org>,
Pawel Moll <pawel.moll@....com>,
Mark Rutland <mark.rutland@....com>,
Ian Campbell <ijc+devicetree@...lion.org.uk>,
Kumar Gala <galak@...eaurora.org>,
Stephen Warren <swarren@...dotorg.org>,
Thierry Reding <thierry.reding@...il.com>,
Kevin Hilman <khilman@...nel.org>,
Geert Uytterhoeven <geert@...ux-m68k.org>,
Grygorii Strashko <grygorii.strashko@...com>,
Lars-Peter Clausen <lars@...afoo.de>,
Linus Walleij <linus.walleij@...aro.org>,
linux-tegra@...r.kernel.org, linux-omap@...r.kernel.org,
devicetree@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 04/15] irqchip/gic: WARN if setting the interrupt type
fails
On 11/04/16 16:31, Jon Hunter wrote:
> Hi Mark,
>
> On 09/04/16 11:58, Marc Zyngier wrote:
>> On Thu, 17 Mar 2016 15:04:01 +0000
>> Jon Hunter <jonathanh@...dia.com> wrote:
>>
>>>
>>> On 17/03/16 14:51, Thomas Gleixner wrote:
>>>> On Thu, 17 Mar 2016, Jon Hunter wrote:
>>>>
>>>>> Setting the interrupt type for private peripheral interrupts (PPIs) may
>>>>> not be supported by a given GIC because it is IMPLEMENTATION DEFINED
>>>>> whether this is allowed. There is no way to know if setting the type is
>>>>> supported for a given GIC and so the value written is read back to
>>>>> verify it matches the desired configuration. If it does not match then
>>>>> an error is return.
>>>>>
>>>>> There are cases where the interrupt configuration read from firmware
>>>>> (such as a device-tree blob), has been incorrect and hence
>>>>> gic_configure_irq() has returned an error. This error has gone
>>>>> undetected because the error code returned was ignored but the interrupt
>>>>> still worked fine because the configuration for the interrupt could not
>>>>> be overwritten.
>>>>>
>>>>> Given that this has done undetected and we should only fail to set the
>>>>> type for PPIs whose configuration cannot be changed anyway, don't return
>>>>> an error and simply WARN if this fails. This will allows us to fix up any
>>>>> places in the kernel where we should be checking the return status and
>>>>> maintain back compatibility with firmware images that may have incorrect
>>>>> interrupt configurations.
>>>>
>>>> Though silently returning 0 is really the wrong thing to do. You can add the
>>>> warn, but why do you want to return success?
>>>
>>> Yes that would be the correct thing to do I agree. However, the problem
>>> is that if we do this, then after the patch "irqdomain: Don't set type
>>> when mapping an IRQ" is applied, we may break interrupts for some
>>> existing device-tree binaries that have bad configuration (such as omap4
>>> and tegra20/30 ... see patches 1 and 2) that have gone unnoticed. So it
>>> is a back compatibility issue.
>>>
>>> If you are wondering why these interrupts break after "irqdomain: Don't
>>> set type when mapping an IRQ", it is because today
>>> irq_create_fwspec_mapping() does not check the return code from setting
>>> the type, but if we defer setting the type until __setup_irq() which
>>> does check the return code, then all of a sudden interrupts that were
>>> working (even with bad configurations) start to fail.
>>>
>>> The reason why I opted not to return an error code from
>>> gic_configure_irq() is it really can't fail. The failure being reported
>>> does not prevent the interrupt from working, but tells you your
>>> configuration does not match the hardware setting which you cannot
>>> overwrite.
>>>
>>> So to maintain back compatibility and avoid any silent errors, I opted
>>> to make it a WARN and not return an error.
>>>
>>> If people are ok with potentially breaking interrupts for device-tree
>>> binaries with bad settings, then I am ok to return an error here.
>>
>> I think we need to phase things. Let's start with warning people for a
>> few kernel releases. Actively maintained platforms will quickly address
>> the issue (fixing their DT). As I see it, this issue seems rather
>> widespread (even kvmtool outputs a DT with the wrong triggering
>> information).
>>
>> Once we've fixed the bulk of the platforms and virtual environments, we
>> can start thinking about making it fail harder.
>
> Ok, so are you OK with this patch as-is? If so, can I add your ACK?
It depends where you plan to handle the error. Ideally, I'd keep on
returning the error (because that's the right thing to do), and move the
WARN_ON() into the core code. We'd keep on ignoring the error as we're
doing today, but we'd scream about it.
After a couple of releases, we'd turn the WARN_ON into a hard fail.
Thoughts?
M.
--
Jazz is not dead. It just smells funny...
Powered by blists - more mailing lists