Message-ID: <5ae774b7-f888-f201-d77d-0ab0cf0faed3@arm.com>
Date: Fri, 25 Mar 2022 10:34:13 +0000
From: Robin Murphy <robin.murphy@....com>
To: Vladimir Oltean <vladimir.oltean@....com>,
Marc Zyngier <maz@...nel.org>
Cc: "devicetree@...r.kernel.org" <devicetree@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
Rob Herring <robh+dt@...nel.org>,
Shawn Guo <shawnguo@...nel.org>, Leo Li <leoyang.li@....com>,
Biwen Li <biwen.li@....com>, "Z.Q. Hou" <zhiqiang.hou@....com>,
Kurt Kanzenbach <kurt@...utronix.de>,
Rasmus Villemoes <linux@...musvillemoes.dk>
Subject: Re: [RFC PATCH devicetree 00/10] Do something about ls-extirq
interrupt-map breakage

On 2022-03-24 19:09, Vladimir Oltean wrote:
> On Thu, Mar 24, 2022 at 06:06:51PM +0000, Marc Zyngier wrote:
>>> I was just raising this as what I thought would be a simple and
>>> non-controversial counter example to your remark "If you change something,
>>> you *must* guarantee forward *and* backward compatibility."
>>
>> If you change something *in the binding*, which was implicit in the
>> context, and makes no sense out of context.
>>
>>> Practically speaking, what has happened is that the board DT appeared in
>>> kernel N, the ls-extirq driver in kernel N+1, and the DT was updated to
>>> enable PHY interrupts in kernel N+2. That DT update practically broke
>>> kernel N from running correctly on DTs taken from kernel N+2 onwards.
>>> This is the observable behavior, we can find as many justifications for
>>> it as we wish.
>>
>> Well, you can also argue that the DT was broken at N and N+1 for not
>> describing the HW correctly and completely. No binding has changed
>> here. Your DT was incomplete, and someone fixed it for you.
>>
>> We can argue these things forever and a half. I've laid down the ground
>> rules for the stuff I maintain. If you're not happy with this, you can
>> fix it by either removing the NXP hardware from the tree, or taking
>> over from me as the irqchip maintainer. I'd be perfectly happy with
>> any (and even more, with both) of these outcomes.
>
> Ok, my intention wasn't to inflame you even though the way in which I
> presented the problem might have suggested otherwise.
>
> With my developer hat I still don't agree with you even with the
> additional clarification you've made that you were referring only to
> bindings and not to any and all DT changes. The reason being that the DT
> blob is a whole, and it doesn't matter if there's a regression because
> of a binding change or something else, you still need to be prepared to
> update it, sometimes in lockstep with the kernel, like it or not.
>
> But as a user, I just wanted to get an opinion from you what can we do
> to deal better with this situation: optional interrupt provided by
> device with missing driver, which of_irq_get() doesn't seem to understand.

FWIW, of_irq_get() absolutely understands how to handle a missing IRQ
provider driver; it returns -EPROBE_DEFER. If a caller considers the IRQ
optional, then it's up to that caller to decide how long to keep waiting
for the provider to appear until giving up and carrying on without it.
If your phy driver is making the dumb decision to wait forever for
something which isn't critical, then you're free to fix it, or perhaps
even propose for of_irq_get() to opt in to the
driver_deferred_probe_check_state() mechanism if you believe it's a
sufficiently general case.
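
To make the "caller decides how long to wait" point concrete, here is a
minimal userspace sketch, not kernel code. mock_of_irq_get(),
struct mock_provider, probe_optional_irq() and IRQ_NONE_POLL are all
hypothetical names invented for illustration; only the -EPROBE_DEFER return
convention (and its numeric value, 517) comes from the kernel. The caller
treats the IRQ as optional, defers a bounded number of times, and then
carries on in polling mode instead of deferring indefinitely.

```c
#define EPROBE_DEFER  517   /* same numeric value the kernel uses */
#define IRQ_NONE_POLL (-1)  /* hypothetical marker: no IRQ, poll instead */

/* Hypothetical stand-in for an IRQ provider whose driver may never load. */
struct mock_provider {
    int calls;                /* number of lookups made so far */
    int provider_ready_after; /* lookups that fail first; negative = never ready */
    int irq;                  /* IRQ number handed out once the provider exists */
};

/* Hypothetical stand-in for of_irq_get(): asks to be deferred until the
 * provider has appeared, then returns the IRQ number. */
static int mock_of_irq_get(struct mock_provider *p)
{
    p->calls++;
    if (p->provider_ready_after < 0 || p->calls <= p->provider_ready_after)
        return -EPROBE_DEFER;
    return p->irq;
}

/* Caller policy for an optional IRQ: propagate the deferral a bounded
 * number of times, then give up on the IRQ and fall back to polling.
 * *attempts persists across probe retries and is owned by the caller. */
static int probe_optional_irq(struct mock_provider *p, int *attempts,
                              int max_attempts)
{
    int ret = mock_of_irq_get(p);

    if (ret == -EPROBE_DEFER) {
        if (++(*attempts) < max_attempts)
            return -EPROBE_DEFER;  /* still worth waiting for the provider */
        return IRQ_NONE_POLL;      /* waited long enough, carry on without it */
    }
    return ret;                    /* a real IRQ number, or some other error */
}
```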

If a new DT with an additional new property (either on an existing
machine, or on a completely new machine which has the property from the
start) exposes a bug in a driver, that's unfortunate, but it is entirely
irrelevant to the ABI implications of changing the interpretation of an
existing property.

Robin.