Message-ID: <wr2z74wsqhitisgp4qsfrmuvvhw3cpp3bdzkp5batawv6btfyd@xcyhug7jyfxg>
Date: Thu, 8 Aug 2024 15:56:10 -0500
From: Andrew Halaney <ahalaney@...hat.com>
To: Rob Herring <robh@...nel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>,
Siddharth Vadapalli <s-vadapalli@...com>, bhelgaas@...gle.com, lpieralisi@...nel.org, kw@...ux.com,
vigneshr@...com, kishon@...nel.org, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org, stable@...r.kernel.org,
srk@...com
Subject: Re: [PATCH] PCI: j721e: Set .map_irq and .swizzle_irq to NULL

On Mon, Aug 05, 2024 at 01:05:14PM GMT, Rob Herring wrote:
> On Mon, Aug 5, 2024 at 10:45 AM Manivannan Sadhasivam
> <manivannan.sadhasivam@...aro.org> wrote:
> >
> > On Mon, Aug 05, 2024 at 10:01:37AM -0600, Rob Herring wrote:
> > > On Fri, Jul 26, 2024 at 5:56 AM Manivannan Sadhasivam
> > > <manivannan.sadhasivam@...aro.org> wrote:
> > > >
> > > > On Thu, Jul 25, 2024 at 01:50:16PM +0530, Siddharth Vadapalli wrote:
> > > > > On Thu, Jul 25, 2024 at 09:50:01AM +0530, Manivannan Sadhasivam wrote:
> > > > > > On Wed, Jul 24, 2024 at 09:49:21PM +0530, Manivannan Sadhasivam wrote:
> > > > > > > On Wed, Jul 24, 2024 at 12:20:48PM +0530, Siddharth Vadapalli wrote:
> > > > > > > > Since the configuration of Legacy Interrupts (INTx) is not supported, set
> > > > > > > > the .map_irq and .swizzle_irq callbacks to NULL. This fixes the error:
> > > > > > > > of_irq_parse_pci: failed with rc=-22
> > > > > > > > due to the absence of Legacy Interrupts in the device-tree.
> > > > > > > >
> > > > > > >
> > > > > > > Do you really need to set 'swizzle_irq' to NULL? pci_assign_irq() will bail out
> > > > > > > if 'map_irq' is set to NULL.
> > > > > > >
> > > > > >
> > > > > > > Hold on. The errno of of_irq_parse_pci() is not -ENOENT. So the INTx interrupts
> > > > > > > are described in DT? Then why are they not supported?
> > > > >
> > > > > No, the INTx interrupts are not described in the DT. It is the pcieport
> > > > > driver that is attempting to setup INTx via "of_irq_parse_and_map_pci()"
> > > > > which is the .map_irq callback. The sequence of execution leading to the
> > > > > error is as follows:
> > > > >
> > > > > > pcie_port_probe_service()
> > > > > >   pci_device_probe()
> > > > > >     pci_assign_irq()
> > > > > >       hbrg->map_irq
> > > > > >         of_irq_parse_and_map_pci()
> > > > > >           of_irq_parse_pci()
> > > > > >             of_irq_parse_raw()
> > > > > >               rc = -EINVAL
> > > > > >               ...
> > > > > >               [DEBUG] OF: of_irq_parse_raw: ipar=/bus@...000/interrupt-controller@...0000, size=3
> > > > > >               if (out_irq->args_count != intsize)
> > > > > >                 goto fail
> > > > > >               return rc
> > > > >
> > > > > The call to of_irq_parse_raw() results in the Interrupt-Parent for the
> > > > > PCIe node in the device-tree being found via of_irq_find_parent(). The
> > > > > Interrupt-Parent for the PCIe node for MSI happens to be GIC_ITS:
> > > > > msi-map = <0x0 &gic_its 0x0 0x10000>;
> > > > > and the parent of GIC_ITS is:
> > > > > gic500: interrupt-controller@...0000
> > > > > which has the following:
> > > > > #interrupt-cells = <3>;
> > > > >
> > > > > The "size=3" portion of the DEBUG print above corresponds to the
> > > > > #interrupt-cells property above. Now, "out_irq->args_count" is set to 1
> > > > > as __assumed__ by of_irq_parse_pci() and mentioned as a comment in that
> > > > > function:
> > > > > /*
> > > > > * Ok, we don't, time to have fun. Let's start by building up an
> > > > > * interrupt spec. we assume #interrupt-cells is 1, which is standard
> > > > > * for PCI. If you do different, then don't use that routine.
> > > > > */
> > > > >
> > > > > In of_irq_parse_pci(), since the PCIe-Port driver doesn't have a
> > > > > device-tree node, the following doesn't apply:
> > > > > dn = pci_device_to_OF_node(pdev);
> > > > > and we skip to the __assumption__ above and proceed as explained in the
> > > > > execution sequence above.
> > > > >
> > > > > If the device-tree nodes for the INTx interrupts were present, the
> > > > > "ipar" sequence to find the interrupt parent would be skipped and we
> > > > > wouldn't end up with the -22 (-EINVAL) error code.
> > > > >
> > > > > I hope this clarifies the relation between the -22 error code and the
> > > > > missing device-tree nodes for INTx.
> > > > >
> > > >
> > > > Thanks for explaining the logic. Still I think the logic is flawed. Because the
> > > > parent (host bridge) doesn't have 'interrupt-map', which means INTx is not
> > > > supported. But parsing one level up to the GIC node and not returning -ENOENT
> > > > doesn't make sense to me.
> > > >
> > > > Rob, what is your opinion on this behavior?
> > >
> > > Not sure I get the question. How should we handle/determine no INTx? I
> > > suppose that's either based on the platform (as this patch did) or by
> >
> > Platform != driver. Here the driver is making the call, but the platform
> > capability should come from DT, no? I don't like the idea of disabling INTx in
> > the driver because the driver may support multiple SoCs and these capabilities
> > may differ between them. So the driver will end up just hardcoding the info
> > which is already present in DT :/
>
> Let me rephrase it to "a decision made within the driver" (vs.
> globally decided). That could be hardcoded (for now) or as match data
> based on compatible.
>
> > Moreover, the issue I'm seeing is, even if the platform doesn't support INTx (as
> > described by DT in this case), of_irq_parse_pci() doesn't report correct
> > error/log. So of_irq_parse_pci() definitely needs a fixup.
>
> Possibly. What's correct here?
>
> There was some rework in 6.11 of the interrupt parsing. So it is
> possible something changed here. There's also this issue still
> pending:
>
> https://lore.kernel.org/all/2046da39e53a8bbca5166e04dfe56bd5.squirrel@_/
>
> > > or by
> > > failing to parse the interrupts. The interrupt parsing code is pretty
> > > tricky as it has to deal with some ancient DTs, so I'm a little
> > > hesitant to rely on that failing. Certainly I wouldn't rely on a
> > > specific errno value. The downside to doing that is also if someone
> > > wants interrupts, but has an error in their DT, then all we can do is
> > > print 'INTx not supported' or something. So we couldn't fail probe as
> > > the common code wouldn't be able to distinguish. I suppose we could
> > > just check for 'interrupt-map' present in the host bridge node or not.
> >
> > Yeah, as simple as that. But I don't know if that is globally applicable to
> > all platforms.
>
> There's a lot of history and the interrupt parsing is fragile due to
> all the "interesting" DT interrupt hierarchies. So while I think it
> would work, that's just a guess. I'm open to trying it and seeing.

Would something like this be what you're imagining? If so, I can post a
patch in case this patch is a dead end:

diff --git a/drivers/pci/of.c b/drivers/pci/of.c
index dacea3fc5128..4e4ecaa95599 100644
--- a/drivers/pci/of.c
+++ b/drivers/pci/of.c
@@ -512,6 +512,10 @@ static int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *
 			if (ppnode == NULL) {
 				rc = -EINVAL;
 				goto err;
+			} else if (!of_get_property(ppnode, "interrupt-map", NULL)) {
+				/* No interrupt-map on a host bridge means we're done here */
+				rc = -ENOENT;
+				goto err;
 			}
 		} else {
 			/* We found a P2P bridge, check if it has a node */

I must admit that you being nervous makes me nervous too, since I'm not
all that familiar with PCI... but if y'all think this is OK then I'm for
it. I'm sure I'm not picturing all the cases here, so I would appreciate
some scrutiny.
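
To make the intent of that check easier to poke at, here is a minimal
userspace sketch of the host bridge branch only. It is not kernel code:
"struct fake_node", fake_has_property() and parse_host_bridge_intx() are
hypothetical stand-ins I made up for struct device_node, of_get_property()
and the relevant part of of_irq_parse_pci(). Only the errno values are
taken from the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct device_node: just a list of property names. */
struct fake_node {
	const char *const *props;	/* NULL-terminated array of property names */
};

/* Hypothetical stand-in for of_get_property(): true if the node has the property. */
static bool fake_has_property(const struct fake_node *np, const char *name)
{
	assert(np != NULL && name != NULL);

	for (size_t i = 0; np->props && np->props[i]; i++) {
		if (strcmp(np->props[i], name) == 0)
			return true;
	}
	return false;
}

/*
 * Models only the host bridge branch touched by the diff:
 *   no node at all            -> -EINVAL (existing behaviour)
 *   node without interrupt-map -> -ENOENT (proposed early exit)
 *   node with interrupt-map    -> 0, meaning "carry on parsing"
 */
static int parse_host_bridge_intx(const struct fake_node *ppnode)
{
	if (ppnode == NULL)
		return -EINVAL;
	if (!fake_has_property(ppnode, "interrupt-map"))
		return -ENOENT;
	return 0;
}
```

A host bridge node that only carries msi-map would take the -ENOENT exit
in this model, which is the case this thread is about.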

With the diff applied you still end up with warnings, which kind of sucks,
since as I understand it the lack of INTx interrupts on this platform is
*intentional*:
[ 3.342548] pci_bus 0000:00: 2-byte config write to 0000:00:00.0 offset 0x4 may corrupt adjacent RW1C bits
[ 3.346716] pcieport 0000:00:00.0: of_irq_parse_pci: no interrupt-map found, INTx interrupts not available
[ 3.346721] PCI: OF: of_irq_parse_pci: possibly some PCI slots don't have level triggered interrupts capability

You could have a combo of both: this patch, to indicate that a specific
driver (possibly limited further via match data based on compatible)
doesn't support these interrupts, as well as the above diff, to improve
the message printed in the situation where a driver *does* claim to
support these interrupts but they aren't described properly.
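
A rough userspace sketch of the driver-side half, assuming match data keyed
on compatible. Everything here is hypothetical and only for illustration:
the "fake_" names, the "no_intx" field and the two compatible strings are
made up, and fake_assign_irq() only mimics the "bail out if map_irq is
NULL" behaviour of pci_assign_irq() mentioned earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-SoC match data; "no_intx" is an illustrative field name. */
struct fake_match_data {
	const char *compatible;
	bool no_intx;
};

/* Hypothetical stand-in for the host bridge's .map_irq callback slot. */
struct fake_bridge {
	int (*map_irq)(int slot, int pin);
};

/* Made-up compatibles, not real bindings. */
static const struct fake_match_data fake_match_table[] = {
	{ "vendor,soc-without-intx-pcie", true },
	{ "vendor,soc-with-intx-pcie", false },
};

static const struct fake_match_data *fake_match(const char *compatible)
{
	size_t n = sizeof(fake_match_table) / sizeof(fake_match_table[0]);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(fake_match_table[i].compatible, compatible) == 0)
			return &fake_match_table[i];
	}
	return NULL;
}

/* Placeholder mapping, only so the "INTx supported" path returns something. */
static int fake_map_irq(int slot, int pin)
{
	(void)slot;
	return 100 + pin;
}

/* Driver decision: leave .map_irq NULL when the matched SoC has no INTx. */
static void fake_bridge_init(struct fake_bridge *bridge,
			     const struct fake_match_data *data)
{
	assert(bridge != NULL);
	bridge->map_irq = (data && data->no_intx) ? NULL : fake_map_irq;
}

/* Mimics the early bail-out: no callback means no DT parsing is attempted. */
static int fake_assign_irq(const struct fake_bridge *bridge, int slot, int pin)
{
	if (!bridge->map_irq)
		return -1;
	return bridge->map_irq(slot, pin);
}
```

In this model the SoC flagged with no_intx never reaches the parsing code,
so it never triggers the warning, while every other SoC still goes through
the normal path and gets the improved message if its DT is incomplete.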
Am I barking up the right tree? If so I'll submit a proper patch
independent of this (and depending on your views we can continue with v2
of this patch too, or not).
Thanks,
Andrew