[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Z4U7KoU1iVOlGTcw@lizhi-Precision-Tower-5810>
Date: Mon, 13 Jan 2025 11:11:22 -0500
From: Frank Li <Frank.li@....com>
To: Marc Zyngier <maz@...nel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>,
Krzysztof Wilczyński <kw@...ux.com>,
Kishon Vijay Abraham I <kishon@...nel.org>,
Bjorn Helgaas <bhelgaas@...gle.com>, Arnd Bergmann <arnd@...db.de>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"Rafael J. Wysocki" <rafael@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Anup Patel <apatel@...tanamicro.com>, linux-kernel@...r.kernel.org,
linux-pci@...r.kernel.org, imx@...ts.linux.dev,
Niklas Cassel <cassel@...nel.org>, dlemoal@...nel.org,
jdmason@...zu.us, linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH v13 4/9] irqchip/gic-v3-its: Add
DOMAIN_BUS_DEVICE_PCI_EP_MSI support
On Mon, Jan 06, 2025 at 01:20:17PM -0500, Frank Li wrote:
> On Mon, Jan 06, 2025 at 05:13:10PM +0000, Marc Zyngier wrote:
> > On Mon, 06 Jan 2025 16:38:02 +0000,
> > Frank Li <Frank.li@....com> wrote:
> > >
> > > On Thu, Dec 19, 2024 at 12:02:02PM -0500, Frank Li wrote:
> > > > On Thu, Dec 19, 2024 at 10:52:30AM +0000, Marc Zyngier wrote:
> > > > > On Wed, 18 Dec 2024 23:08:39 +0000,
> > > > > Frank Li <Frank.Li@....com> wrote:
> > > > > >
> > > > > > ┌────────────────────────────────┐
> > > > > > │ │
> > > > > > │ PCI Endpoint Controller │
> > > > > > │ │
> > > > > > │ ┌─────┐ ┌─────┐ ┌─────┐ │
> > > > > > PCI Bus │ │ │ │ │ │ │ │
> > > > > > ─────────► │ │Func1│ │Func2│ ... │Func │ │
> > > > > > Doorbell │ │ │ │ │ │<n> │ │
> > > > > > │ │ │ │ │ │ │ │
> > > > > > │ └──┬──┘ └──┬──┘ └──┬──┘ │
> > > > > > │ │ │ │ │
> > > > > > └──────┼────────┼───────────┼────┘
> > > > > > │ │ │
> > > > > > ▼ ▼ ▼
> > > > > > ┌────────────────────────┐
> > > > > > │ MSI Controller │
> > > > > > └────────────────────────┘
> > > > > >
> > > > > > Add domain DOMAIN_BUS_DEVICE_PCI_EP_MSI to allocate MSI domain for Endpoint
> > > > > > function in PCI Endpoint (EP) controller, So PCI Root Complex (RC) can
> > > > > > write MSI message to MSI controller to trigger doorbell IRQ for difference
> > > > > > EP functions.
> > > > > >
> > > > > > Signed-off-by: Frank Li <Frank.Li@....com>
> > > > > > ---
> > > > > > change from v12 to v13
> > > > > > - new patch
> > > > >
> > > > > This might be v13, but after all this time, I have no idea what you
> > > > > are trying to do. You keep pasting this non-ASCII drawing in commit
> > > > > messages, but I still have no idea what this PCI Bus Doorbell
> > > > > represents.
> > > >
> > > > PCI Bus/Doorbell is two words. Basic over picture is a PCI EP devices (such
> > > > as imx95), which run linux and PCI Endpoint framework. i.MX95 connect to
> > > > PCI Host, such as PC (x86).
> > > >
> > > > i.MX95 can use standard PCI MSI framework to issue a irq to X86. but there
> > > > are not reverse direction. X86 try write some MMIO register ( mapped PCI
> > > > bar0). But i.MX95 don't know it have been modified. So currently solution
> > > > is create a polling thread to check every 10ms.
> > > >
> > > > So this patches try resolve this problem at the platform, which have MSI
> > > > controller such as ITS.
> > > >
> > > > after this patches, i.MX95 can create a PCI Bar1, which map to MSI
> > > > controller register space, when X86 write data to Bar1 (call as doorbell),
> > > > a irq will be triggered at i.MX95.
> > > >
> > > > Doorbell in diagram means 'push doorbell' (write data to bar<n>).
> > > >
> > > > >
> > > > > I appreciate the knowledge shortage is on my end, but it would
> > > > > definitely help if someone would take the time to explain what this is
> > > > > all about.
> > > >
> > > > I am not sure if diagram in corver letter can help this, or above
> > > > descriptions is enough.
> > > >
> > > >
> > > > ┌────────────┐ ┌───────────────────────────────────┐ ┌────────────────┐
> > > > │ │ │ │ │ │
> > > > │ │ │ PCI Endpoint │ │ PCI Host │
> > > > │ │ │ │ │ │
> > > > │ │◄──┤ 1.platform_msi_domain_alloc_irqs()│ │ │
> > > > │ │ │ │ │ │
> > > > │ MSI ├──►│ 2.write_msi_msg() ├──►├─BAR<n> │
> > > > │ Controller │ │ update doorbell register address│ │ │
> > > > │ │ │ for BAR │ │ │
> > > > │ │ │ │ │ 3. Write BAR<n>│
> > > > │ │◄──┼───────────────────────────────────┼───┤ │
> > > > │ │ │ │ │ │
> > > > │ ├──►│ 4.Irq Handle │ │ │
> > > > │ │ │ │ │ │
> > > > │ │ │ │ │ │
> > > > └────────────┘ └───────────────────────────────────┘ └────────────────┘
> > > > (* some detail have been changed and don't affect understand overall
> > > > picture)
> > > >
> > > > >
> > > > > From what I gather, the ITS is actually on an end-point, and get
> > > > > writes from the host, but that doesn't answer much.
> > > >
> > > > Yes, baisc it is correct. PCI RC -> PCIe Bus TLP -> PCI Endpoint Ctrl ->
> > > > AXI transaction -> ITS MMIO map register -> CPU IRQ.
> > > >
> > > > The major problem how to distingiush from difference PCI Endpoint function
> > > > driver. There are have many EP functions as much as 8, which quite similar
> > > > standard PCI, one PCIe device can have 8 physical functions.
> > > >
> > > > >
> > > > > > ---
> > > > > > drivers/irqchip/irq-gic-v3-its-msi-parent.c | 19 ++++++++++++++++++-
> > > > > > 1 file changed, 18 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/drivers/irqchip/irq-gic-v3-its-msi-parent.c b/drivers/irqchip/irq-gic-v3-its-msi-parent.c
> > > > > > index b2a4b67545b82..16e7d53f0b133 100644
> > > > > > --- a/drivers/irqchip/irq-gic-v3-its-msi-parent.c
> > > > > > +++ b/drivers/irqchip/irq-gic-v3-its-msi-parent.c
> > > > > > @@ -5,6 +5,7 @@
> > > > > > // Copyright (C) 2022 Intel
> > > > > >
> > > > > > #include <linux/acpi_iort.h>
> > > > > > +#include <linux/pci-ep-msi.h>
> > > > > > #include <linux/pci.h>
> > > > > >
> > > > > > #include "irq-gic-common.h"
> > > > > > @@ -173,6 +174,19 @@ static int its_pmsi_prepare(struct irq_domain *domain, struct device *dev,
> > > > > > return its_pmsi_prepare_devid(domain, dev, nvec, info, dev_id);
> > > > > > }
> > > > > >
> > > > > > +static int its_pci_ep_msi_prepare(struct irq_domain *domain, struct device *dev,
> > > > > > + int nvec, msi_alloc_info_t *info)
> > > > > > +{
> > > > > > + u32 dev_id;
> > > > > > + int ret;
> > > > > > +
> > > > > > + ret = pci_epf_msi_domain_get_msi_rid(dev, &dev_id);
> > > > >
> > > > > What this doesn't express is *how* are the writes conveyed to the ITS.
> > > > > Specifically, the DevID is normally sampled as sideband information at
> > > > > during the write transaction.
> > > >
> > > > Like PCI host, there msi-map in dts file, which descript how map PCI RID
> > > > to DevID, such as
> > > > msi-map = <0 $its 0x80 8>;
> > > >
> > > > This informtion should be descripted in DTS or ACPI ...
> > > >
> > > > >
> > > > > Obviously, you can't do that over PCI. So there is a lot of
> > > > > undisclosed assumption about how the ITS is integrated, and how it
> > > > > samples the DevID.
> > > >
> > > > Yes, it should be platform PCI endpoint ctrl driver jobs. Platform EP
> > > > driver should implement this type of covert. Such as i.MX95, there are
> > > > hardware call LUT in PCI ctrl, which can convert PCI' request ID to DevID
> > > > here.
> > > >
> > > > On going patch may help understand these
> > > > https://lore.kernel.org/linux-pci/20241210-imx95_lut-v8-0-2e730b2e5fde@nxp.com/
> > > >
> > > > If use latest ITS MSI64 should be simple, only need descript it at DTS
> > > > (I have not hardware to test this case yet).
> > > > pci-ep {
> > > > ...
> > > > msi-map = <0 &its, 0x<8_0000, 0xff>;
> > > > ^, ctrl ID.
> > > > msi-mask = <0xff>;
> > > > ...
> > > > }
>
> Bookmark 1
>
> > > >
> > > > >
> > > > > My conclusion is that this is not as generic as it seems to be. It is
> > > > > definitely tied to implementation-specific behaviours, none of which
> > > > > are explained.
> > > >
> > > > Compared to standard PCI MSI, which also have implementation-specific
> > > > behaviours, which convert PCI request ID to DevID Or stream ID.
> > > > https://lore.kernel.org/linux-pci/20241210-imx95_lut-v8-0-2e730b2e5fde@nxp.com/
> > > > (I have struggle this for almost one year for this implementation-specific
> > > > part)
> > > >
> > > > Well defined and mature PCI standard, MSI still need two parts, common part
> > > > and "implementation-specific" part.
> > > >
> > > > Common part of standard PCI is at several place, such its driver/msi
> > > > libary/ kernel msi code ...
> > > >
> > > > "implementation-specific" part is in PCI host bridge driver, such as
> > > > drivers/pci/controller/dwc/pcie-qcom.c
> > > >
> > > > This solution already test by Tested-by: Niklas Cassel <cassel@...nel.org>
> > > > who use another dwc controller, which they already implemented
> > > > "implementation-specific" by only update dts to provide hardware
> > > > information.(I guest he use ITS's MSI64)
> > > >
> > > > Because it is new patches, I have not added Niklas's test-by tag. There
> > > > are not big functional change since Nikas test. The major change is make
> > > > msi part better align current MSI framework according to Thomas's
> > > > suggestion.
> > >
> > > Thomas Gleixner and Marc Zyngier:
> > >
> > > Happy new year! Do you have additioinal comments for this?
> >
> > I think this is pretty pointless.
> >
> > - DOMAIN_BUS_DEVICE_PCI_EP_MSI is just a reinvention of platform MSI,
> > and I don't see why we need to have *two* square wheels
>
> "DOMAIN_BUS_DEVICE_PCI_EP_MSI" was purposed by Thomas Gleixner.
> https://lore.kernel.org/linux-pci/87o7197wbx.ffs@tglx/, the difference
> is
> - "platform MSI" only have one device id for each device. But
> DOMAIN_BUS_DEVICE_PCI_EP_MSI have multi child devices, which need map to
> difference devices id.
>
> If you like, I can try to extend "platform msi" to support msi-map. But
> The other problem "immutable MSI messages" need be resolve also.
>
> PCIe EP require "immutable MSI messages". address/data pair can't be change
> by set irq affinity.
>
> >
> > - The whole thing relies on IMPDEF behaviours that are not described,
> > making it impossible to write a *host* driver that works
> > universally.
>
> Host side need NOT know EP's side IMPDEF behaviours. Host side just know
> addr/data pair. *(BAR<n> + offset) = DATA (in RC side) will trigger
> doorbell.
>
> "AXI user bits" concern see below book mark axi.
>
> > Specifically, you completely ignored my comment about
> > *how* is the DevID sampled on the ITS side.
>
> See above "book mark 1", let me change another words, descript again,
>
> It is quite similar with PCI root complex case.
>
> In PCI RC's dts, it looks:
>
> pci {
> ...
> msi-map = <0 &its, 0x<8_0000, 0xff>;
> ^, ctrl ID.
> ...
> }
>
> ITS call pci_msi_domain_get_msi_rid() to get device id.
>
> static int its_pci_msi_prepare(struct irq_domain *domain, struct device *dev,
> int nvec, msi_alloc_info_t *info)
> {
> ...
> info->scratchpad[0].ul = pci_msi_domain_get_msi_rid(domain->parent, pdev);
> ...
> }
>
> PCI msi common code call __of_msi_map_id() to convert PCI rid to stream id
> from dts file. It should have similar method if device have not use DT.
>
> --- EP case (Run at EP side) ---
>
> for my patches, it do similar thing, in dts, PCI EP controller
>
> pci-ep {
> msi-map = <0 &its, 0x<8_0000, 0xff>;
> msi-mask = <0xff>;
> }
>
> static int its_pci_ep_msi_prepare(struct irq_domain *domain, struct device *dev,
> int nvec, msi_alloc_info_t *info)
> {
>
> ....
> ret = pci_epf_msi_domain_get_msi_rid(dev, &dev_id);
> ....
> }
>
> PCIe EP common part will convert EP function device ID to difference device
> id according to msi-map in pci-ep node.
>
> > How is that supposed to
> > work when the DevID is carried as AXI user bits instead of data? How
> > can the host provide that information?
>
> book mark AXI:
>
> Host driver needn't such information. Host write PCI TLP, such as
> *ADDR = DATA.
>
> PCI EP controller get such TLP, which convert to AXI write. PCI EP
> controller will add AXI user bits, which was descripted in PCI EP
> controller's dts file.
>
> pci-ep {
> msi-map = <0 &its, 0x<8_0000, 0xff>;
> ^ "8" is AXI user bits, which added when
> convert TLP to axi write.
>
> }
>
> >
> > - your "but it's been tested by..." argument doesn't carry much
> > weight, as the kernel has at least one critical bug per "Tested-by"
> > tag
>
> My means is this solution can cross difference platform with only dts
> change.
>
> >
> > Given that, I don't see how this series is fit for purpose.
>
> sorry for add book mark to refer to difference place in the the mail.
>
> let me know if need further description.
>
Marc Zyngier:
Do you have any chance to read my reply? Does it anwser your
question? anything missed or still unclear? How to move forward?
Frank
> Frank
>
> >
> > Thanks,
> >
> > M.
> >
> > --
> > Without deviation from the norm, progress is not possible.
Powered by blists - more mailing lists