lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Z3we4QuRo5ou+r2y@lizhi-Precision-Tower-5810>
Date: Mon, 6 Jan 2025 13:20:17 -0500
From: Frank Li <Frank.li@....com>
To: Marc Zyngier <maz@...nel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>,
	Krzysztof Wilczyński <kw@...ux.com>,
	Kishon Vijay Abraham I <kishon@...nel.org>,
	Bjorn Helgaas <bhelgaas@...gle.com>, Arnd Bergmann <arnd@...db.de>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	"Rafael J. Wysocki" <rafael@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Anup Patel <apatel@...tanamicro.com>, linux-kernel@...r.kernel.org,
	linux-pci@...r.kernel.org, imx@...ts.linux.dev,
	Niklas Cassel <cassel@...nel.org>, dlemoal@...nel.org,
	jdmason@...zu.us, linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH v13 4/9] irqchip/gic-v3-its: Add
 DOMAIN_BUS_DEVICE_PCI_EP_MSI support

On Mon, Jan 06, 2025 at 05:13:10PM +0000, Marc Zyngier wrote:
> On Mon, 06 Jan 2025 16:38:02 +0000,
> Frank Li <Frank.li@....com> wrote:
> >
> > On Thu, Dec 19, 2024 at 12:02:02PM -0500, Frank Li wrote:
> > > On Thu, Dec 19, 2024 at 10:52:30AM +0000, Marc Zyngier wrote:
> > > > On Wed, 18 Dec 2024 23:08:39 +0000,
> > > > Frank Li <Frank.Li@....com> wrote:
> > > > >
> > > > >            ┌────────────────────────────────┐
> > > > >            │                                │
> > > > >            │     PCI Endpoint Controller    │
> > > > >            │                                │
> > > > >            │   ┌─────┐  ┌─────┐     ┌─────┐ │
> > > > > PCI Bus    │   │     │  │     │     │     │ │
> > > > > ─────────► │   │Func1│  │Func2│ ... │Func │ │
> > > > > Doorbell   │   │     │  │     │     │<n>  │ │
> > > > >            │   │     │  │     │     │     │ │
> > > > >            │   └──┬──┘  └──┬──┘     └──┬──┘ │
> > > > >            │      │        │           │    │
> > > > >            └──────┼────────┼───────────┼────┘
> > > > >                   │        │           │
> > > > >                   ▼        ▼           ▼
> > > > >                ┌────────────────────────┐
> > > > >                │    MSI Controller      │
> > > > >                └────────────────────────┘
> > > > >
> > > > > Add domain DOMAIN_BUS_DEVICE_PCI_EP_MSI to allocate MSI domain for Endpoint
> > > > > function in PCI Endpoint (EP) controller, So PCI Root Complex (RC) can
> > > > > write MSI message to MSI controller to trigger doorbell IRQ for difference
> > > > > EP functions.
> > > > >
> > > > > Signed-off-by: Frank Li <Frank.Li@....com>
> > > > > ---
> > > > > change from v12 to v13
> > > > > - new patch
> > > >
> > > > This might be v13, but after all this time, I have no idea what you
> > > > are trying to do. You keep pasting this non-ASCII drawing in commit
> > > > messages, but I still have no idea what this PCI Bus Doorbell
> > > > represents.
> > >
> > > PCI Bus/Doorbell is two words. Basic over picture is a PCI EP devices (such
> > > as imx95), which run linux and PCI Endpoint framework. i.MX95 connect to
> > > PCI Host, such as PC (x86).
> > >
> > > i.MX95 can use standard PCI MSI framework to issue a irq to X86. but there
> > > are not reverse direction. X86 try write some MMIO register ( mapped PCI
> > > bar0). But i.MX95 don't know it have been modified. So currently solution
> > > is create a polling thread to check every 10ms.
> > >
> > > So this patches try resolve this problem at the platform, which have MSI
> > > controller such as ITS.
> > >
> > > after this patches, i.MX95 can create a PCI Bar1, which map to MSI
> > > controller register space,  when X86 write data to Bar1 (call as doorbell),
> > > a irq will be triggered at i.MX95.
> > >
> > > Doorbell in diagram means 'push doorbell' (write data to bar<n>).
> > >
> > > >
> > > > I appreciate the knowledge shortage is on my end, but it would
> > > > definitely help if someone would take the time to explain what this is
> > > > all about.
> > >
> > > I am not sure if diagram in corver letter can help this, or above
> > > descriptions is enough.
> > >
> > >
> > > ┌────────────┐   ┌───────────────────────────────────┐   ┌────────────────┐
> > > │            │   │                                   │   │                │
> > > │            │   │ PCI Endpoint                      │   │ PCI Host       │
> > > │            │   │                                   │   │                │
> > > │            │◄──┤ 1.platform_msi_domain_alloc_irqs()│   │                │
> > > │            │   │                                   │   │                │
> > > │ MSI        ├──►│ 2.write_msi_msg()                 ├──►├─BAR<n>         │
> > > │ Controller │   │   update doorbell register address│   │                │
> > > │            │   │   for BAR                         │   │                │
> > > │            │   │                                   │   │ 3. Write BAR<n>│
> > > │            │◄──┼───────────────────────────────────┼───┤                │
> > > │            │   │                                   │   │                │
> > > │            ├──►│ 4.Irq Handle                      │   │                │
> > > │            │   │                                   │   │                │
> > > │            │   │                                   │   │                │
> > > └────────────┘   └───────────────────────────────────┘   └────────────────┘
> > > (* some detail have been changed and don't affect understand overall
> > > picture)
> > >
> > > >
> > > > From what I gather, the ITS is actually on an end-point, and get
> > > > writes from the host, but that doesn't answer much.
> > >
> > > Yes, baisc it is correct. PCI RC -> PCIe Bus TLP -> PCI Endpoint Ctrl ->
> > > AXI transaction -> ITS MMIO map register -> CPU IRQ.
> > >
> > > The major problem how to distingiush from difference PCI Endpoint function
> > > driver. There are have many EP functions as much as 8, which quite similar
> > > standard PCI, one PCIe device can have 8 physical functions.
> > >
> > > >
> > > > > ---
> > > > >  drivers/irqchip/irq-gic-v3-its-msi-parent.c | 19 ++++++++++++++++++-
> > > > >  1 file changed, 18 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/irqchip/irq-gic-v3-its-msi-parent.c b/drivers/irqchip/irq-gic-v3-its-msi-parent.c
> > > > > index b2a4b67545b82..16e7d53f0b133 100644
> > > > > --- a/drivers/irqchip/irq-gic-v3-its-msi-parent.c
> > > > > +++ b/drivers/irqchip/irq-gic-v3-its-msi-parent.c
> > > > > @@ -5,6 +5,7 @@
> > > > >  // Copyright (C) 2022 Intel
> > > > >
> > > > >  #include <linux/acpi_iort.h>
> > > > > +#include <linux/pci-ep-msi.h>
> > > > >  #include <linux/pci.h>
> > > > >
> > > > >  #include "irq-gic-common.h"
> > > > > @@ -173,6 +174,19 @@ static int its_pmsi_prepare(struct irq_domain *domain, struct device *dev,
> > > > >  	return its_pmsi_prepare_devid(domain, dev, nvec, info, dev_id);
> > > > >  }
> > > > >
> > > > > +static int its_pci_ep_msi_prepare(struct irq_domain *domain, struct device *dev,
> > > > > +				  int nvec, msi_alloc_info_t *info)
> > > > > +{
> > > > > +	u32 dev_id;
> > > > > +	int ret;
> > > > > +
> > > > > +	ret = pci_epf_msi_domain_get_msi_rid(dev, &dev_id);
> > > >
> > > > What this doesn't express is *how* are the writes conveyed to the ITS.
> > > > Specifically, the DevID is normally sampled as sideband information at
> > > > during the write transaction.
> > >
> > > Like PCI host, there msi-map in dts file, which descript how map PCI RID
> > > to DevID, such as
> > > 	msi-map = <0 $its 0x80 8>;
> > >
> > > This informtion should be descripted in DTS or ACPI ...
> > >
> > > >
> > > > Obviously, you can't do that over PCI. So there is a lot of
> > > > undisclosed assumption about how the ITS is integrated, and how it
> > > > samples the DevID.
> > >
> > > Yes, it should be platform PCI endpoint ctrl driver jobs.  Platform EP
> > > driver should implement this type of covert. Such as i.MX95, there are
> > > hardware call LUT in PCI ctrl,  which can convert PCI' request ID to DevID
> > > here.
> > >
> > > On going patch may help understand these
> > > https://lore.kernel.org/linux-pci/20241210-imx95_lut-v8-0-2e730b2e5fde@nxp.com/
> > >
> > > If use latest ITS MSI64 should be simple, only need descript it at DTS
> > > (I have not hardware to test this case yet).
> > > pci-ep {
> > > 	...
> > > 	msi-map = <0 &its, 0x<8_0000, 0xff>;
> > > 			      ^, ctrl ID.
> > > 	msi-mask = <0xff>;
> > > 	...
> > > }

Bookmark 1

> > >
> > > >
> > > > My conclusion is that this is not as generic as it seems to be. It is
> > > > definitely tied to implementation-specific behaviours, none of which
> > > > are explained.
> > >
> > > Compared to standard PCI MSI, which also have implementation-specific
> > > behaviours, which convert PCI request ID to DevID Or stream ID.
> > > https://lore.kernel.org/linux-pci/20241210-imx95_lut-v8-0-2e730b2e5fde@nxp.com/
> > > (I have struggle this for almost one year for this implementation-specific
> > > part)
> > >
> > > Well defined and mature PCI standard, MSI still need two parts, common part
> > > and "implementation-specific" part.
> > >
> > > Common part of standard PCI is at several place, such its driver/msi
> > > libary/ kernel msi code ...
> > >
> > > "implementation-specific" part is in PCI host bridge driver, such as
> > > drivers/pci/controller/dwc/pcie-qcom.c
> > >
> > > This solution already test by Tested-by: Niklas Cassel <cassel@...nel.org>
> > > who use another dwc controller, which they already implemented
> > > "implementation-specific" by only update dts to provide hardware
> > > information.(I guest he use ITS's MSI64)
> > >
> > > Because it is new patches, I have not added Niklas's test-by tag. There
> > > are not big functional change since Nikas test. The major change is make
> > > msi part better align current MSI framework according to Thomas's
> > > suggestion.
> >
> > Thomas Gleixner and Marc Zyngier:
> >
> > Happy new year! Do you have additioinal comments for this?
>
> I think this is pretty pointless.
>
> - DOMAIN_BUS_DEVICE_PCI_EP_MSI is just a reinvention of platform MSI,
>   and I don't see why we need to have *two* square wheels

"DOMAIN_BUS_DEVICE_PCI_EP_MSI" was purposed by Thomas Gleixner.
https://lore.kernel.org/linux-pci/87o7197wbx.ffs@tglx/, the difference
is
- "platform MSI" only have one device id for each device. But
DOMAIN_BUS_DEVICE_PCI_EP_MSI have multi child devices, which need map to
difference devices id.

If you like, I can try to extend "platform msi" to support msi-map. But
The other problem "immutable MSI messages" need be resolve also.

PCIe EP require "immutable MSI messages". address/data pair can't be change
by set irq affinity.

>
> - The whole thing relies on IMPDEF behaviours that are not described,
>   making it impossible to write a *host* driver that works
>   universally.

Host side need NOT know EP's side IMPDEF behaviours. Host side just know
addr/data pair.  *(BAR<n> + offset) = DATA (in RC side) will trigger
doorbell.

"AXI user bits" concern see below book mark axi.

> Specifically, you completely ignored my comment about
>   *how* is the DevID sampled on the ITS side.

See above "book mark 1", let me change another words, descript again,

It is quite similar with PCI root complex case.

In PCI RC's dts, it looks:

pci {
	...
	msi-map = <0 &its, 0x<8_0000, 0xff>;
	                      ^, ctrl ID.
	...
}

ITS call pci_msi_domain_get_msi_rid() to get device id.

static int its_pci_msi_prepare(struct irq_domain *domain, struct device *dev,
                               int nvec, msi_alloc_info_t *info)
{
	...
        info->scratchpad[0].ul = pci_msi_domain_get_msi_rid(domain->parent, pdev);
	...
}

PCI msi common code call __of_msi_map_id() to convert PCI rid to stream id
from dts file.  It should have similar method if device have not use DT.

--- EP case (Run at EP side) ---

for my patches, it do similar thing, in dts, PCI EP controller

pci-ep {
	msi-map = <0 &its, 0x<8_0000, 0xff>;
	msi-mask = <0xff>;
}

static int its_pci_ep_msi_prepare(struct irq_domain *domain, struct device *dev,
				  int nvec, msi_alloc_info_t *info)
{

....
	ret = pci_epf_msi_domain_get_msi_rid(dev, &dev_id);
....
}

PCIe EP common part will convert EP function device ID to difference device
id according to msi-map in pci-ep node.

>  How is that supposed to
>   work when the DevID is carried as AXI user bits instead of data? How
>   can the host provide that information?

book mark AXI:

Host driver needn't such information. Host write PCI TLP, such as
*ADDR = DATA.

PCI EP controller get such TLP, which convert to AXI write. PCI EP
controller will add AXI user bits, which was descripted in PCI EP
controller's dts file.

pci-ep {
        msi-map = <0 &its, 0x<8_0000, 0xff>;
			      ^ "8" is AXI user bits, which added when
convert TLP to axi write.

}

>
> - your "but it's been tested by..." argument doesn't carry much
>   weight, as the kernel has at least one critical bug per "Tested-by"
>   tag

My means is this solution can cross difference platform with only dts
change.

>
> Given that, I don't see how this series is fit for purpose.

sorry for add book mark to refer to difference place in the the mail.

let me know if need further description.

Frank

>
> Thanks,
>
> 	M.
>
> --
> Without deviation from the norm, progress is not possible.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ