Message-ID: <aD7B96BiSb6mK9Bj@lpieralisi>
Date: Tue, 3 Jun 2025 11:35:51 +0200
From: Lorenzo Pieralisi <lpieralisi@...nel.org>
To: Zenghui Yu <yuzenghui@...wei.com>
Cc: Marc Zyngier <maz@...nel.org>, linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org,
Thomas Gleixner <tglx@...utronix.de>,
Sascha Bischoff <sascha.bischoff@....com>,
Timothy Hayes <timothy.hayes@....com>
Subject: Re: [PATCH v2 3/5] genirq/msi: Move prepare() call to per-device
allocation
On Tue, Jun 03, 2025 at 04:22:47PM +0800, Zenghui Yu wrote:
> Hi Marc,
>
> On 2025/5/14 0:31, Marc Zyngier wrote:
> > The current device MSI infrastructure is subtly broken, as it
> > will issue an .msi_prepare() callback into the MSI controller
> > driver every time it needs to allocate an MSI. That's pretty wrong,
> > as the contract (or unwarranted assumption, depending who you ask)
> > between the MSI controller and the core code is that .msi_prepare()
> > is called exactly once per device.
> >
> > This leads to some subtle breakage in said MSI controller drivers,
> > as it gives the impression that there are multiple endpoints sharing
> > a bus identifier (RID in PCI parlance, DID for GICv3+). It implies
> > that whatever allocation the ITS driver (for example) has done on
> > behalf of these devices cannot be undone, as there is no way to
> > track the shared state. This is particularly bad for wire-MSI devices,
> > for which .msi_prepare() is called for. each. input. line.
> >
> > To address this issue, move the call to .msi_prepare() to take place
> > at the point of irq domain allocation, which is the only place that
> > makes sense. The msi_alloc_info_t structure is made part of the
> > msi_domain_template, so that its life-cycle is that of the domain
> > as well.
> >
> > Finally, the msi_info::alloc_data field is made to point at this
> > allocation tracking structure, ensuring that it is carried around
> > the block.
> >
> > This is all pretty straightforward, except for the non-device-MSI
> > leftovers, which still have to call .msi_prepare() at the old
> > spot. One day...
> >
> > Signed-off-by: Marc Zyngier <maz@...nel.org>
> > ---
> > include/linux/msi.h | 2 ++
> > kernel/irq/msi.c | 35 +++++++++++++++++++++++++++++++----
> > 2 files changed, 33 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/linux/msi.h b/include/linux/msi.h
> > index 63c23003ec9b7..ba1c77a829a1c 100644
> > --- a/include/linux/msi.h
> > +++ b/include/linux/msi.h
> > @@ -516,12 +516,14 @@ struct msi_domain_info {
> > * @chip: Interrupt chip for this domain
> > * @ops: MSI domain ops
> > * @info: MSI domain info data
> > + * @alloc_info: MSI domain allocation data (arch specific)
> > */
> > struct msi_domain_template {
> > char name[48];
> > struct irq_chip chip;
> > struct msi_domain_ops ops;
> > struct msi_domain_info info;
> > + msi_alloc_info_t alloc_info;
> > };
> >
> > /*
> > diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
> > index 31378a2535fb9..07eb857efd15e 100644
> > --- a/kernel/irq/msi.c
> > +++ b/kernel/irq/msi.c
> > @@ -59,7 +59,8 @@ struct msi_ctrl {
> > static void msi_domain_free_locked(struct device *dev, struct msi_ctrl *ctrl);
> > static unsigned int msi_domain_get_hwsize(struct device *dev, unsigned int domid);
> > static inline int msi_sysfs_create_group(struct device *dev);
> > -
> > +static int msi_domain_prepare_irqs(struct irq_domain *domain, struct device *dev,
> > + int nvec, msi_alloc_info_t *arg);
> >
> > /**
> > * msi_alloc_desc - Allocate an initialized msi_desc
> > @@ -1023,6 +1024,7 @@ bool msi_create_device_irq_domain(struct device *dev, unsigned int domid,
> > bundle->info.ops = &bundle->ops;
> > bundle->info.data = domain_data;
> > bundle->info.chip_data = chip_data;
> > + bundle->info.alloc_data = &bundle->alloc_info;
> >
> > pops = parent->msi_parent_ops;
> > snprintf(bundle->name, sizeof(bundle->name), "%s%s-%s",
> > @@ -1061,11 +1063,18 @@ bool msi_create_device_irq_domain(struct device *dev, unsigned int domid,
> > if (!domain)
> > return false;
> >
> > + domain->dev = dev;
> > + dev->msi.data->__domains[domid].domain = domain;
> > +
> > + if (msi_domain_prepare_irqs(domain, dev, hwsize, &bundle->alloc_info)) {
>
> Does it work for MSI?

This means that it does not work for MSI for you as it stands, right?
If you spotted an issue, thanks for that; please report it fully.

> hwsize is 1 in the MSI case, without taking pci_msi_vec_count() into account.
>
> bool pci_setup_msi_device_domain(struct pci_dev *pdev)
> {
> [...]
>
> return pci_create_device_domain(pdev, &pci_msi_template, 1);

I had a stab at it with GICv5 models and an MSI-capable device, and this
indeed calls the ITS msi_prepare() callback with 1 as the vector count, so
we size the device tables wrongly.

The question is why pci_create_device_domain() is called here with
hwsize == 1. Probably, before this series, the ITS MSI parent code was
fixing the size up, so we did not notice; I need to check.

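To make the mis-sizing concrete, here is a toy model in plain C. It is not
kernel code: every toy_* name is hypothetical, and the power-of-two rounding
is only an assumption made for illustration, not a statement about what the
ITS driver does. It shows only that a table sized once at prepare time from
nvec == 1 has no room for a second vector.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-device ITS state; not a kernel structure. */
struct toy_its_device {
	unsigned int nr_events;	/* event IDs the per-device table can hold */
};

/* Round n up to the next power of two (assumed sizing policy). */
static unsigned int toy_roundup_pow_of_two(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Stand-in for a .msi_prepare() callback: the table is sized exactly once,
 * from the vector count passed in at domain creation time.
 */
static void toy_msi_prepare(struct toy_its_device *its_dev, unsigned int nvec)
{
	assert(nvec > 0);
	its_dev->nr_events = toy_roundup_pow_of_two(nvec);
}

/* An event ID is usable only if it fits in the table sized at prepare time. */
static bool toy_event_fits(const struct toy_its_device *its_dev,
			   unsigned int event_id)
{
	return event_id < its_dev->nr_events;
}
```

In this model, calling toy_msi_prepare() with nvec == 1 leaves room for event
0 only, so a multi-MSI device asking for event 1 or above would not fit.
Passing the device's real vector count instead (what pci_msi_vec_count()
reports in the PCI case) sizes the table to cover all of them.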
Lorenzo