[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87jz5tdl9d.wl-maz@kernel.org>
Date: Tue, 03 Jun 2025 14:09:50 +0100
From: Marc Zyngier <maz@...nel.org>
To: Lorenzo Pieralisi <lpieralisi@...nel.org>
Cc: Zenghui Yu <yuzenghui@...wei.com>,
linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org,
Thomas Gleixner <tglx@...utronix.de>,
Sascha Bischoff <sascha.bischoff@....com>,
Timothy Hayes <timothy.hayes@....com>
Subject: Re: [PATCH v2 3/5] genirq/msi: Move prepare() call to per-device allocation
On Tue, 03 Jun 2025 10:35:51 +0100,
Lorenzo Pieralisi <lpieralisi@...nel.org> wrote:
>
> On Tue, Jun 03, 2025 at 04:22:47PM +0800, Zenghui Yu wrote:
> > > + domain->dev = dev;
> > > + dev->msi.data->__domains[domid].domain = domain;
> > > +
> > > + if (msi_domain_prepare_irqs(domain, dev, hwsize, &bundle->alloc_info)) {
> >
> > Does it work for MSI?
>
> This means that it does not work for MSI for you as it stands, right ?
>
> If you spotted an issue, thanks for that, report it fully please.
Honestly, you're barking up the wrong tree. Zenghui points us to a
glaring bug in the core code, with detailed information on what could
go wrong, as well as what is wrong in the code. It doesn't get better
than that.
The usual level of bug reports is "its b0rken", sometimes followed by
a trace with lots of hex and no information. Spot the difference?
>
> > hwsize is 1 in the MSI case, without taking pci_msi_vec_count() into account.
> >
> > bool pci_setup_msi_device_domain(struct pci_dev *pdev)
> > {
> > [...]
> >
> > return pci_create_device_domain(pdev, &pci_msi_template, 1);
>
> I had a stab at it with GICv5 models and an MSI capable device and this indeed
> calls the ITS msi_prepare() callback with 1 as vector count, so we size
> the device tables wrongly.
Not wrongly. Exactly as instructed.
>
> The question is why pci_create_device_domain() is called here with
> hwsize == 1. Probably, before this series, the ITS MSI parent code was
> fixing the size up so we did not notice, I need to check.
The GICv3 ITS code would upgrade the vector count to the next power of
two (one bit of EID space -> 2 MSIs), but with the device domain
squarely set to 1, the endpoint driver would never get more. It is
prepared to fail gracefully though, hence nothing really breaks.
I don't think this patch makes anything regress though. Commit
15c72f824b327 seems to be the offending one. If Zenghui confirms that
the hack I posted separately works for him, I'll follow up with a
"real" patch.
M.
--
Jazz isn't dead. It just smells funny.
Powered by blists - more mailing lists