Message-Id: <DEOMT00FUY8M.2J9C1SM9BOTTH@linux.ibm.com>
Date: Wed, 03 Dec 2025 14:55:44 +0100
From: "Tobias Schumacher" <ts@...ux.ibm.com>
To: "Gerd Bayer" <gbayer@...ux.ibm.com>,
 "Tobias Schumacher" <ts@...ux.ibm.com>,
 "Heiko Carstens" <hca@...ux.ibm.com>,
 "Vasily Gorbik" <gor@...ux.ibm.com>,
 "Alexander Gordeev" <agordeev@...ux.ibm.com>,
 "Christian Borntraeger" <borntraeger@...ux.ibm.com>,
 "Sven Schnelle" <svens@...ux.ibm.com>,
 "Niklas Schnelle" <schnelle@...ux.ibm.com>,
 "Gerald Schaefer" <gerald.schaefer@...ux.ibm.com>,
 "Halil Pasic" <pasic@...ux.ibm.com>,
 "Matthew Rosato" <mjrosato@...ux.ibm.com>,
 "Thomas Gleixner" <tglx@...utronix.de>
Cc: <linux-kernel@...r.kernel.org>, <linux-s390@...r.kernel.org>
Subject: Re: [PATCH v7 2/2] s390/pci: Migrate s390 IRQ logic to IRQ domain API
On Wed Dec 3, 2025 at 1:32 PM CET, Gerd Bayer wrote:
> On Wed, 2025-12-03 at 08:53 +0100, Tobias Schumacher wrote:
>> On Tue Dec 2, 2025 at 7:14 PM CET, Gerd Bayer wrote:
>> > On Thu, 2025-11-27 at 16:07 +0100, Tobias Schumacher wrote:
>> > [ ... snip ... ]
>> >
>> > > diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
>> > > index e73be96ce5fe6473fc193d65b8f0ff635d6a98ba..2ac0fab605a83a2f06be6a0a68718e528125ced6 100644
>> > > --- a/arch/s390/pci/pci_irq.c
>> > > +++ b/arch/s390/pci/pci_irq.c
>> > > @@ -290,146 +325,196 @@ static int __alloc_airq(struct zpci_dev *zdev, int msi_vecs,
>> > > return 0;
>> > > }
>> > >
>> > > -int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
>> > > +bool arch_restore_msi_irqs(struct pci_dev *pdev)
>> > > {
>> > > - unsigned int hwirq, msi_vecs, irqs_per_msi, i, cpu;
>> > > struct zpci_dev *zdev = to_zpci(pdev);
>> > > - struct msi_desc *msi;
>> > > - struct msi_msg msg;
>> > > - unsigned long bit;
>> > > - int cpu_addr;
>> > > - int rc, irq;
>> > >
>> > > + zpci_set_irq(zdev);
>> > > + return true;
>> > > +}
>> > >
>> >
>> > It's always a little tricky to distinguish which code handles both MSI
>> > and MSI-X or just MSI proper when routines have _msi_ in their name.
>> > But apparently, both __pci_restore_msi_state() and
>> > __pci_restore_msix_state() inside pci_restore_msi_state() do call
>> > arch_restore_msi_irqs() - so life is good!
>>
>> Regarding arch_restore_msi_irqs() the main change in the patchset is
>> that it is now also conditionally called from zpci_reenable_device().
Right, sorry.
> Sorry, I don't follow: This patch adds a conditional call to
> zpci_set_irq() to zpci_reenable_device() - not arch_restore_msi_irqs().
>
>> This is because in the recovery case, __pci_restore_msix_state() does
>> not call arch_restore_msi_irqs(), it exits directly at the beginning
>> because dev->msix_enabled evaluates to false.
>
> Does that mean arch_restore_msi_irqs() is dead code?
> After re-reading pci_save_state()/pci_restore_state(), it sounds as if
> arch_restore_msi_irqs() may be useful after all, with device drivers
> that consider the MSI/MSI-X interrupt setup part of their save/restore
> snapshot? And we just happen to have not executed any of those, maybe?
>
> So probably just leave it in.
No, it's not dead code. After zpcictl --reset-fw, MSI-X interrupts
are shut down before pci_restore_state() runs, which is why
arch_restore_msi_irqs() is not called. But a driver can still call
pci_save_state() and pci_restore_state() without shutting down MSI IRQs
first, in which case arch_restore_msi_irqs() is called.
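
To illustrate the early exit described above, here is a minimal userspace
model. It is not kernel code: all model_* names are hypothetical and only
mirror the functions discussed in this thread, and the restore path is
reduced to the single msix_enabled check.

```c
#include <stdbool.h>

/*
 * Userspace model only. The model_* names are hypothetical and merely
 * mirror the kernel functions discussed in this thread.
 */
struct model_dev {
	bool msix_enabled;       /* stands in for dev->msix_enabled */
	int arch_restore_calls;  /* how often the arch hook ran */
};

/* Stands in for arch_restore_msi_irqs(): just count the call. */
bool model_arch_restore_msi_irqs(struct model_dev *dev)
{
	dev->arch_restore_calls++;
	return true;
}

/* Stands in for __pci_restore_msix_state(): early exit when MSI-X is off. */
void model_restore_msix_state(struct model_dev *dev)
{
	if (!dev->msix_enabled)
		return;
	model_arch_restore_msi_irqs(dev);
}

/* Returns how many times the arch hook ran for the given starting state. */
int model_restore_calls(bool msix_enabled)
{
	struct model_dev dev = { .msix_enabled = msix_enabled };

	model_restore_msix_state(&dev);
	return dev.arch_restore_calls;
}
```

In this model, model_restore_calls(false) corresponds to the recovery case
(MSI-X already shut down) and the hook never runs, while
model_restore_calls(true) corresponds to a driver that saves and restores
state with MSI-X still enabled.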
>> With the legacy API, IRQs are later re-enabled using
>> arch_setup_msi_irqs(), which also registers the airq with the hw. With
>> the MSI parent domain, zpci_msi_prepare() would register the airq, but
>> is not called in the recovery path. This is why it is now added to
>> zpci_reenable_device().
>>
>>
>> > [ ... snip ... ]
>> >
>> > > +static void zpci_msi_domain_free(struct irq_domain *domain, unsigned int virq,
>> > > + unsigned int nr_irqs)
>> > > +{
>> > > + struct irq_data *d;
>> > > + int i;
>> > >
>> > > - return (zdev->msi_nr_irqs == nvec) ? 0 : zdev->msi_nr_irqs;
>> > > + for (i = 0; i < nr_irqs; i++) {
>> > > + d = irq_domain_get_irq_data(domain, virq + i);
>> > > + irq_domain_reset_irq_data(d);
>> >
>> > Question: zpci_msi_domain_alloc() did modify airq data, can this be
>> > left as is in zpci_msi_domain_free()?
>>
>> I was thinking about this myself and came to the conclusion that it is
>> fine. zpci_msi_domain_alloc() sets the ptr to the msi parent domain and
>> data to the encoded hwirq. Both fields are only required in the IRQ
>> handler.
>> * When free() is called, the corresponding interrupt was already
>> deactivated by the hardware, so hardware shouldn't generate it
>> anymore anyway.
>> * If, for whatever reason, hw still generates the interrupt,
>> generic_handle_domain_irq returns an error since the hwirq cannot be
>> resolved.
>> * If the IRQ gets allocated again, the fields are written again before
>> the IRQ is activated. The data written will even be the same
>> as it was before.
>
> Well, this is all assuming no further errors in the code...
> I'd still vote to clean up airq resources when they are no longer
> needed - just act defensively in case some weird (future) path still
> tries to use these after they got put to rest - or you have to do some
> post-mortem dump analysis and try to make sense of this "garbage".
Ok, I can do that.
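
Roughly what I have in mind, as a userspace sketch rather than the actual
patch: the per-vector ptr and data slots that alloc fills in get zeroed
again in free. The struct and the model_* helpers below are made up for
illustration; the real code would go through the airq_iv helpers instead.

```c
#define MODEL_NR_VECTORS 8

/* Hypothetical stand-in for the ptr/data arrays of an airq vector. */
struct model_airq_iv {
	unsigned long ptr[MODEL_NR_VECTORS];
	unsigned int data[MODEL_NR_VECTORS];
};

/* Model of alloc: record the parent domain and an encoded hwirq per slot. */
int model_domain_alloc(struct model_airq_iv *iv, unsigned int first,
		       unsigned int nr, unsigned long domain_ptr)
{
	unsigned int i;

	if (first > MODEL_NR_VECTORS || nr > MODEL_NR_VECTORS - first)
		return -1;
	for (i = 0; i < nr; i++) {
		iv->ptr[first + i] = domain_ptr;
		iv->data[first + i] = first + i; /* placeholder for the encoded hwirq */
	}
	return 0;
}

/* Model of free: clear both slots so no stale values are left behind. */
int model_domain_free(struct model_airq_iv *iv, unsigned int first,
		      unsigned int nr)
{
	unsigned int i;

	if (first > MODEL_NR_VECTORS || nr > MODEL_NR_VECTORS - first)
		return -1;
	for (i = 0; i < nr; i++) {
		iv->ptr[first + i] = 0;
		iv->data[first + i] = 0;
	}
	return 0;
}
```

With the slots zeroed, a dump taken after free shows empty entries instead
of leftover domain pointers, and a late lookup finds nothing to resolve.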
>> > [ ... snip ... ]
>> >
>> > > @@ -466,6 +551,7 @@ static int __init zpci_directed_irq_init(void)
>> > > * is only done on the first vector.
>> > > */
>> > > zpci_ibv[cpu] = airq_iv_create(cache_line_size() * BITS_PER_BYTE,
>> > > + AIRQ_IV_PTR |
>> > > AIRQ_IV_DATA |
>> > > AIRQ_IV_CACHELINE |
>> > > (!cpu ? AIRQ_IV_ALLOC : 0), NULL);
>> >
>> >
>> > This looks very good to me already. Unfortunately, I was unable to
>> > relieve my MSI vs. MSI-X anxiety regarding arch_restore_msi_irqs() with
>> > a test since the only MSI-using PCI function (ISM) is not supporting
>> > PCI auto-recovery :(
>> >
>> > But a mlx5 VF now recovers just fine!
>>
>> Did my explanation above help with this?
>
> Yes, thank you. But I still would request to address the airq cleanup
> in zpci_msi_domain_free().
Ok, will do.
Thanks
Tobias