Message-ID: <7dea38f38180bd6b5305f72a366ef3df066000de.camel@linux.ibm.com>
Date: Wed, 03 Dec 2025 13:32:49 +0100
From: Gerd Bayer <gbayer@...ux.ibm.com>
To: Tobias Schumacher <ts@...ux.ibm.com>,
        Heiko Carstens <hca@...ux.ibm.com>,
        Vasily Gorbik <gor@...ux.ibm.com>,
        Alexander Gordeev <agordeev@...ux.ibm.com>,
        Christian Borntraeger <borntraeger@...ux.ibm.com>,
        Sven Schnelle <svens@...ux.ibm.com>,
        Niklas Schnelle <schnelle@...ux.ibm.com>,
        Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
        Halil Pasic <pasic@...ux.ibm.com>,
        Matthew Rosato <mjrosato@...ux.ibm.com>,
        Thomas Gleixner <tglx@...utronix.de>
Cc: linux-kernel@...r.kernel.org, linux-s390@...r.kernel.org
Subject: Re: [PATCH v7 2/2] s390/pci: Migrate s390 IRQ logic to IRQ domain API

On Wed, 2025-12-03 at 08:53 +0100, Tobias Schumacher wrote:
> On Tue Dec 2, 2025 at 7:14 PM CET, Gerd Bayer wrote:
> > On Thu, 2025-11-27 at 16:07 +0100, Tobias Schumacher wrote:
> >   [ ... snip ... ]
> > 
> > > diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
> > > index e73be96ce5fe6473fc193d65b8f0ff635d6a98ba..2ac0fab605a83a2f06be6a0a68718e528125ced6 100644
> > > --- a/arch/s390/pci/pci_irq.c
> > > +++ b/arch/s390/pci/pci_irq.c
> > > @@ -290,146 +325,196 @@ static int __alloc_airq(struct zpci_dev *zdev, int msi_vecs,
> > >  	return 0;
> > >  }
> > >  
> > > -int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
> > > +bool arch_restore_msi_irqs(struct pci_dev *pdev)
> > >  {
> > > -	unsigned int hwirq, msi_vecs, irqs_per_msi, i, cpu;
> > >  	struct zpci_dev *zdev = to_zpci(pdev);
> > > -	struct msi_desc *msi;
> > > -	struct msi_msg msg;
> > > -	unsigned long bit;
> > > -	int cpu_addr;
> > > -	int rc, irq;
> > >  
> > > +	zpci_set_irq(zdev);
> > > +	return true;
> > > +}
> > > 
> > 
> > It's always a little tricky to distinguish which code handles both MSI
> > and MSI-X or just MSI proper when routines have _msi_ in their name.
> > But apparently, both __pci_restore_msi_state() and
> > __pci_restore_msix_state() inside pci_restore_msi_state() do call
> > arch_restore_msi_irqs() - so life is good!
> 
> Regarding arch_restore_msi_irqs(), the main change in the patchset is
> that it is now also conditionally called from zpci_reenable_device().

Sorry, I don't follow: This patch adds a conditional call to
zpci_set_irq() to zpci_reenable_device() - not to arch_restore_msi_irqs().

> This is because in the recovery case, __pci_restore_msix_state() does
> not call arch_restore_msi_irqs(), it exits directly at the beginning
> because dev->msix_enabled evaluates to false.

Does that mean arch_restore_msi_irqs() is dead code?
After re-reading pci_save_state()/pci_restore_state(), it sounds as if
arch_restore_msi_irqs() may be useful after all, for device drivers
that consider the MSI/MSI-X interrupt setup part of their save/restore
snapshot. Maybe we just happen not to have exercised any of those yet?

So probably just leave it in.
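
For reference, this is how I read the control flow you describe, as a
tiny user-space model. It is only a sketch: struct fake_pci_dev and all
fake_* names are made up for illustration, and the real
__pci_restore_msix_state() does a lot more than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-in for struct pci_dev. */
struct fake_pci_dev {
	bool msix_enabled;
	int arch_restore_calls;	/* counts invocations of the arch hook */
};

/* Stand-in for arch_restore_msi_irqs(): only records that it ran. */
static bool fake_arch_restore_msi_irqs(struct fake_pci_dev *dev)
{
	dev->arch_restore_calls++;
	return true;
}

/* Models the early exit described for __pci_restore_msix_state(). */
static void fake_restore_msix_state(struct fake_pci_dev *dev)
{
	if (!dev->msix_enabled)
		return;		/* recovery case: the hook is never reached */

	fake_arch_restore_msi_irqs(dev);
}

/* Usage: only a device with MSI-X still enabled reaches the hook. */
static void fake_restore_selfcheck(void)
{
	struct fake_pci_dev recovering = { .msix_enabled = false };
	struct fake_pci_dev restoring = { .msix_enabled = true };

	fake_restore_msix_state(&recovering);
	fake_restore_msix_state(&restoring);

	assert(recovering.arch_restore_calls == 0);
	assert(restoring.arch_restore_calls == 1);
}
```

So the hook stays reachable for the driver-initiated save/restore case,
while the recovery case has to register the airq some other way.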

> With the legacy API, IRQs are later re-enabled using
> arch_setup_msi_irqs(), which also registers the airq with the hw. With
> the MSI parent domain, zpci_msi_prepare() would register the airq, but
> is not called in the recovery path. This is why it is now added to
> zpci_reenable_device().
> 
> 
> >   [ ... snip ... ]
> > 
> > > +static void zpci_msi_domain_free(struct irq_domain *domain, unsigned int virq,
> > > +				 unsigned int nr_irqs)
> > > +{
> > > +	struct irq_data *d;
> > > +	int i;
> > >  
> > > -	return (zdev->msi_nr_irqs == nvec) ? 0 : zdev->msi_nr_irqs;
> > > +	for (i = 0; i < nr_irqs; i++) {
> > > +		d = irq_domain_get_irq_data(domain, virq + i);
> > > +		irq_domain_reset_irq_data(d);
> > 
> > Question: zpci_msi_domain_alloc() did modify airq data, can this be
> > left as is in zpci_msi_domain_free()?
> 
> I was thinking about this myself and came to the conclusion that it is
> fine. zpci_msi_domain_alloc() sets the ptr to the msi parent domain and
> data to the encoded hwirq. Both fields are only required in the IRQ
> handler.
> * When free() is called, the corresponding interrupt was already
>   deactivated by the hardware, so hardware shouldn't generate it
>   anymore anyway.
> * If, for whatever reason, hw still generates the interrupt,
>   generic_handle_domain_irq returns an error since the hwirq cannot be
>   resolved.
> * If the IRQ gets allocated again, the fields are written again before
>   the IRQ is activated. The data written will even be the same
>   as it was before.

Well, this all assumes there are no further errors in the code...
I'd still vote to clean up airq resources when they are no longer
needed. That is just acting defensively in case some weird (future)
path still tries to use them after they were put to rest. It also
helps if you ever have to do post-mortem dump analysis and make sense
of stale "garbage" left behind in these fields.
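
To illustrate the idea, here is a user-space sketch, not kernel code.
struct fake_airq_iv and the fake_* helpers are hypothetical stand-ins
for the ptr/data fields that zpci_msi_domain_alloc() fills in; the real
free path would use whatever accessors the alloc path uses.

```c
#include <assert.h>
#include <stddef.h>

#define FAKE_NR_BITS 8

/* Hypothetical stand-in for the per-bit ptr/data arrays of an airq vector. */
struct fake_airq_iv {
	void *ptr[FAKE_NR_BITS];
	unsigned long data[FAKE_NR_BITS];
};

/* Models alloc: store the parent domain pointer and the encoded hwirq. */
static void fake_msi_alloc(struct fake_airq_iv *iv, unsigned int bit,
			   void *domain, unsigned long hwirq)
{
	iv->ptr[bit] = domain;
	iv->data[bit] = hwirq;
}

/* Defensive free: wipe both fields so no stale values survive. */
static void fake_msi_free(struct fake_airq_iv *iv, unsigned int bit)
{
	iv->ptr[bit] = NULL;
	iv->data[bit] = 0;
}

/* Models a handler that refuses to dispatch through a cleared slot. */
static int fake_handle_irq(const struct fake_airq_iv *iv, unsigned int bit)
{
	if (!iv->ptr[bit])
		return -1;	/* slot was freed, nothing to deliver to */
	return 0;
}

/* Usage: a late interrupt on a freed bit is rejected, and data is clean. */
static void fake_msi_selfcheck(void)
{
	struct fake_airq_iv iv = { 0 };
	int domain;	/* any non-NULL address serves as the "domain" */

	fake_msi_alloc(&iv, 3, &domain, 42);
	assert(fake_handle_irq(&iv, 3) == 0);

	fake_msi_free(&iv, 3);
	assert(fake_handle_irq(&iv, 3) == -1);
	assert(iv.data[3] == 0);
}
```

With the fields cleared, a stray use fails at an obvious NULL check
instead of relying on the hwirq lookup failing further down, and a dump
shows at a glance which slots are actually live.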

> 
> >    [ ... snip ... ]
> > 
> > 
> > > +
> > > +int zpci_create_parent_msi_domain(struct zpci_bus *zbus)
> > > +{
> > > +	char fwnode_name[18];
> > >  
> > > -	if (zdev->aisb != -1UL) {
> > > -		zpci_ibv[zdev->aisb] = NULL;
> > > -		airq_iv_free_bit(zpci_sbv, zdev->aisb);
> > > -		zdev->aisb = -1UL;
> > > +	snprintf(fwnode_name, sizeof(fwnode_name), "ZPCI_MSI_DOM_%04x", zbus->domain_nr);
> > > +	struct irq_domain_info info = {
> > > +		.fwnode		= irq_domain_alloc_named_fwnode(fwnode_name),
> > > +		.ops		= &zpci_msi_domain_ops,
> > > +	};
> > > +
> > > +	if (!info.fwnode) {
> > > +		pr_err("Failed to allocate fwnode for MSI IRQ domain\n");
> > > +		return -ENOMEM;
> > >  	}
> > > -	if (zdev->aibv) {
> > > -		airq_iv_release(zdev->aibv);
> > > -		zdev->aibv = NULL;
> > > +
> > > +	if (irq_delivery == FLOATING)
> > > +		zpci_msi_parent_ops.required_flags |= MSI_FLAG_NO_AFFINITY;
> > 
> > Add empty line here, so the intent is clear that the following
> > assignment is executed unconditionally.
> 
> Ok.
> 
> >    [ ... snip ... ]
> >  
> > > @@ -466,6 +551,7 @@ static int __init zpci_directed_irq_init(void)
> > >  		 * is only done on the first vector.
> > >  		 */
> > >  		zpci_ibv[cpu] = airq_iv_create(cache_line_size() * BITS_PER_BYTE,
> > > +					       AIRQ_IV_PTR |
> > >  					       AIRQ_IV_DATA |
> > >  					       AIRQ_IV_CACHELINE |
> > >  					       (!cpu ? AIRQ_IV_ALLOC : 0), NULL);
> > 
> > 
> > This looks very good to me already. Unfortunately, I was unable to
> > relieve my MSI vs. MSI-X anxiety regarding arch_restore_msi_irqs() with
> > a test, since the only MSI-using PCI function (ISM) does not support
> > PCI auto-recovery :(
> > 
> > But a mlx5 VF now recovers just fine!
> 
> Did my explanation above help with this?

Yes, thank you. But I would still ask you to address the airq cleanup
in zpci_msi_domain_free().

> Thanks
> Tobias

Thanks,
Gerd
