lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 17 Dec 2022 05:36:44 -0800
From:   Guenter Roeck <linux@...ck-us.net>
To:     Marc Zyngier <maz@...nel.org>
Cc:     Matthew Rosato <mjrosato@...ux.ibm.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Niklas Schnelle <schnelle@...ux.ibm.com>,
        LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
        Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
        linux-pci@...r.kernel.org, Bjorn Helgaas <bhelgaas@...gle.com>,
        Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Jason Gunthorpe <jgg@...lanox.com>,
        Dave Jiang <dave.jiang@...el.com>,
        Alex Williamson <alex.williamson@...hat.com>,
        Kevin Tian <kevin.tian@...el.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Logan Gunthorpe <logang@...tatee.com>,
        Ashok Raj <ashok.raj@...el.com>, Jon Mason <jdmason@...zu.us>,
        Allen Hubbe <allenbh@...il.com>,
        Michael Ellerman <mpe@...erman.id.au>,
        Nicholas Piggin <npiggin@...il.com>,
        Christophe Leroy <christophe.leroy@...roup.eu>
Subject: Re: [patch V3 09/33] genirq/msi: Add range checking to
 msi_insert_desc()

On 12/17/22 02:46, Marc Zyngier wrote:
> On Sat, 17 Dec 2022 00:45:50 +0000,
> Guenter Roeck <linux@...ck-us.net> wrote:
>>
>> On 12/16/22 05:58, Marc Zyngier wrote:
>> [ ... ]
>>
>>>> With both these fixes applied, it actually then leads to the very
>>>> next WARN_ON failing in msi_ctrl_valid...  Because ctrl->last ==
>>>> hwsize.  I think Thomas' initial fix for msi_domain_get_hwsize has
>>>> an off-by-1 error, I think we should return MSI_XA_DOMAIN_SIZE for
>>>> msi_domain_get_hwsize instead.
>>>
>>> Yes, that's a good point, and that's consistent with what
>>> __msi_create_irq_domain() does already, assuming MSI_XA_DOMAIN_SIZE
>>> when info->hwsize is 0. No reason to do something else here.
>>>
>>> I'll update Thomas' patch. Once Guenter confirms that PPC is OK, I'll
>>> send it out.
>>>
>> With
>>
>> 7a27b6136dcb (local/testing, testing-msi) genirq/msi: Return MSI_XA_DOMAIN_SIZE as the maximum MSI index when no domain is present
>> c581d525bb1d genirq/msi: Check for the presence of an irq domain when validating msi_ctrl
>> 9d33edb20f7e Merge tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>
>> I still get the following runtime warning.
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 8 at kernel/irq/msi.c:196 .msi_domain_free_descs+0x144/0x170
>> Modules linked in:
>> CPU: 0 PID: 8 Comm: kworker/u2:0 Tainted: G                 N 6.1.0-01957-g7a27b6136dcb #1
>> Hardware name: QEMU ppce500 e5500 0x80240020 QEMU e500
>> Workqueue: nvme-reset-wq .nvme_reset_work
>> NIP:  c000000000107d54 LR: c000000000107d44 CTR: 0000000000000000
>> REGS: c0000000041e74d0 TRAP: 0700   Tainted: G                 N  (6.1.0-01957-g7a27b6136dcb)
>> MSR:  0000000080029002 <CE,EE,ME>  CR: 44002282  XER: 20000000
>> IRQMASK: 0
>> GPR00: c000000000107d44 c0000000041e7770 c000000001629c00 c000000004e748a0
>> GPR04: 000000005358db0a c000000001ce7a00 c00000000423b5d0 000000004735aaa2
>> GPR08: 0000000000000002 0000000000000013 c00000000423acc0 c00000000214a998
>> GPR12: 0000000024002282 c000000002579000 c00000000008e190 c000000004173540
>> GPR16: 0000000000000000 c0000000043810b8 0000000000000000 0000000000000001
>> GPR20: c0000000060b22d8 c0000000060a70f0 0000000000000000 c000000001996800
>> GPR24: c0000000017df6c0 c0000000043810b8 c0000000060b2388 c0000000060b2000
>> GPR28: ffffffffffffffff c0000000041e7888 c000000006025ac8 c000000004e748a0
>> NIP [c000000000107d54] .msi_domain_free_descs+0x144/0x170
>> LR [c000000000107d44] .msi_domain_free_descs+0x134/0x170
>> Call Trace:
>> [c0000000041e7770] [c000000000107d44] .msi_domain_free_descs+0x134/0x170 (unreliable)
>> [c0000000041e7810] [c0000000001085d8] .msi_domain_free_msi_descs_range+0x38/0x70
>> [c0000000041e78a0] [c0000000008d000c] .pci_msi_teardown_msi_irqs+0x4c/0xa0
>> [c0000000041e7920] [c0000000008cf9e8] .pci_free_msi_irqs+0x18/0x50
>> [c0000000041e79a0] [c0000000008cd8d0] .pci_free_irq_vectors+0x80/0xb0
>> [c0000000041e7a20] [c000000000a6d2a0] .nvme_reset_work+0x870/0x1780
>> [c0000000041e7bb0] [c000000000080e68] .process_one_work+0x2d8/0x7b0
>> [c0000000041e7c90] [c0000000000813d8] .worker_thread+0x98/0x4f0
>> [c0000000041e7d70] [c00000000008e2cc] .kthread+0x13c/0x150
>> [c0000000041e7e10] [c0000000000005d8] .ret_from_kernel_thread+0x58/0x60
>> Instruction dump:
>> 7fc3f378 48ff1ca9 60000000 7c7f1b79 41c2002c e8810070 7fc3f378 48ff3491
>> 60000000 813f0000 2c090000 41e2ffb0 <0fe00000> 60000000 60000000 ebc10090
>> irq event stamp: 98168
>> hardirqs last  enabled at (98167): [<c00000000110a274>] ._raw_spin_unlock_irqrestore+0x84/0xd0
>> hardirqs last disabled at (98168): [<c000000000010b58>] .program_check_exception+0x38/0x120
>> softirqs last  enabled at (97760): [<c00000000110b4dc>] .__do_softirq+0x60c/0x678
>> softirqs last disabled at (97749): [<c000000000004d20>] .do_softirq_own_stack+0x30/0x50
>> ---[ end trace 0000000000000000 ]---
>> nvme nvme0: 1/0/0 default/read/poll queues
>> nvme nvme0: Ignoring bogus Namespace Identifiers
>> ...
>>
>> The system boots fine, though. This is seen when booting the ppce500
>> machine with e5500 CPU and corenet64_smp_defconfig from nvme.
> 
> +PPC folks.
> 
> Thanks for the report.
> 
> I managed to reproduce this, although in a limited way (a SMP qemu
> instance wouldn't boot at all). The problem is that the core MSI code
> now assumes that if the arch code is in charge of breaking the
> association of a MSI with a device, it is also in charge of clearing
> the irq in the MSI descriptor.
> 
> This matches the s390 behaviour, but not powerpc's, hence the splat
> and the leaked MSI descriptors. The minimal fix should be as follow,
> which I'll add to the pile of patches.
> 

Confirmed, the patch below fixes the ppc problem.

Thanks,
Guenter

> Thanks,
> 
> 	M.
> 
> diff --git a/arch/powerpc/platforms/4xx/hsta_msi.c b/arch/powerpc/platforms/4xx/hsta_msi.c
> index d4f7fff1fc87..e11b57a62b05 100644
> --- a/arch/powerpc/platforms/4xx/hsta_msi.c
> +++ b/arch/powerpc/platforms/4xx/hsta_msi.c
> @@ -115,6 +115,7 @@ static void hsta_teardown_msi_irqs(struct pci_dev *dev)
>   		msi_bitmap_free_hwirqs(&ppc4xx_hsta_msi.bmp, irq, 1);
>   		pr_debug("%s: Teardown IRQ %u (index %u)\n", __func__,
>   			 entry->irq, irq);
> +		entry->irq = 0;
>   	}
>   }
>   
> diff --git a/arch/powerpc/platforms/cell/axon_msi.c b/arch/powerpc/platforms/cell/axon_msi.c
> index 5b012abca773..0c11aad896c7 100644
> --- a/arch/powerpc/platforms/cell/axon_msi.c
> +++ b/arch/powerpc/platforms/cell/axon_msi.c
> @@ -289,6 +289,7 @@ static void axon_msi_teardown_msi_irqs(struct pci_dev *dev)
>   	msi_for_each_desc(entry, &dev->dev, MSI_DESC_ASSOCIATED) {
>   		irq_set_msi_desc(entry->irq, NULL);
>   		irq_dispose_mapping(entry->irq);
> +		entry->irq = 0;
>   	}
>   }
>   
> diff --git a/arch/powerpc/platforms/pasemi/msi.c b/arch/powerpc/platforms/pasemi/msi.c
> index dc1846660005..166c97fff16d 100644
> --- a/arch/powerpc/platforms/pasemi/msi.c
> +++ b/arch/powerpc/platforms/pasemi/msi.c
> @@ -66,6 +66,7 @@ static void pasemi_msi_teardown_msi_irqs(struct pci_dev *pdev)
>   		hwirq = virq_to_hw(entry->irq);
>   		irq_set_msi_desc(entry->irq, NULL);
>   		irq_dispose_mapping(entry->irq);
> +		entry->irq = 0;
>   		msi_bitmap_free_hwirqs(&msi_mpic->msi_bitmap, hwirq, ALLOC_CHUNK);
>   	}
>   }
> diff --git a/arch/powerpc/sysdev/fsl_msi.c b/arch/powerpc/sysdev/fsl_msi.c
> index 73c2d70706c0..57978a44d55b 100644
> --- a/arch/powerpc/sysdev/fsl_msi.c
> +++ b/arch/powerpc/sysdev/fsl_msi.c
> @@ -132,6 +132,7 @@ static void fsl_teardown_msi_irqs(struct pci_dev *pdev)
>   		msi_data = irq_get_chip_data(entry->irq);
>   		irq_set_msi_desc(entry->irq, NULL);
>   		irq_dispose_mapping(entry->irq);
> +		entry->irq = 0;
>   		msi_bitmap_free_hwirqs(&msi_data->bitmap, hwirq, 1);
>   	}
>   }
> diff --git a/arch/powerpc/sysdev/mpic_u3msi.c b/arch/powerpc/sysdev/mpic_u3msi.c
> index 1d8cfdfdf115..492cb03c0b62 100644
> --- a/arch/powerpc/sysdev/mpic_u3msi.c
> +++ b/arch/powerpc/sysdev/mpic_u3msi.c
> @@ -108,6 +108,7 @@ static void u3msi_teardown_msi_irqs(struct pci_dev *pdev)
>   		hwirq = virq_to_hw(entry->irq);
>   		irq_set_msi_desc(entry->irq, NULL);
>   		irq_dispose_mapping(entry->irq);
> +		entry->irq = 0;
>   		msi_bitmap_free_hwirqs(&msi_mpic->msi_bitmap, hwirq, 1);
>   	}
>   }
> 

Powered by blists - more mailing lists