Message-ID:
 <MAUPR01MB1107281F1A920C6DBAC30CE1CFE33A@MAUPR01MB11072.INDPRD01.PROD.OUTLOOK.COM>
Date: Wed, 20 Aug 2025 15:40:07 +0800
From: Chen Wang <unicorn_wang@...look.com>
To: Inochi Amaoto <inochiama@...il.com>, Thomas Gleixner <tglx@...utronix.de>
Cc: linux-kernel@...r.kernel.org, Anup Patel <anup@...infault.org>,
 Paul Walmsley <paul.walmsley@...ive.com>,
 Samuel Holland <samuel.holland@...ive.com>, Marc Zyngier <maz@...nel.org>,
 Nam Cao <namcao@...utronix.de>
Subject: Re: Affinity setting problem for emulated MSI on PLIC

Sorry for the late reply; I only recently found time to read this email.

On 7/25/2025 6:25 AM, Inochi Amaoto wrote:
> On Thu, Jul 24, 2025 at 01:07:41PM +0200, Thomas Gleixner wrote:
>> On Thu, Jul 24 2025 at 09:34, Inochi Amaoto wrote:
>>> On Thu, Jul 24, 2025 at 12:50:05AM +0200, Thomas Gleixner wrote:
>>>> May I ask the obvious question:
>>>>
>>>>      How did this obviously dysfunctional driver gain Tested-by and other
>>>>      relevant tags?
>>> I think the SG2042 pci driver does not support affinity setting, so it
>>> is ignored. But for the details I will cc Chen Wang; I guess he can give
>>> some details.
>> It does not matter whether the PCI part supports it or not.
>>
>> PLIC definitely supports it and if the routing entry is not set up, then
>> there is no way that an interrupt is delivered. As the routing entry
>> write is delayed on startup until irq_enable() is invoked, this never
>> happens because of PCI/MSI not having an irq_enable() callback.
>>
Indeed, the SG2042 PCIe driver doesn't set up an `.irq_set_affinity` 
callback. After the parent domain is configured in the sg204x-msi 
driver, the domain hierarchy looks like this:


plic <- sg204x-msi <- device-msi


When the device requests an interrupt, the sg204x-msi's 
`.irq_set_affinity` callback is invoked automatically. Because 
sg204x-msi sets `.irq_set_affinity = irq_chip_set_affinity_parent`, 
this in turn calls `plic_set_affinity()`.
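For illustration, here is a minimal sketch of how such a middle chip in 
a hierarchical domain delegates affinity handling to its parent; the 
struct and callback names are only an example, not the actual 
sg204x-msi code:

#include <linux/irq.h>

/*
 * Example only: a "middle" irq_chip that forwards operations to its
 * parent domain. irq_chip_set_affinity_parent() simply calls the parent
 * irq_data's chip->irq_set_affinity(), which for a PLIC parent is
 * plic_set_affinity().
 */
static const struct irq_chip example_msi_middle_chip = {
        .name             = "example-msi",
        .irq_ack          = irq_chip_ack_parent,
        .irq_mask         = irq_chip_mask_parent,
        .irq_unmask       = irq_chip_unmask_parent,
        .irq_set_affinity = irq_chip_set_affinity_parent, /* -> plic_set_affinity() */
};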

Part of the call stack dump is as follows:

[    0.941653] [<ffffffff800099ee>] plic_set_affinity+0x1e/0xc0
[    0.941660] [<ffffffff8009b576>] irq_chip_set_affinity_parent+0x12/0x20
[    0.941666] [<ffffffff800a2c14>] msi_domain_set_affinity+0x2c/0x86
[    0.941674] [<ffffffff800993cc>] irq_do_set_affinity+0xce/0x1c2
[    0.941681] [<ffffffff80099992>] irq_setup_affinity+0x110/0x1d8
[    0.941687] [<ffffffff8009ca72>] irq_startup+0xd6/0x10a
[    0.941692] [<ffffffff8009a270>] __setup_irq+0x51c/0x5ec
[    0.941698] [<ffffffff8009a3f6>] request_threaded_irq+0xb6/0x160
[    0.941704] [<ffffffff8060bf22>] pci_request_irq+0x70/0xba
[    0.941710] [<ffffffff80902354>] queue_request_irq+0x64/0x6c
[    0.941718] [<ffffffff809034c4>] nvme_setup_io_queues+0x254/0x688
[    0.941724] [<ffffffff80905ccc>] nvme_probe+0x6a4/0x752

...

In `plic_set_affinity()`, the call sequence is:

-> plic_irq_disable
-> plic_irq_enable
    -> plic_irq_toggle
    -> plic_irq_unmask

I think that is what makes the irq work, and that's why I didn't see the 
problem on SG2042.
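Roughly, and paraphrased from my reading of the driver at the time 
rather than quoted verbatim, the relevant part of `plic_set_affinity()` 
is:

/*
 * Paraphrased sketch of plic_set_affinity() before Inochi's "Respect
 * mask state when setting affinity" patch; not the verbatim source.
 * The point is the unconditional disable/enable pair.
 */
static int plic_set_affinity(struct irq_data *d,
                             const struct cpumask *mask_val, bool force)
{
        unsigned int cpu = cpumask_first(mask_val);     /* simplified CPU pick */

        if (cpu >= nr_cpu_ids)
                return -EINVAL;

        plic_irq_disable(d);            /* drop the old routing/enable entry */

        irq_data_update_effective_affinity(d, cpumask_of(cpu));

        plic_irq_enable(d);             /* writes the new routing and unmasks,
                                         * even if the irq was masked */

        return IRQ_SET_MASK_OK_DONE;
}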


> You are right. As I debugged this problem, some interrupts were enabled
> when entering irq_set_affinity(), and they did not have the
> IRQD_AFFINITY_MANAGED flag. So I think the problem is covered up by this:
> plic_set_affinity() enables the irq. As these irqs are enabled in an
> unexpected path, I have noticed the problem before.
I suspect this issue is related to the setting of 
`msi_parent_ops.supported_flags` in the sg204x-msi driver. The SG2042 
doesn't have MSI_FLAG_PCI_MSIX set, while the SG2044 does, which leads 
to different behavior when requesting interrupts for PCI devices. For 
the same NVMe device, the SG2042 only requests a single MSI interrupt, 
while the SG2044 requests multiple. The first interrupt is 
IRQ_STARTUP_NORMAL for both, but from the second one onwards they 
become IRQ_STARTUP_MANAGED, so they take different paths through 
`irq_startup()`. Simply put, normal interrupts go through 
`__irq_startup()` first and then `irq_setup_affinity()`, while managed 
interrupts go through `irq_do_set_affinity()` first and then 
`__irq_startup()`. The SG2042 doesn't have managed interrupts, so I 
didn't see the issue; I suspect the SG2044 does, which is causing the 
issue you're seeing.
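A condensed paraphrase of the two paths in `irq_startup()` (based on my 
reading of kernel/irq/chip.c, simplified, not verbatim):

/*
 * Condensed paraphrase of the relevant part of irq_startup() in
 * kernel/irq/chip.c (simplified, not verbatim). Here desc is the
 * struct irq_desc, d = &desc->irq_data, and aff is the desired
 * affinity mask.
 */
switch (__irq_startup_managed(desc, aff, force)) {
case IRQ_STARTUP_NORMAL:
        ret = __irq_startup(desc);              /* enable the irq first ...  */
        irq_setup_affinity(desc);               /* ... then set its affinity */
        break;
case IRQ_STARTUP_MANAGED:
        irq_do_set_affinity(d, aff, false);     /* set affinity first ...    */
        ret = __irq_startup(desc);              /* ... then enable the irq   */
        break;
}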


So it might be worth investigating why the two different processing 
paths in `irq_startup()` lead to this problem. I saw that Inochi 
submitted a patch titled "irqchip/sifive-plic: Respect mask state when 
setting affinity". Can we assume that simply applying this patch will 
resolve the current issues on the SG2044? Or do we also need to 
consider changes to the startup/shutdown handling of the PLIC?


Also, I agree that MSI_FLAG_PCI_MSI_MASK_PARENT should be added; 
otherwise plic_irq_mask/plic_irq_unmask may be missed on other code 
paths.
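For illustration only, this is roughly how other MSI parent drivers 
(e.g. the GIC ITS) request parent-level masking via msi_parent_ops; the 
struct name, and which field the flag belongs in for sg204x-msi, are 
assumptions here that would need checking against the driver:

/*
 * Illustration only, not the actual sg204x-msi code: msi_parent_ops
 * with MSI_FLAG_PCI_MSI_MASK_PARENT so mask/unmask also reach the
 * parent (PLIC) domain.
 */
static const struct msi_parent_ops example_msi_parent_ops = {
        .required_flags     = MSI_FLAG_USE_DEF_DOM_OPS |
                              MSI_FLAG_USE_DEF_CHIP_OPS |
                              MSI_FLAG_PCI_MSI_MASK_PARENT, /* flag discussed above */
        .supported_flags    = MSI_GENERIC_FLAGS_MASK |
                              MSI_FLAG_PCI_MSIX,
        .bus_select_token   = DOMAIN_BUS_NEXUS,
        .bus_select_mask    = MATCH_PCI_MSI,
        .init_dev_msi_info  = msi_lib_init_dev_msi_info,
};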


I read this rather hastily, so these are just some preliminary ideas; 
comments and discussion are welcome.


Regards,

Chen


>>> For SG2044, I tested an old version and it worked when submitting.
>>> And I guess it is because of commit [1], which removed the
>>> irq_set_affinity.
>>>
>>> [1] https://lore.kernel.org/r/20240723132958.41320-6-marek.vasut+renesas@mailbox.org
>>>
>>> IIRC, the working version I tested had no affinity set and all irqs
>>> were routed to 0, which is different from the behavior now. Another
>> That does not make any sense. What sets the routing entry for CPU0?
>>
>> This really needs a coherent explanation. Works by chance is not an
>> explanation at all :)
>>
> Yeah, I know. I did not dig into why it worked back then. This does
> teach me a big lesson this time....
>
> As the problem is covered up by plic_set_affinity, I think it may be
> caused by the same problem. Routing to CPU0 is not the real reason;
> setting the affinity after enabling is what hides the problem.
>
> Regards,
> Inochi
>
