Message-ID: <Z4k-DkMt8XQYs_kf@eichest-laptop>
Date: Thu, 16 Jan 2025 18:12:46 +0100
From: Stefan Eichenberger <eichest@...il.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: andrew@...n.ch, gregory.clement@...tlin.com,
	sebastian.hesselbarth@...il.com, shivamurthy.shastri@...utronix.de,
	anna-maria@...utronix.de, linux-arm-kernel@...ts.infradead.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1] irqchip/irq-mvebu-icu: Fix irq_set_type for sei and
 nsr

Hi Thomas,

On Wed, Jan 15, 2025 at 09:15:24AM +0100, Thomas Gleixner wrote:
> On Tue, Dec 17 2024 at 12:15, Stefan Eichenberger wrote:
> > A regression was introduced in commit d929e4db22b6
> > ("irqchip/irq-mvebu-icu: Prepare for real per device MSI") that causes
> > the Armada thermal driver to fail during probe with the following error:
> > genirq: Setting trigger mode 4 for irq 85 failed (irq_chip_set_type_parent+0x0/0x34)
> > armada_thermal f2400000.system-controller:thermal-sensor@70: Cannot request threaded IRQ 85
> > armada_thermal f2400000.system-controller:thermal-sensor@70: probe with driver armada_thermal failed with error -22
> >
> > The issue occurs because irq_set_type is assigned to
> > irq_chip_set_type_parent, but the parent IRQ chip does not implement the
> > irq_set_type operation. This causes the trigger mode configuration to
> > fail.
> >
> > This patch resolves the issue by removing the irq_set_type assignment.
> > With no irq_set_type, __irq_set_trigger safely skips the trigger
> > configuration, restoring functionality to the thermal driver.
> 
> I'm not convinced that this is correct.
> 
> The original code had irq_chip_set_type_parent() for the NSR/SEI chips
> too and all what d929e4db22b6 does is to convert those chips over to the
> new platform MSI mechanism. Here are the original chips:
> 
> static struct irq_chip mvebu_icu_nsr_chip = {
> 	.name			= "ICU-NSR",
>         ...
> 	.irq_set_type		= irq_chip_set_type_parent,
>         ...
> };
> 
> static struct irq_chip mvebu_icu_sei_chip = {
> 	.name			= "ICU-SEI",
>         ...
> 	.irq_set_type		= irq_chip_set_type_parent,
>         ...
> };
> 
> And looking at the potential platform MSI providers for MVEBU, then it
> turns out that GICP and SEI both have the irq_set_type() callback
> populated, though ODMI has not. So either this has never worked or there
> is something else fishy.
> 
> Can you please enable CONFIG_GENERIC_IRQ_DEBUGFS, build/boot a 6.10
> kernel and provide the output of
> 
> cat /sys/kernel/debug/irq/irqs/$N
> 
> where $N is the interrupt number of the thermal sensor.
> 
> Then provide the same information for a current kernel with your patch
> applied.

You are right, I somehow didn't look back far enough. I first tested
with kernel 6.12.5 and my patch applied:
root@...alhost:~# uname -a
Linux localhost.localdomain 6.12.5+ #157 SMP PREEMPT Thu Jan 16 17:32:21 CET 2025 aarch64 GNU/Linux
root@...alhost:~# cat /proc/interrupts |grep thermal
 35:          0          0          0          0    AP SEI  18 Level     f06f8000.system-controller:thermal-sensor@80
 90:          0          0          0          0  SEI-ICU-SEI-f21e0000.interrupt-controller:inter 116 Edge      f2400000.system-controller:thermal-sensor@70
 91:          0          0          0          0  SEI-ICU-SEI-f61e0000.interrupt-controller:inter 116 Edge      f6400000.system-controller:thermal-sensor@70
root@...alhost:~# cat /sys/kernel/debug/irq/irqs/90
handler:  handle_edge_irq
device:   f21e0000.interrupt-controller:interrupt-controller@50
status:   0x00000000
istate:   0x00004000
ddepth:   0
wdepth:   0
dstate:   0x02400204
            IRQ_TYPE_LEVEL_HIGH
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_DEFAULT_TRIGGER_SET
node:     -1
affinity: 0-3
effectiv:
domain:  :cp0:config-space@...00000:interrupt-controller@...000:interrupt-controller@...16
 hwirq:   0x74
 chip:    SEI-ICU-SEI-f21e0000.interrupt-controller:inter
  flags:   0x80
             IRQCHIP_SUPPORTS_LEVEL_MSI
 parent:
    domain:  :ap807:config-space@...00000:interrupt-controller@...200-2
     hwirq:   0x0
     chip:    CP SEI
      flags:   0x0
     parent:
        domain:  :ap807:config-space@...00000:interrupt-controller@...200-5
         hwirq:   0x15
         chip:    SEI
          flags:   0x0

And then with kernel 6.10.14 without any patches:
root@...alhost:~# uname -a
Linux localhost.localdomain 6.10.14 #1 SMP PREEMPT Thu Jan 16 17:52:59 CET 2025 aarch64 GNU/Linux
root@...alhost:~# cat /proc/interrupts |grep thermal
 35:          0          0          0          0    AP SEI  18 Level     f06f8000.system-controller:thermal-sensor@80
 90:          0          0          0          0   ICU-SEI 116 Edge      f2400000.system-controller:thermal-sensor@70
 91:          0          0          0          0   ICU-SEI 116 Edge      f6400000.system-controller:thermal-sensor@70
root@...alhost:~# cat /sys/kernel/debug/irq/irqs/90
handler:  handle_edge_irq
device:   (null)
status:   0x00000001
istate:   0x00004000
ddepth:   0
wdepth:   0
dstate:   0x02400201
            IRQ_TYPE_EDGE_RISING
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_DEFAULT_TRIGGER_SET
node:     -1
affinity: 0-3
effectiv:
domain:  :cp0:config-space@...00000:interrupt-controller@...000:interrupt-controller@50
 hwirq:   0x74
 chip:    ICU-SEI
  flags:   0x0
 parent:
    domain:  :ap807:config-space@...00000:interrupt-controller@...200-4
     hwirq:   0x505a
     chip:    SEI pMSI
      flags:   0x0
     parent:
        domain:  :ap807:config-space@...00000:interrupt-controller@...200-2
         hwirq:   0x0
         chip:    CP SEI
          flags:   0x0
         parent:
            domain:  :ap807:config-space@...00000:interrupt-controller@...200-5
             hwirq:   0x15
             chip:    SEI
              flags:   0x0
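To make the two dstate values above easier to compare, here is a minimal
user-space sketch that splits a dstate word into its trigger type and the
remaining flag bits. The bit positions are the ones implied by the decoded
flag names in the dumps (they add up to the printed values). The helper
names are made up for illustration and are not kernel API.

```c
#include <stdint.h>

/* The low four bits of dstate hold the trigger type. */
#define TRIGGER_MASK             0xfu

/* Flag bits matching the names printed in the two dumps. */
#define FLAG_ACTIVATED           (1u << 9)   /* IRQD_ACTIVATED */
#define FLAG_IRQ_STARTED         (1u << 22)  /* IRQD_IRQ_STARTED */
#define FLAG_DEFAULT_TRIGGER_SET (1u << 25)  /* IRQD_DEFAULT_TRIGGER_SET */

/* Hypothetical helper: extract the trigger type bits. */
static inline uint32_t trigger_type(uint32_t dstate)
{
	return dstate & TRIGGER_MASK;
}

/* Hypothetical helper: everything except the trigger type bits. */
static inline uint32_t dstate_flags(uint32_t dstate)
{
	return dstate & ~TRIGGER_MASK;
}

/* Hypothetical helper: map the trigger type bits to the IRQ_TYPE_* name. */
static inline const char *trigger_name(uint32_t dstate)
{
	switch (trigger_type(dstate)) {
	case 0x0: return "IRQ_TYPE_NONE";
	case 0x1: return "IRQ_TYPE_EDGE_RISING";
	case 0x2: return "IRQ_TYPE_EDGE_FALLING";
	case 0x3: return "IRQ_TYPE_EDGE_BOTH";
	case 0x4: return "IRQ_TYPE_LEVEL_HIGH";
	case 0x8: return "IRQ_TYPE_LEVEL_LOW";
	default:  return "(unexpected combination)";
	}
}

/* Hypothetical helper: bits that differ between two dstate words. */
static inline uint32_t dstate_diff(uint32_t a, uint32_t b)
{
	return a ^ b;
}
```

With the values above, dstate_diff(0x02400204, 0x02400201) is 0x5, so the
two dumps differ only in the trigger type bits: IRQ_TYPE_LEVEL_HIGH (0x4)
with 6.12.5 plus the patch, IRQ_TYPE_EDGE_RISING (0x1) with 6.10.14. The
value 0x4 is also the "trigger mode 4" from the original error message.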

It seems that with kernel 6.10 the controller device was not set
(device: (null)), so irq_set_type was probably being ignored because of
this. Do you by chance have an idea how to fix this properly, or should I
do some more research?
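For reference, the three cases discussed in this thread can be modelled in
user space roughly as follows. This is a simplified sketch and not the
kernel code: the model_* types and functions are hypothetical stand-ins for
struct irq_chip, struct irq_data, irq_chip_set_type_parent() and the early
exit in __irq_set_trigger(). The -ENOSYS return for a parent without a
callback is my recollection of the kernel helper and should be treated as
an assumption.

```c
#include <errno.h>
#include <stddef.h>

struct model_irq_data;

/* Hypothetical stand-in for struct irq_chip. */
struct model_chip {
	const char *name;
	int (*irq_set_type)(struct model_irq_data *d, unsigned int type);
};

/* Hypothetical stand-in for struct irq_data in a hierarchy. */
struct model_irq_data {
	const struct model_chip *chip;
	struct model_irq_data *parent_data;
};

/* Models irq_chip_set_type_parent(): forward the request one level up. */
static int model_set_type_parent(struct model_irq_data *d, unsigned int type)
{
	d = d->parent_data;
	if (d->chip->irq_set_type)
		return d->chip->irq_set_type(d, type);
	return -ENOSYS;	/* assumption: parent has no callback */
}

/* Models __irq_set_trigger(): a chip without a callback is skipped. */
static int model_set_trigger(struct model_irq_data *d, unsigned int type)
{
	if (!d->chip->irq_set_type)
		return 0;
	return d->chip->irq_set_type(d, type);
}

/* Hypothetical root controller that accepts only rising edge (0x1). */
static int model_edge_only_set_type(struct model_irq_data *d, unsigned int type)
{
	(void)d;
	return type == 0x1 ? 0 : -EINVAL;
}
```

In this model, a top-level chip without irq_set_type returns 0 for any
type, a chip that forwards to a parent without a callback returns -ENOSYS,
and a chip that forwards to model_edge_only_set_type returns -EINVAL for
type 4. Note that the -22 in the probe error is -EINVAL, which in this
model only arises in the third case.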

Thanks and regards,
Stefan
