Message-ID: <6eac5c37-a5a8-4ccf-aef6-62a4a0bfcea0@jevklidu.cz>
Date: Sat, 17 Aug 2024 19:57:24 +0200
From: Petr Valenta <petr@...klidu.cz>
To: "Rafael J. Wysocki" <rafael@...nel.org>, Jiri Slaby <jirislaby@...nel.org>
Cc: Len Brown <lenb@...nel.org>,
 "linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
 Linux kernel mailing list <linux-kernel@...r.kernel.org>,
 Linux regressions mailing list <regressions@...ts.linux.dev>
Subject: Re: ACPI IRQ storm with 6.10



On 16. 08. 24 at 20:29, Rafael J. Wysocki wrote:
> On Wed, Aug 14, 2024 at 8:48 AM Jiri Slaby <jirislaby@...nel.org> wrote:
>>
>> On 14. 08. 24, 7:22, Jiri Slaby wrote:
>>> Hi,
>>>
>>> an openSUSE user reported that with 6.10, he sees one CPU under an
>>> IRQ storm from ACPI (sci_interrupt):
>>>      9:   20220768          ...  IR-IO-APIC    9-fasteoi   acpi
>>>
>>> At:
>>> https://bugzilla.suse.com/show_bug.cgi?id=1229085
>>>
>>> 6.9 was OK.
>>>
>>> With acpi.debug_level=0x08000000 acpi.debug_layer=0xffffffff, there is a
>>> repeated load of:
>>>> evgpe-0673 ev_detect_gpe         : Read registers for GPE 6D:
>>>> Status=20, Enable=00, RunEnable=4A, WakeEnable=00
>>
>> 0x6d seems to count excessively (10 snapshots every 1 second):
>>> /sys/firmware/acpi/interrupts/gpe6D:   82066  EN STS enabled      unmasked
>>> /sys/firmware/acpi/interrupts/gpe6D:   86536  EN STS enabled      unmasked
>>> /sys/firmware/acpi/interrupts/gpe6D:   90990     STS enabled      unmasked
>>> /sys/firmware/acpi/interrupts/gpe6D:   95468  EN STS enabled      unmasked
>>> /sys/firmware/acpi/interrupts/gpe6D:  100282  EN STS enabled      unmasked
>>> /sys/firmware/acpi/interrupts/gpe6D:  105187     STS enabled      unmasked
>>> /sys/firmware/acpi/interrupts/gpe6D:  110014     STS enabled      unmasked
>>> /sys/firmware/acpi/interrupts/gpe6D:  114852     STS enabled      unmasked
>>> /sys/firmware/acpi/interrupts/gpe6D:  119682     STS enabled      unmasked
>>> /sys/firmware/acpi/interrupts/gpe6D:  124194     STS enabled      unmasked
>>> /sys/firmware/acpi/interrupts/gpe6D:  128641  EN STS enabled      unmasked
>>
>> acpidump:
>> https://bugzilla.suse.com/attachment.cgi?id=876677
>>
>> DSDT:
>> https://bugzilla.suse.com/attachment.cgi?id=876678
>>
>>> Any ideas?
> 
> GPE 6D is listed in _PRW for some devices, so maybe one of them
> continues to trigger wakeup events?
> 

Disabling the powertop service (which calls /usr/sbin/powertop --auto-tune)
solves the problem completely. After some searching, I found that this is
the cause:

# causes IRQ storm on 6.10.x
# kernel 6.9.9 is immune
echo 'auto' > /sys/bus/pci/devices/0000:00:1f.6/power/control

lspci | grep 1f.6
00:1f.6 Ethernet controller: Intel Corporation Device 550b (rev 20)

journalctl -b | grep 1f.6
srp 17 19:44:17 e14 kernel: pci 0000:00:1f.6: [8086:550b] type 00 class 0x020000 conventional PCI endpoint
srp 17 19:44:17 e14 kernel: pci 0000:00:1f.6: BAR 0 [mem 0x9c300000-0x9c31ffff]
srp 17 19:44:17 e14 kernel: pci 0000:00:1f.6: PME# supported from D0 D3hot D3cold
srp 17 19:44:17 e14 kernel: pci 0000:00:1f.6: Adding to iommu group 12
srp 17 19:44:19 e14 kernel: e1000e 0000:00:1f.6: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
srp 17 19:44:19 e14 kernel: e1000e 0000:00:1f.6 0000:00:1f.6 (uninitialized): registered PHC clock
srp 17 19:44:20 e14 kernel: e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) fc:5c:ee:b0:13:74
srp 17 19:44:20 e14 kernel: e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
srp 17 19:44:20 e14 kernel: e1000e 0000:00:1f.6 eth0: MAC: 16, PHY: 12, PBA No: FFFFFF-0FF
srp 17 19:44:20 e14 kernel: e1000e 0000:00:1f.6 enp0s31f6: renamed from eth0
srp 17 19:44:24 e14 ModemManager[1434]: <info>  [base-manager] couldn't check support for device '/sys/devices/pci0000:00/0000:00:1f.6': not supported by any plugin
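
As a possible workaround while powertop stays enabled, runtime PM for this
one device could be pinned back to 'on' after auto-tune has run. This is
only a minimal sketch: the function name pin_runtime_pm_on is made up for
illustration, and I am assuming (not verified on the affected machine) that
writing 'on' is enough to undo the effect of 'auto'.

```shell
#!/bin/sh
# Hypothetical helper: set power/control to 'on' for one PCI device.
# $1 is the device's sysfs directory.
# Assumption: 'on' (runtime PM disabled) reverses what 'auto' triggered.
pin_runtime_pm_on() {
    dev_dir=$1
    ctl="$dev_dir/power/control"

    # Bail out if the control file is missing or not writable (needs root).
    [ -w "$ctl" ] || { echo "cannot write $ctl" >&2; return 1; }

    # Only write when the current value differs from 'on'.
    if [ "$(cat "$ctl")" != "on" ]; then
        echo on > "$ctl"
    fi

    # Print the resulting setting so the caller can check it.
    cat "$ctl"
}

# Usage (as root), for the NIC from this report:
#   pin_runtime_pm_on /sys/bus/pci/devices/0000:00:1f.6
```

The device path is passed in rather than hard-coded, so the same function
can be pointed at any other device that turns out to be affected.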



> You can ask the reporter to mask that GPE via "echo mask >
> /sys/firmware/acpi/interrupts/gpe6D" and see if the storm goes away
> then.
> 
> The only ACPI core issue introduced between 6.9 and 6.10 I'm aware of
> is the one addressed by this series
> 
> https://lore.kernel.org/linux-acpi/22385894.EfDdHjke4D@rjwysocki.net/
> 
> but this is about the EC and the problem here doesn't appear to be
> EC-related.  It may be worth trying anyway, though.
> 
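
To compare the GPE rate before and after masking (or before and after
changing power/control), the counter in sysfs can be sampled twice. A
minimal sketch, assuming the first field of the gpeXX file is the event
count, as in the snapshots quoted earlier in this thread, where consecutive
readings differ by roughly 4,400 to 4,900. The function names gpe_count and
gpe_rate are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the event count from a gpeXX sysfs file.
# Assumes the count is the first whitespace-separated field.
gpe_count() {
    awk '{ print $1; exit }' "$1"
}

# Hypothetical helper: print events per second for a GPE.
# $1 = path to the gpeXX file, $2 = interval in whole seconds (default 1).
gpe_rate() {
    gpe_file=$1
    interval=${2:-1}

    before=$(gpe_count "$gpe_file")
    sleep "$interval"
    after=$(gpe_count "$gpe_file")

    # Integer division; good enough to tell a storm from an idle GPE.
    echo $(( (after - before) / interval ))
}

# Usage:
#   gpe_rate /sys/firmware/acpi/interrupts/gpe6D
#   echo mask > /sys/firmware/acpi/interrupts/gpe6D   # as suggested above
#   gpe_rate /sys/firmware/acpi/interrupts/gpe6D
```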
