linux-kernel - Re: New kernel warning after updating from LTS 5.15.110 to 5.15.112 (and 5.15.113)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <03daca5c-e468-8889-4dc2-e625a664d571@alliedtelesis.co.nz>
Date:   Fri, 2 Jun 2023 04:19:39 +0000
From:   Chris Packham <Chris.Packham@...iedtelesis.co.nz>
To:     Bagas Sanjaya <bagasdotme@...il.com>,
        Lino Sanfilippo <l.sanfilippo@...bus.com>,
        Jarkko Sakkinen <jarkko@...nel.org>,
        Sasha Levin <sashal@...nel.org>,
        Peter Huewe <peterhuewe@....de>, Jason Gunthorpe <jgg@...pe.ca>
CC:     Linux Kernel Integrity <linux-integrity@...r.kernel.org>,
        "Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
        Linux Regressions <regressions@...ts.linux.dev>
Subject: Re: New kernel warning after updating from LTS 5.15.110 to 5.15.112
 (and 5.15.113)


On 2/06/23 16:10, Bagas Sanjaya wrote:
> On 5/29/23 09:37, Chris Packham wrote:
>> On 29/05/23 14:04, Bagas Sanjaya wrote:
>>> On Sun, May 28, 2023 at 11:42:50PM +0000, Chris Packham wrote:
>>>> Hi,
>>>>
>>>> We have an embedded product with an Infineon SLM9670 TPM. After updating
>>>> to a newer LTS kernel version we started seeing the following warning at
>>>> boot.
>>>>
>>>> [    4.741025] ------------[ cut here ]------------
>>>> [    4.749894] irq 38 handler tis_int_handler+0x0/0x154 enabled interrupts
>>>> [    4.756555] WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:159
>>>> __handle_irq_event_percpu+0xf4/0x180
>>>> [    4.765557] Modules linked in:
>>>> [    4.768626] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.15.113 #1
>>>> [    4.774747] Hardware name: Allied Telesis x250-18XS (DT)
>>>> [    4.780080] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS
>>>> BTYPE=--)
>>>> [    4.787072] pc : __handle_irq_event_percpu+0xf4/0x180
>>>> [    4.792146] lr : __handle_irq_event_percpu+0xf4/0x180
>>>> [    4.797220] sp : ffff800008003e40
>>>> [    4.800547] x29: ffff800008003e40 x28: ffff8000093951c0 x27:
>>>> ffff80000902a9b8
>>>> [    4.807716] x26: ffff800008fe8d28 x25: ffff8000094a62bd x24:
>>>> ffff000001b92400
>>>> [    4.814885] x23: 0000000000000026 x22: ffff800008003ec4 x21:
>>>> 0000000000000000
>>>> [    4.822053] x20: 0000000000000001 x19: ffff000002381200 x18:
>>>> ffffffffffffffff
>>>> [    4.829222] x17: ffff800076962000 x16: ffff800008000000 x15:
>>>> ffff800088003b57
>>>> [    4.836390] x14: 0000000000000000 x13: ffff8000093a5078 x12:
>>>> 000000000000035d
>>>> [    4.843558] x11: 000000000000011f x10: ffff8000093a5078 x9 :
>>>> ffff8000093a5078
>>>> [    4.850727] x8 : 00000000ffffefff x7 : ffff8000093fd078 x6 :
>>>> ffff8000093fd078
>>>> [    4.857895] x5 : 000000000000bff4 x4 : 0000000000000000 x3 :
>>>> 0000000000000000
>>>> [    4.865062] x2 : 0000000000000000 x1 : 0000000000000000 x0 :
>>>> ffff8000093951c0
>>>> [    4.872230] Call trace:
>>>> [    4.874686]  __handle_irq_event_percpu+0xf4/0x180
>>>> [    4.879411]  handle_irq_event+0x64/0xec
>>>> [    4.883264]  handle_level_irq+0xc0/0x1b0
>>>> [    4.887202]  generic_handle_irq+0x30/0x50
>>>> [    4.891229]  mvebu_gpio_irq_handler+0x11c/0x2a0
>>>> [    4.895780]  handle_domain_irq+0x60/0x90
>>>> [    4.899720]  gic_handle_irq+0x4c/0xd0
>>>> [    4.903398]  call_on_irq_stack+0x20/0x4c
>>>> [    4.907338]  do_interrupt_handler+0x54/0x60
>>>> [    4.911538]  el1_interrupt+0x30/0x80
>>>> [    4.915130]  el1h_64_irq_handler+0x18/0x24
>>>> [    4.919244]  el1h_64_irq+0x78/0x7c
>>>> [    4.922659]  arch_cpu_idle+0x18/0x2c
>>>> [    4.926249]  do_idle+0xc4/0x150
>>>> [    4.929404]  cpu_startup_entry+0x28/0x60
>>>> [    4.933343]  rest_init+0xe4/0xf4
>>>> [    4.936584]  arch_call_rest_init+0x10/0x1c
>>>> [    4.940699]  start_kernel+0x600/0x640
>>>> [    4.944375]  __primary_switched+0xbc/0xc4
>>>> [    4.948402] ---[ end trace 940193047b35b311 ]---
>>>>
>>>> Initially I dismissed this as a warning that would probably be cleaned
>>>> up when we did more work on the TPM support for our product but we also
>>>> seem to be getting some new i2c issues and possibly a kernel stack
>>>> corruption that we've conflated with this TPM warning.
>>> Can you reproduce this issue on mainline? Can you also bisect to find
>>> the culprit?
>> No the error doesn't appear on a recent mainline kernel. I do still get
>>
>> tpm_tis_spi spi1.1: 2.0 TPM (device-id 0x1B, rev-id 22)
>> tpm tpm0: [Firmware Bug]: TPM interrupt not working, polling instead
>> tpm tpm0: A TPM error (256) occurred attempting the self test
>>
>> but I think I was getting that on v5.15.110
>>
> I repeat: Can you bisect between v5.15 and v5.15.112?

It's definitely between v5.15.110 and v5.15.112.

I'll do a proper bisect next week but I'm pretty sure it's related to 
the "tpm, tpm_tis:" series. The problem can be worked around by removing 
the TPM interrupt from the device tree for the board.