linux-kernel - Re: [PATCH 6.10 000/809] 6.10.3-rc3 review

Message-ID: <0ad9d0db-df2f-4e35-b53c-ed23cb2dc42d@roeck-us.net>
Date: Mon, 5 Aug 2024 10:42:53 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: Thomas Gleixner <tglx@...utronix.de>,
 Greg Kroah-Hartman <gregkh@...uxfoundation.org>, stable@...r.kernel.org
Cc: patches@...ts.linux.dev, linux-kernel@...r.kernel.org,
 torvalds@...ux-foundation.org, akpm@...ux-foundation.org, shuah@...nel.org,
 patches@...nelci.org, lkft-triage@...ts.linaro.org, pavel@...x.de,
 jonathanh@...dia.com, f.fainelli@...il.com, sudipm.mukherjee@...il.com,
 srw@...dewatkins.net, rwarsow@....de, conor@...nel.org,
 allen.lkml@...il.com, broonie@...nel.org,
 "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
 Helge Deller <deller@....de>, Parisc List <linux-parisc@...r.kernel.org>
Subject: Re: [PATCH 6.10 000/809] 6.10.3-rc3 review

On 8/5/24 01:56, Thomas Gleixner wrote:
> On Sun, Aug 04 2024 at 20:28, Guenter Roeck wrote:
>> On 8/4/24 11:36, Guenter Roeck wrote:
>>>> Rafael J. Wysocki <rafael.j.wysocki@...el.com>
>>>>       genirq: Set IRQF_COND_ONESHOT in request_irq()
>>>>
>>>
>>> With this patch in v6.10.3, all my parisc64 qemu tests get stuck with repeated error messages
>>>
>>> [    0.000000] =============================================================================
>>> [    0.000000] BUG kmem_cache_node (Not tainted): objects 21 > max 16
>>> [    0.000000] -----------------------------------------------------------------------------
> 
> Do you have a full boot log? It's unclear to me at which point of the boot
> process this happens. Is this before or after the secondary CPUs have
> been brought up?
> 
>>> This never stops until the emulation aborts.
> 
> Do you have a recipe how to reproduce?
> 
>>> Reverting this patch fixes the problem for me.
>>>
>>> I noticed a similar problem in the mainline kernel but it is either spurious there
>>> or the problem has been fixed.
>>>
>>
>> As a follow-up, the patch below (on top of v6.10.3) "fixes" the problem for me.
>> I guess that suggests some kind of race condition.
>>
>>
>> @@ -2156,6 +2157,8 @@ int request_threaded_irq(unsigned int irq, irq_handler_t handler,
>>           struct irq_desc *desc;
>>           int retval;
>>
>> +       udelay(1);
>> +
>>           if (irq == IRQ_NOTCONNECTED)
>>                   return -ENOTCONN;
> 
> That all makes absolutely no sense to me.
> 

Same here, really. I can reproduce the problem with v6.10.3, using my configuration,
but whatever debugging I add makes the problem disappear. I saw the same problem
on mainline with v6.11-rc1-272-g17712b7ea075. The log is at
https://kerneltests.org/builders/qemu-parisc64-master/builds/168/steps/qemubuildcommand/logs/stdio
However, I can no longer reproduce it there. What makes it even odder is that
I can bisect the problem between v6.10.2 and v6.10.3, and the bisect points to
this commit, but reproducing it outside that chain seems to be all but impossible.

Guenter

> IRQF_COND_ONESHOT has only an effect on shared interrupts, when the
> interrupt was already requested with IRQF_ONESHOT.
> 
> If this is really a race then the following must be true:
> 
> 1) no delay
> 
>     CPU0                                 CPU1
>     request_irq(IRQF_ONESHOT)
>                                          request_irq(IRQF_COND_ONESHOT)
> 
> 2) delay
> 
>     CPU0                                 CPU1
>                                          request_irq(IRQF_COND_ONESHOT)
>     request_irq(IRQF_ONESHOT)
> 
>     In this case the request on CPU 0 fails with -EBUSY ...
> 
> Confused
> 
>          tglx
> 
>
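
To make the two orderings quoted above easier to follow, here is a minimal
userspace sketch of the flag compatibility check that is applied when a second
handler is added to an already-requested shared line. It is only a model: the
function names and the M_* bit values are hypothetical and are not the kernel's
definitions, and it ignores trigger type, locking and everything else the real
request path does.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative bit values for this model only; NOT the kernel's definitions. */
#define M_SHARED        0x1u
#define M_ONESHOT       0x2u
#define M_COND_ONESHOT  0x4u

/*
 * Hypothetical model of adding a second action to a line that already has
 * one action installed with old_flags. Returns 0 or -EBUSY.
 */
static int model_add_action(unsigned int old_flags, unsigned int *new_flags)
{
	/* Both requesters must agree to share the line. */
	if (!((old_flags & *new_flags) & M_SHARED))
		return -EBUSY;

	/* COND_ONESHOT only matters if the existing action is ONESHOT. */
	if ((old_flags & M_ONESHOT) && (*new_flags & M_COND_ONESHOT))
		*new_flags |= M_ONESHOT;
	else if ((old_flags ^ *new_flags) & M_ONESHOT)
		return -EBUSY;	/* ONESHOT mismatch between the two actions */

	return 0;
}

/* Ordering 1: ONESHOT requester first, COND_ONESHOT requester second. */
static int ordering_oneshot_first(void)
{
	unsigned int second = M_SHARED | M_COND_ONESHOT;
	int ret = model_add_action(M_SHARED | M_ONESHOT, &second);

	assert(second & M_ONESHOT);	/* second request was upgraded */
	return ret;
}

/* Ordering 2: COND_ONESHOT requester first, ONESHOT requester second. */
static int ordering_cond_first(void)
{
	unsigned int second = M_SHARED | M_ONESHOT;

	return model_add_action(M_SHARED | M_COND_ONESHOT, &second);
}
```

In this model the first ordering succeeds and the second fails with -EBUSY,
which matches the outcome described for the "delay" case. It says nothing
about why a udelay() would change the slab error seen on parisc64.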