linux-kernel - Re: [PATCH 6.10 000/809] 6.10.3-rc3 review

Message-ID: <87ikwf5owu.ffs@tglx>
Date: Mon, 05 Aug 2024 10:56:01 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Guenter Roeck <linux@...ck-us.net>, Greg Kroah-Hartman
 <gregkh@...uxfoundation.org>, stable@...r.kernel.org
Cc: patches@...ts.linux.dev, linux-kernel@...r.kernel.org,
 torvalds@...ux-foundation.org, akpm@...ux-foundation.org,
 shuah@...nel.org, patches@...nelci.org, lkft-triage@...ts.linaro.org,
 pavel@...x.de, jonathanh@...dia.com, f.fainelli@...il.com,
 sudipm.mukherjee@...il.com, srw@...dewatkins.net, rwarsow@....de,
 conor@...nel.org, allen.lkml@...il.com, broonie@...nel.org, "Rafael J.
 Wysocki" <rafael.j.wysocki@...el.com>, Helge Deller <deller@....de>,
 Parisc List <linux-parisc@...r.kernel.org>
Subject: Re: [PATCH 6.10 000/809] 6.10.3-rc3 review

On Sun, Aug 04 2024 at 20:28, Guenter Roeck wrote:
> On 8/4/24 11:36, Guenter Roeck wrote:
>>> Rafael J. Wysocki <rafael.j.wysocki@...el.com>
>>>      genirq: Set IRQF_COND_ONESHOT in request_irq()
>>>
>> 
>> With this patch in v6.10.3, all my parisc64 qemu tests get stuck with repeated error messages
>> 
>> [    0.000000] =============================================================================
>> [    0.000000] BUG kmem_cache_node (Not tainted): objects 21 > max 16
>> [    0.000000] -----------------------------------------------------------------------------

Do you have a full boot log? It's unclear to me at which point of the boot
process this happens. Is this before or after the secondary CPUs have
been brought up?

>> This never stops until the emulation aborts.

Do you have a recipe to reproduce this?

>> Reverting this patch fixes the problem for me.
>> 
>> I noticed a similar problem in the mainline kernel but it is either spurious there
>> or the problem has been fixed.
>> 
>
> As a follow-up, the patch below (on top of v6.10.3) "fixes" the problem for me.
> I guess that suggests some kind of race condition.
>
>
> @@ -2156,6 +2157,8 @@ int request_threaded_irq(unsigned int irq, irq_handler_t handler,
>          struct irq_desc *desc;
>          int retval;
>
> +       udelay(1);
> +
>          if (irq == IRQ_NOTCONNECTED)
>                  return -ENOTCONN;

That all makes absolutely no sense to me.

IRQF_COND_ONESHOT only has an effect on shared interrupts, and only when
the interrupt was already requested with IRQF_ONESHOT.

If this is really a race then the following must be true:

1) no delay

   CPU0                                 CPU1
   request_irq(IRQF_ONESHOT)
                                        request_irq(IRQF_COND_ONESHOT)

2) delay

   CPU0                                 CPU1
                                        request_irq(IRQF_COND_ONESHOT)
   request_irq(IRQF_ONESHOT)

   In this case the request on CPU 0 fails with -EBUSY ...
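
For illustration, the two orderings can be modelled in plain userspace C.
This is a minimal sketch and not the kernel code: the M_* flag values,
struct model_irq and model_request() are hypothetical names made up for
the model, and it assumes the compatibility check for a shared line
behaves as described above (a second requester with the conditional flag
inherits oneshot from the first one, any other oneshot mismatch is
rejected with -EBUSY).

```c
#include <errno.h>

/* Model-only flag bits. The values are arbitrary, not the kernel's. */
#define M_SHARED        0x1u
#define M_ONESHOT       0x2u
#define M_COND_ONESHOT  0x4u

/* Hypothetical stand-in for one interrupt line in this model. */
struct model_irq {
	int      in_use;  /* non-zero once a first handler is installed */
	unsigned flags;   /* flags of that first handler */
};

/*
 * Model of a request on a possibly shared line.
 * Returns 0 on success and -EBUSY on a flag mismatch.
 */
static int model_request(struct model_irq *line, unsigned flags)
{
	if (!line->in_use) {
		/* First requester: nothing to compare against. */
		line->in_use = 1;
		line->flags = flags;
		return 0;
	}

	/* Both sides must agree on sharing the line. */
	if (!((line->flags & flags) & M_SHARED))
		return -EBUSY;

	if ((line->flags & M_ONESHOT) && (flags & M_COND_ONESHOT))
		flags |= M_ONESHOT;   /* newcomer adopts oneshot */
	else if ((line->flags ^ flags) & M_ONESHOT)
		return -EBUSY;        /* oneshot mismatch */

	return 0;
}
```

In this model, requesting M_SHARED | M_ONESHOT first and
M_SHARED | M_COND_ONESHOT second succeeds for both callers (case 1).
Swapping the order makes the second request, the one carrying M_ONESHOT,
return -EBUSY (case 2).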

Confused

        tglx