Message-ID: <87ikwf5owu.ffs@tglx>
Date: Mon, 05 Aug 2024 10:56:01 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Guenter Roeck <linux@...ck-us.net>, Greg Kroah-Hartman
<gregkh@...uxfoundation.org>, stable@...r.kernel.org
Cc: patches@...ts.linux.dev, linux-kernel@...r.kernel.org,
torvalds@...ux-foundation.org, akpm@...ux-foundation.org,
shuah@...nel.org, patches@...nelci.org, lkft-triage@...ts.linaro.org,
pavel@...x.de, jonathanh@...dia.com, f.fainelli@...il.com,
sudipm.mukherjee@...il.com, srw@...dewatkins.net, rwarsow@....de,
conor@...nel.org, allen.lkml@...il.com, broonie@...nel.org, "Rafael J.
Wysocki" <rafael.j.wysocki@...el.com>, Helge Deller <deller@....de>,
Parisc List <linux-parisc@...r.kernel.org>
Subject: Re: [PATCH 6.10 000/809] 6.10.3-rc3 review
On Sun, Aug 04 2024 at 20:28, Guenter Roeck wrote:
> On 8/4/24 11:36, Guenter Roeck wrote:
>>> Rafael J. Wysocki <rafael.j.wysocki@...el.com>
>>> genirq: Set IRQF_COND_ONESHOT in request_irq()
>>>
>>
>> With this patch in v6.10.3, all my parisc64 qemu tests get stuck with repeated error messages
>>
>> [ 0.000000] =============================================================================
>> [ 0.000000] BUG kmem_cache_node (Not tainted): objects 21 > max 16
>> [ 0.000000] -----------------------------------------------------------------------------
Do you have a full boot log? It's unclear to me at which point of the boot
process this happens. Is this before or after the secondary CPUs have
been brought up?
>> This never stops until the emulation aborts.
Do you have a recipe to reproduce this?
>> Reverting this patch fixes the problem for me.
>>
>> I noticed a similar problem in the mainline kernel but it is either spurious there
>> or the problem has been fixed.
>>
>
> As a follow-up, the patch below (on top of v6.10.3) "fixes" the problem for me.
> I guess that suggests some kind of race condition.
>
>
> @@ -2156,6 +2157,8 @@ int request_threaded_irq(unsigned int irq, irq_handler_t handler,
>  	struct irq_desc *desc;
>  	int retval;
>
> +	udelay(1);
> +
>  	if (irq == IRQ_NOTCONNECTED)
>  		return -ENOTCONN;

That all makes absolutely no sense to me.

IRQF_COND_ONESHOT has only an effect on shared interrupts, when the
interrupt was already requested with IRQF_ONESHOT.

If this is really a race then the following must be true:

1) no delay

   CPU0                             CPU1
   request_irq(IRQF_ONESHOT)
                                    request_irq(IRQF_COND_ONESHOT)

2) delay

   CPU0                             CPU1
                                    request_irq(IRQF_COND_ONESHOT)
   request_irq(IRQF_ONESHOT)

   In this case the request on CPU 0 fails with -EBUSY ...

Confused
tglx
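
For readers following the two orderings above, here is a minimal userspace
sketch of the flag compatibility rule as described in this mail. It is not
the kernel implementation: the model_* names and the flag values are
hypothetical, and only the behaviour stated above is modelled (a
conditional request adopts oneshot mode when the line is already oneshot,
otherwise a oneshot mismatch on a shared line is refused with -EBUSY).

```c
#include <assert.h>
#include <errno.h>

/* Illustrative flag bits; the values are arbitrary, not the kernel's. */
#define MODEL_SHARED        0x1u
#define MODEL_ONESHOT       0x2u
#define MODEL_COND_ONESHOT  0x4u

/* One interrupt line; remembers the flags of the first accepted request. */
struct model_line {
	int requested;
	unsigned int flags;
};

/* Hypothetical stand-in for a request on a possibly shared line. */
int model_request(struct model_line *line, unsigned int flags)
{
	if (!line->requested) {
		/* First requester defines the mode of the line. */
		line->requested = 1;
		line->flags = flags;
		return 0;
	}

	/* Both sides must agree to share the line. */
	if (!((line->flags & flags) & MODEL_SHARED))
		return -EBUSY;

	if ((line->flags & MODEL_ONESHOT) && (flags & MODEL_COND_ONESHOT))
		flags |= MODEL_ONESHOT;	/* conditional request adopts oneshot */
	else if ((line->flags ^ flags) & MODEL_ONESHOT)
		return -EBUSY;		/* oneshot mismatch */

	return 0;
}

/* Issue two shared requests in order; return the result of the second. */
int model_replay(unsigned int first, unsigned int second)
{
	struct model_line line = { 0, 0 };

	if (model_request(&line, first | MODEL_SHARED))
		return -EINVAL;	/* not expected: the first request always succeeds */
	return model_request(&line, second | MODEL_SHARED);
}

/* Check both orderings from the mail against the model. */
void model_check_orderings(void)
{
	/* 1) no delay: IRQF_ONESHOT first, IRQF_COND_ONESHOT second */
	assert(model_replay(MODEL_ONESHOT, MODEL_COND_ONESHOT) == 0);
	/* 2) delay: IRQF_COND_ONESHOT first, IRQF_ONESHOT second */
	assert(model_replay(MODEL_COND_ONESHOT, MODEL_ONESHOT) == -EBUSY);
}
```

In this model the only effect of reordering the two requests is that the
later IRQF_ONESHOT request is refused, which is the -EBUSY case in 2). The
model has no path from a reordering to memory corruption such as the
kmem_cache_node report.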