[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Ynpzb5L53iGex14D@FVFF77S0Q05N>
Date: Tue, 10 May 2022 15:15:11 +0100
From: Mark Rutland <mark.rutland@....com>
To: Lukas Wunner <lukas@...ner.de>
Cc: maz@...nel.org, Linus Walleij <linus.walleij@...aro.org>,
Bartosz Golaszewski <brgl@...ev.pl>,
linux-gpio@...r.kernel.org,
Octavian Purdila <octavian.purdila@....com>,
linux-kernel@...r.kernel.org, aou@...s.berkeley.edu,
catalin.marinas@....com, deanbo422@...il.com, green.hu@...il.com,
guoren@...nel.org, jonas@...thpole.se, kernelfans@...il.com,
linux-arm-kernel@...ts.infradead.org, linux@...linux.org.uk,
nickhu@...estech.com, palmer@...belt.com, paul.walmsley@...ive.com,
shorne@...il.com, stefan.kristiansson@...nalahti.fi,
tglx@...utronix.de, tsbogend@...ha.franken.de, vgupta@...nel.org,
vladimir.murzin@....com, will@...nel.org
Subject: Re: [PATCH v2 17/17] irq: remove handle_domain_{irq,nmi}()
On Tue, May 10, 2022 at 02:13:20PM +0200, Lukas Wunner wrote:
> On Mon, May 09, 2022 at 09:54:21AM +0100, Mark Rutland wrote:
> > On Fri, May 06, 2022 at 10:32:42PM +0200, Lukas Wunner wrote:
> > > On Tue, Oct 26, 2021 at 10:25:04AM +0100, Mark Rutland wrote:
> > > > int generic_handle_domain_irq(struct irq_domain *domain, unsigned int hwirq)
> > > > {
> > > > + WARN_ON_ONCE(!in_irq());
> > > > return handle_irq_desc(irq_resolve_mapping(domain, hwirq));
> > > > }
> > > > EXPORT_SYMBOL_GPL(generic_handle_domain_irq);
> > >
> > > Why isn't the WARN_ON_ONCE() conditional on handle_enforce_irqctx()?
> > > (See handle_irq_desc() and c16816acd086.)
> >
> > I did this for consistency with the in_nmi() check in
> > generic_handle_domain_nmi(); I was unaware of commit c16816acd086 and
> > IRQD_HANDLE_ENFORCE_IRQCTX.
>
> Actually, since you're mentioning the in_nmi() check, I suspect
> there's another problem here:
>
> generic_handle_domain_nmi() warns if !in_nmi(), then calls down
> to handle_irq_desc() which warns if !in_hardirq(). Doesn't this
> cause a false-positive !in_hardirq() warning for a NMI on GIC/GICv3?
I agree that doesn't look right.
> The only driver calling request_nmi() or request_percpu_nmi() is
> drivers/perf/arm_pmu.c. So that's the only one affected.
> You may want to test if that driver indeed exhibits such a
> false-positive warning since c16816acd086.
In testing with v5.18-rc5, I can't see that going wrong.
I also hacked the following in:
-------->8--------
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 939d21cd55c38..3c85608a8779f 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -718,6 +718,7 @@ EXPORT_SYMBOL_GPL(generic_handle_domain_irq);
int generic_handle_domain_nmi(struct irq_domain *domain, unsigned int hwirq)
{
WARN_ON_ONCE(!in_nmi());
+ WARN_ON_ONCE(!in_hardirq());
return handle_irq_desc(irq_resolve_mapping(domain, hwirq));
}
#endif
-------->8--------
... and that never fired, which doesn't seem right.
I also see some other problems which may or may not be related (lockdep
tracking gone wrong, which could indicate a more general failure to track irq
state correctly). I'll go dig into that...
> > > I believe the above change causes a regression in drivers/gpio/gpio-dln2.c
> > > such that a gratuitous WARN splat is now emitted.
> >
> > Do you mean you beleive that from inspection, or are you actually seeing this
> > go wrong in practice?
> >
> > If it's the latter, are you able to give a copy of the dmesg splat?
>
> For gpio-dln2.c, I believe it from inspection.
>
> For smsc95xx.c, I'm actually seeing it go wrong in practice,
> unedited dmesg splat is included below FWIW.
Thanks; having the trace makes this much easier to analyse.
Mark.
> That's with the following series applied on top of current net-next:
> https://lore.kernel.org/netdev/cover.1651574194.git.lukas@wunner.de/
>
> You need to remove the __irq_enter_raw() in patch [5/7] of that series
> to actually see the WARN splat. I used it to work around your
> WARN_ON_ONCE() and that's what Marc took exception to.
>
> With gpio-dln2.c, the call stack looks like this:
>
> dln2_rx() # drivers/mfd/dln2.c
> dln2_run_event_callbacks()
> dln2_gpio_event() # drivers/gpio/gpio-dln2.c
> generic_handle_domain_irq()
>
> That's basically the same as the callstack for smsc95xx.c below.
> Interrupts are disabled in dln2_rx() via spin_lock_irqsave(),
> but there isn't an __irq_enter_raw() anywhere to be seen
> and that would be necessary to avoid the false-positive
> WARN splat in generic_handle_domain_irq().
>
> Thanks,
>
> Lukas
>
> -- >8 --
>
> [ 1227.718928] WARNING: CPU: 3 PID: 75 at kernel/irq/irqdesc.c:702 generic_handle_domain_irq+0x88/0x94
> [ 1227.718951] Modules linked in: sha256_generic cfg80211 rfkill 8021q garp stp llc ax88796b asix raspberrypi_hwmon bcm2835_codec(C) bcm2835_v4l2(C) bcm2835_isp(C) v4l2_mem2mem bcm2835_mmal_vchiq(C) snd_bcm2835(C) vc_sm_cma(C) videobuf2_dma_contig videobuf2_vmalloc snd_pcm videobuf2_memops videobuf2_v4l2 videobuf2_common snd_timer snd videodev mc micrel ks8851_spi ks8851_common uio_pdrv_genirq uio eeprom_93cx6 piControl(O) ad5446 ti_dac082s085 mcp320x iio_mux mux_gpio mux_core fixed gpio_74x164 spi_bcm2835aux spi_bcm2835 gpio_max3191x crc8 industrialio i2c_dev ip_tables x_tables ipv6
> [ 1227.719076] CPU: 3 PID: 75 Comm: irq/89-dwc_otg_ Tainted: G C O 5.17.0-rt15-v7+ #2
> [ 1227.719084] Hardware name: BCM2835
> [ 1227.719087] Backtrace:
> [ 1227.719091] dump_backtrace from show_stack+0x20/0x24
> [ 1227.719106] r7:00000009 r6:00000080 r5:80d25844 r4:60000093
> [ 1227.719109] show_stack from dump_stack_lvl+0x74/0x9c
> [ 1227.719120] dump_stack_lvl from dump_stack+0x18/0x1c
> [ 1227.719134] r7:00000009 r6:801957f4 r5:000002be r4:80d1e8ac
> [ 1227.719137] dump_stack from __warn+0xdc/0x190
> [ 1227.719148] __warn from warn_slowpath_fmt+0x70/0xcc
> [ 1227.719161] r8:00000009 r7:801957f4 r6:000002be r5:80d1e8ac r4:00000000
> [ 1227.719163] warn_slowpath_fmt from generic_handle_domain_irq+0x88/0x94
> [ 1227.719177] r8:81e87600 r7:00000000 r6:81d28000 r5:00000000 r4:81d4be00
> [ 1227.719179] generic_handle_domain_irq from smsc95xx_status+0x54/0xb0
> [ 1227.719195] r7:00000000 r6:00008000 r5:81f28640 r4:60000013
> [ 1227.719197] smsc95xx_status from intr_complete+0x80/0x84
> [ 1227.719213] r9:81d40a00 r8:00000001 r7:00000000 r6:00000000 r5:81f28640 r4:81d4bd80
> [ 1227.719216] intr_complete from __usb_hcd_giveback_urb+0xa4/0x12c
> [ 1227.719229] r5:815b3000 r4:81d4bd80
> [ 1227.719232] __usb_hcd_giveback_urb from usb_hcd_giveback_urb+0x118/0x11c
> [ 1227.719245] r7:811498e0 r6:81d4bd80 r5:81682000 r4:815b3000
> [ 1227.719248] usb_hcd_giveback_urb from completion_tasklet_func+0x7c/0xc8
> [ 1227.719262] r7:811498e0 r6:81d4bd80 r5:81682000 r4:8639f800
> [ 1227.719264] completion_tasklet_func from tasklet_callback+0x20/0x24
> [ 1227.719277] r6:80f8d140 r5:b6b5c330 r4:00000000
> [ 1227.719279] tasklet_callback from tasklet_action_common.constprop.0+0x148/0x220
> [ 1227.719288] tasklet_action_common.constprop.0 from tasklet_hi_action+0x28/0x30
> [ 1227.719301] r10:81d28000 r9:00000001 r8:00000000 r7:811498e0 r6:00000000 r5:00000001
> [ 1227.719304] r4:81003080
> [ 1227.719306] tasklet_hi_action from __do_softirq+0x154/0x3e8
> [ 1227.719316] __do_softirq from __local_bh_enable_ip+0x12c/0x1a8
> [ 1227.719328] r10:801970b0 r9:80f85320 r8:00000001 r7:00000000 r6:00000001 r5:00000100
> [ 1227.719332] r4:40000013
> [ 1227.719334] __local_bh_enable_ip from irq_forced_thread_fn+0x7c/0xac
> [ 1227.719347] r9:81d40ac0 r8:814f3c00 r7:00000001 r6:00000001 r5:814f3c00 r4:81d40ac0
> [ 1227.719350] irq_forced_thread_fn from irq_thread+0x16c/0x228
> [ 1227.719363] r7:00000001 r6:81d40ae4 r5:81d28000 r4:00000000
> [ 1227.719365] irq_thread from kthread+0x100/0x140
> [ 1227.719380] r10:00000000 r9:8154fbfc r8:81d43000 r7:81d40ac0 r6:80196dcc r5:81d40b00
> [ 1227.719383] r4:81d28000
> [ 1227.719386] kthread from ret_from_fork+0x14/0x34
> [ 1227.719394] Exception stack(0x81777fb0 to 0x81777ff8)
> [ 1227.719401] 7fa0: 00000000 00000000 00000000 00000000
> [ 1227.719408] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [ 1227.719414] 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [ 1227.719422] r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:8014a6c8 r4:81d40b00
> [ 1227.719425] ---[ end trace 0000000000000000 ]---
Powered by blists - more mailing lists