Message-ID: <20201009105945.432de706.john@metanate.com>
Date:   Fri, 9 Oct 2020 10:59:45 +0100
From:   John Keeping <john@...anate.com>
To:     Vladimir Oltean <olteanv@...il.com>
Cc:     netdev@...r.kernel.org,
        Giuseppe Cavallaro <peppe.cavallaro@...com>,
        Alexandre Torgue <alexandre.torgue@...com>,
        Jose Abreu <joabreu@...opsys.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Maxime Coquelin <mcoquelin.stm32@...il.com>,
        linux-stm32@...md-mailman.stormreply.com,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] net: stmmac: Don't call _irqoff() with hardirqs enabled

On Fri, 9 Oct 2020 02:46:09 +0300
Vladimir Oltean <olteanv@...il.com> wrote:

> On Thu, Oct 08, 2020 at 05:27:49PM +0100, John Keeping wrote:
> > With threadirqs, stmmac_interrupt() is called on a thread with hardirqs
> > enabled so we cannot call __napi_schedule_irqoff().  Under lockdep it
> > leads to:
> > 
> > 	------------[ cut here ]------------
> > 	WARNING: CPU: 0 PID: 285 at kernel/softirq.c:598 __raise_softirq_irqoff+0x6c/0x1c8
> > 	IRQs not disabled as expected
> > 	Modules linked in: brcmfmac hci_uart btbcm cfg80211 brcmutil
> > 	CPU: 0 PID: 285 Comm: irq/41-eth0 Not tainted 5.4.69-rt39 #1
> > 	Hardware name: Rockchip (Device Tree)
> > 	[<c0110d3c>] (unwind_backtrace) from [<c010c284>] (show_stack+0x10/0x14)
> > 	[<c010c284>] (show_stack) from [<c0855504>] (dump_stack+0xa8/0xe0)
> > 	[<c0855504>] (dump_stack) from [<c0120a9c>] (__warn+0xe0/0xfc)
> > 	[<c0120a9c>] (__warn) from [<c0120e80>] (warn_slowpath_fmt+0x7c/0xa4)
> > 	[<c0120e80>] (warn_slowpath_fmt) from [<c01278c8>] (__raise_softirq_irqoff+0x6c/0x1c8)
> > 	[<c01278c8>] (__raise_softirq_irqoff) from [<c056bccc>] (stmmac_interrupt+0x388/0x4e0)
> > 	[<c056bccc>] (stmmac_interrupt) from [<c0178714>] (irq_forced_thread_fn+0x28/0x64)
> > 	[<c0178714>] (irq_forced_thread_fn) from [<c0178924>] (irq_thread+0x124/0x260)
> > 	[<c0178924>] (irq_thread) from [<c0142ee8>] (kthread+0x154/0x164)
> > 	[<c0142ee8>] (kthread) from [<c01010bc>] (ret_from_fork+0x14/0x38)
> > 	Exception stack(0xeb7b5fb0 to 0xeb7b5ff8)
> > 	5fa0:                                     00000000 00000000 00000000 00000000
> > 	5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > 	5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> > 	irq event stamp: 48
> > 	hardirqs last  enabled at (50): [<c085c200>] prb_unlock+0x7c/0x8c
> > 	hardirqs last disabled at (51): [<c085c0dc>] prb_lock+0x58/0x100
> > 	softirqs last  enabled at (0): [<c011e770>] copy_process+0x550/0x1654
> > 	softirqs last disabled at (25): [<c01786ec>] irq_forced_thread_fn+0x0/0x64
> > 	---[ end trace 0000000000000002 ]---
> > 
> > Use __napi_schedule() instead which will save & restore the interrupt
> > state.
> > 
> > Fixes: 4ccb45857c2c ("net: stmmac: Fix NAPI poll in TX path when in multi-queue")
> > Signed-off-by: John Keeping <john@...anate.com>
> > ---  
> 
> Don't get me wrong, this is so cool that the new lockdep warning is really
> helping out finding real bugs, but the patch that adds that warning
> (https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=cdabce2e3dff7e4bcef73473987618569d178af3)
> isn't in 5.4.69-rt39, is it?

No, it's not, although I would have saved several days of debugging if it
had been!  I backported the lockdep warning to prove that it catches this
issue.

The evidence that can be seen on vanilla 5.4.x is:

	$ trace-cmd report -l
	irq/43-e-280     0....2    74.017658: softirq_raise:        vec=3 [action=NET_RX]

Note the missing "d" where this should be "0d...2" to indicate hardirqs
disabled.
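
For anyone wanting to scan a longer trace for the same symptom, here is a
minimal sketch in Python.  It assumes the line layout seen in the sample
above (task-pid, then the CPU number immediately followed by the latency
flags, then the timestamp and event name) and that the first flag
character is "d" when hardirqs are disabled.  The regular expression and
the function names are illustrative only and are not part of trace-cmd;
other kernel or trace-cmd versions may format this column differently.

```python
import re

# Assumed layout of a `trace-cmd report -l` line, inferred from the sample:
#   <task>-<pid>  <cpu><flags>  <timestamp>: <event>: <details>
LINE_RE = re.compile(
    r"^\s*(?P<task>\S+)-(?P<pid>\d+)\s+"
    r"(?P<cpu>\d+)(?P<flags>\S+)\s+"
    r"(?P<ts>\d+\.\d+):\s+(?P<event>\w+):"
)


def irqs_disabled(line):
    """Return True/False for the irqs-off flag, or None if the line does not parse."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    # Assumption: the first flag character is 'd' when hardirqs are disabled.
    return m.group("flags")[0] == "d"


def suspicious_raises(lines):
    """Yield softirq_raise lines that were logged with hardirqs enabled."""
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("event") == "softirq_raise" and irqs_disabled(line) is False:
            yield line.strip()


if __name__ == "__main__":
    sample = [
        "irq/43-e-280     0....2    74.017658: softirq_raise:        vec=3 [action=NET_RX]",
        "irq/43-e-280     0d...2    74.017658: softirq_raise:        vec=3 [action=NET_RX]",
    ]
    for hit in suspicious_raises(sample):
        print(hit)
```

Feeding it the output of "trace-cmd report -l" line by line should print
only the softirq_raise events that lack the "d" flag, such as the one
shown above.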


Regards,
John
