Message-ID: <20240219204421.2f6019c1@meshulam.tesarici.cz>
Date: Mon, 19 Feb 2024 20:44:21 +0100
From: Petr Tesařík <petr@...arici.cz>
To: Christian Stewart <christian@...rture.us>
Cc: Marc Haber <mh+netdev@...schlus.de>, Florian Fainelli
<f.fainelli@...il.com>, Andrew Lunn <andrew@...n.ch>,
alexandre.torgue@...s.st.com, Jose Abreu <joabreu@...opsys.com>, Chen-Yu
Tsai <wens@...e.org>, Jernej Skrabec <jernej.skrabec@...il.com>, Samuel
Holland <samuel@...lland.org>, Jisheng Zhang <jszhang@...nel.org>,
netdev@...r.kernel.org
Subject: Re: stmmac on Banana PI CPU stalls since Linux 6.6

On Mon, 19 Feb 2024 11:20:35 -0800
Christian Stewart <christian@...rture.us> wrote:
> Hi all,
>
> On Mon, Feb 12, 2024 at 4:15 AM Marc Haber <mh+netdev@...schlus.de> wrote:
> >
> > On Tue, Feb 06, 2024 at 09:23:51AM +0100, Petr Tesařík wrote:
> > > On Mon, 5 Feb 2024 13:50:35 -0800
> > > Florian Fainelli <f.fainelli@...il.com> wrote:
> > >
> > > > On 2/5/24 12:12, Marc Haber wrote:
> > > > > On Fri, Jan 26, 2024 at 12:10:28PM +0100, Petr Tesařík wrote:
> > > > >> Then you may want to start by verifying that it is indeed the same
> > > > >> issue. Try the linked patch.
> > > > >
> > > > > The linked patch seemed to help for 6.7.2, the test machine ran for five
> > > > > days without problems. After going to unpatched 6.7.2, the issue was
> > > > > back in six hours.
> > > >
> > > > Do you mind responding to Petr's patch with a Tested-by? Thanks!
> > >
> > > I believe Marc tested my first attempt at a solution (the one with
> > > spinlocks), not the latest incarnation. FWIW I have tested a similar
> > > scenario, with similar results.
> >
> > Where is the latest patch? I can give it a try.
> >
> > Sorry for not responding any earlier, February 10 is an important tax
> > due date in Germany.
> >
> > Greetings
> > Marc
>
> We are seeing the same kernel panic on shutdown with 6.7.4 on a
> BananaPi M2 Ultra:
>
> [** ] (3 of 3) A stop job is running for Network Manager (33s / 52s)
> [ 259.463772] rcu: INFO: rcu_sched self-detected stall on CPU
> [ 259.469388] rcu: 0-....: (2099 ticks this GP) idle=0fdc/1/0x40000002 softirq=12003/12003 fqs=1034
> [ 259.478360] rcu: (t=2100 jiffies g=16277 q=36 ncpus=4)
> [ 259.483595] CPU: 0 PID: 4462 Comm: ip Tainted: G C 6.7.4 #1
> [ 259.490562] Hardware name: Allwinner sun8i Family
> [ 259.495268] PC is at stmmac_get_stats64+0x30/0x198
> [ 259.500081] LR is at dev_get_stats+0x3c/0x160
> [ 259.504445] pc : [<c06b9924>] lr : [<c07bf7a8>] psr: 200f0013
> [ 259.510712] sp : f1e6d9b8 ip : c3ca478c fp : c23e0000
> [ 259.515941] r10: 00000000 r9 : c3ca4598 r8 : 00000000
> [ 259.521168] r7 : 00000001 r6 : 00000000 r5 : c23e3000 r4 : 00000001
> [ 259.527697] r3 : 00005c1b r2 : c23e2e08 r1 : c3ca46c4 r0 : c23e0000
> [ 259.534226] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
> [ 259.541363] Control: 10c5387d Table: 429cc06a DAC: 00000051
> [ 259.547117] stmmac_get_stats64 from dev_get_stats+0x3c/0x160
> [ 259.552882] dev_get_stats from rtnl_fill_stats+0x30/0x118
> [ 259.552899] rtnl_fill_stats from rtnl_fill_ifinfo+0x720/0x135c
> [ 259.564306] rtnl_fill_ifinfo from rtnl_dump_ifinfo+0x330/0x6a8
> [ 259.570240] rtnl_dump_ifinfo from netlink_dump+0x16c/0x350
> [ 259.575830] netlink_dump from __netlink_dump_start+0x1bc/0x280
> [ 259.581766] __netlink_dump_start from rtnetlink_rcv_msg+0xf4/0x2f0
> [ 259.588047] rtnetlink_rcv_msg from netlink_rcv_skb+0xb8/0x118
> [ 259.593893] netlink_rcv_skb from netlink_unicast+0x1fc/0x2d8
> [ 259.599655] netlink_unicast from netlink_sendmsg+0x1c8/0x440
> [ 259.605416] netlink_sendmsg from sock_write_iter+0xa0/0x10c
> [ 259.611094] sock_write_iter from vfs_write+0x338/0x398
> [ 259.616334] vfs_write from ksys_write+0xbc/0xf0
> [ 259.620961] ksys_write from ret_fast_syscall+0x0/0x54
> [ 259.626110] Exception stack(0xf1e6dfa8 to 0xf1e6dff0)
> [ 259.631169] dfa0: 00000003 be997dd8 00000003 be997dd8 00000014 00000001
> [ 259.639351] dfc0: 00000003 be997dd8 00000014 00000004 00519548 be997e08 b6fd0ce0 0051783c
>
> https://github.com/skiffos/SkiffOS/issues/307
>
> I'm writing to ask if anyone has found a fix for this yet?

If you're running a 6.7 stable kernel, my patch has just been added to
the 6.7-stable tree.

https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-6.7/net-stmmac-protect-updates-of-64-bit-statistics-counters.patch

However, lockdep has reported an issue with it:

https://lore.kernel.org/lkml/ea1567d9-ce66-45e6-8168-ac40a47d1821@roeck-us.net/

This new report has not yet been properly understood, but FWIW I've
been running stable with my patch for over a month now.

Petr T
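
For readers wondering why a statistics read can stall a CPU at all, here
is a rough userspace sketch. It is not the stmmac driver and not the
kernel's u64_stats_sync code; every name in it is hypothetical. It
assumes, going by the patch title and the trace ending in
stmmac_get_stats64, that the 64-bit counters on this 32-bit ARM board are
guarded by a sequence counter, which requires writers to exclude each
other. If two writers overlap, one sequence bump can be lost, the counter
stays odd, and a reader that retries while it is odd never finishes.

```c
/*
 * Userspace sketch only: NOT the stmmac driver and NOT the kernel's
 * u64_stats_sync implementation. All names here are hypothetical.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct stats_sketch {
	unsigned int seq;	/* even: stable, odd: update in progress */
	uint64_t packets;
};

/* A sequence bump split into load and store, so that two writers
 * can be interleaved by hand in update_racing() below. */
static unsigned int seq_load(const struct stats_sketch *s) { return s->seq; }
static void seq_store(struct stats_sketch *s, unsigned int v) { s->seq = v; }

/* One writer at a time: the sequence goes even -> odd -> even. */
static void update_serialized(struct stats_sketch *s)
{
	assert((s->seq & 1) == 0);	/* no other update in progress */
	s->seq++;			/* begin */
	s->packets++;
	s->seq++;			/* end */
}

/* Two writers without mutual exclusion, interleaved by hand:
 * both "begin" steps read the same value, so one bump is lost. */
static void update_racing(struct stats_sketch *s)
{
	unsigned int a = seq_load(s);	/* writer A reads seq */
	unsigned int b = seq_load(s);	/* writer B reads the same seq */

	seq_store(s, a + 1);		/* A begins */
	seq_store(s, b + 1);		/* B begins, A's bump is lost */
	s->packets += 2;
	s->seq++;			/* A ends */
	s->seq++;			/* B ends, seq is now odd */
}

/* Reader: retry while an update appears to be in progress. The bound
 * exists only so this sketch terminates; a reader without a bound
 * would spin forever on a sequence that is stuck at an odd value. */
static bool read_packets(const struct stats_sketch *s,
			 unsigned int max_tries, uint64_t *out)
{
	for (unsigned int i = 0; i < max_tries; i++) {
		unsigned int start = s->seq;

		if (start & 1)
			continue;	/* writer active, try again */

		uint64_t v = s->packets;

		if (s->seq == start) {
			*out = v;
			return true;
		}
	}
	return false;
}
```

Calling update_serialized() on a zeroed struct and then read_packets()
succeeds on the first try. Calling update_racing() instead leaves seq
odd, so read_packets() gives up after max_tries; without the bound, that
is the kind of loop an RCU stall report would catch.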