Message-ID: <06de0002-c3c6-4f13-9618-066cb9658240@alliedtelesis.co.nz>
Date: Wed, 24 Apr 2024 07:42:17 +0000
From: Chris Packham <Chris.Packham@...iedtelesis.co.nz>
To: Jeff Layton <jlayton@...nel.org>, Chuck Lever <chuck.lever@...cle.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>, "neilb@...e.de"
<neilb@...e.de>
CC: "linux-nfs@...r.kernel.org" <linux-nfs@...r.kernel.org>, netdev
<netdev@...r.kernel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "stable@...r.kernel.org"
<stable@...r.kernel.org>
Subject: Re: kernel BUG at net/sunrpc/svc.c:570 after updating from v5.15.153
to v5.15.155
On 24/04/24 13:38, Chris Packham wrote:
>
> On 24/04/24 12:54, Chris Packham wrote:
>> Hi Jeff, Chuck, Greg,
>>
>> After updating one of our builds along the 5.15.y LTS branch our
>> testing caught a new kernel bug. Output below.
>>
>> I haven't dug into it yet but wondered if it rang any bells.
>
> A bit more info. This is happening at "reboot" for us. Our embedded
> devices use a bit of a hacked up reboot process so that they come back
> faster in the case of a failure.
>
> It doesn't happen with a proper `systemctl reboot` or with SysRq+B.
>
> I can trigger it with `killall -9 nfsd` which I'm not sure is a
> completely legit thing to do to kernel threads but it's probably close
> to what our customized reboot does.

I've bisected between v5.15.153 and v5.15.155 and identified commit
dec6b8bcac73 ("nfsd: Simplify code around svc_exit_thread() call in
nfsd()") as the first bad commit. That matches both the call trace
(nfsd -> svc_exit_thread -> svc_destroy) and my reproduction. I'm
wondering whether something was missed from the stable backport.
Unfortunately I'm not able to run a more recent kernel with all of the
NFS-related setup that is in use on the system in question.

>
>>
>> Thanks,
>> Chris
>>
>> [ 91.605109] ------------[ cut here ]------------
>> [ 91.605122] kernel BUG at net/sunrpc/svc.c:570!
>> [ 91.605129] Internal error: Oops - BUG: 00000000f2000800 [#1]
>> PREEMPT SMP
>> [ 91.610643] Modules linked in: mvcpss(O) platform_driver(O)
>> ipifwd(O) xt_l2tp xt_hashlimit xt_conntrack xt_addrtype xt_LOG
>> xt_CHECKSUM wp512 vxlan veth twofish_generic twofish_common sr9800
>> smsc95xx smsc75xx smsc sm3_generic sha512_arm64 sha3_generic
>> serpent_generic rtl8150 rpcsec_gss_krb5 rmd160 poly1305_generic plusb
>> pegasus optee_rng nbd microchip md4 md_mod mcs7830 lrw libpoly1305
>> lan78xx l2tp_ip6 l2tp_ip l2tp_eth l2tp_netlink l2tp_core udp_tunnel
>> ipt_REJECT nf_reject_ipv4 ip6table_nat ip6table_mangle
>> ip6table_filter ip6t_ipv6header ip6t_REJECT ip6_udp_tunnel ip6_tables
>> dm9601 dm_zero dm_mirror dm_region_hash dm_log dm_mod diag tipc cuse
>> cts cpufreq_powersave cpufreq_conservative chacha_generic
>> chacha20poly1305 chacha_neon libchacha cast6_generic cast5_generic
>> cast_common camellia_generic blowfish_generic blowfish_common
>> auth_rpcgss oid_registry at25 arm_smccc_trng aes_neon_blk
>> idprom_mtd(O) idprom_i2c(O) epi3_boardinfo_i2c(O) x250(O)
>> psuslot_epi3_register(O) psuslot_gpio_group(O)
>> [ 91.610809] psuslot(O)
>> [ 91.611822] watchdog: watchdog1: watchdog did not stop!
>> [ 91.697065] gpiopins_boardinfo(O) idprom(O) epi3_boardinfo(O)
>> boardinfo(O) i2c_gpio i2c_algo_bit i2c_mv64xxx pluggable(O)
>> led_enable(O) omap_rng rng_core atl_reset(O) sbsa_gwdt uio_pdrv_genirq
>> [ 91.697096] CPU: 2 PID: 1770 Comm: nfsd Kdump: loaded Tainted:
>> G O 5.15.155 #1
>> [ 91.697103] Hardware name: Allied Telesis x250-28XTm (DT)
>> [ 91.697107] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS
>> BTYPE=--)
>> [ 91.697112] pc : svc_destroy+0x84/0xac
>> [ 91.701202] watchdog: watchdog0: watchdog did not stop!
>> [ 91.702215] lr : svc_destroy+0x2c/0xac
>> [ 91.702220] sp : ffff80000bb3bde0
>> [ 91.702223] x29: ffff80000bb3bde0 x28: 0000000000000000 x27:
>> 0000000000000000
>> [ 91.746095] x26: 0000000000000000 x25: ffff00000dbfaa40 x24:
>> ffff000016c14000
>> [ 91.746101] x23: ffff800008395c00 x22: ffff00000ee9f284 x21:
>> ffff00000eea9e10
>> [ 91.746108] x20: ffff00000eea9e00 x19: ffff00000eea9e14 x18:
>> ffff800008e99000
>> [ 91.769526] x17: 0000000000000006 x16: 0000000000000000 x15:
>> 0000000000000001
>> [ 91.776782] x14: 00000000fffffffd x13: fffffc0000000000 x12:
>> ffff800076bc2000
>> [ 91.784031] x11: ffff00007fba5c10 x10: ffff800076bc2000 x9 :
>> ffff8000092207c0
>> [ 91.784038] x8 : fffffc000055eb08 x7 : ffff00000ef6c4c0 x6 :
>> fffffc0001f872c8
>> [ 91.795823] x5 : 0000000000000100 x4 : ffff00007fbaeda8 x3 :
>> 0000000000000000
>> [ 91.801684] x2 : 0000000000000000 x1 : ffff00000d8f8018 x0 :
>> ffff00000eea9e30
>> [ 91.807545] Call trace:
>> [ 91.810088] svc_destroy+0x84/0xac
>> [ 91.813586] svc_exit_thread+0x108/0x15c
>> [ 91.816998] nfsd+0x178/0x1a0
>> [ 91.818673] kthread+0x150/0x160
>> [ 91.820610] ret_from_fork+0x10/0x20
>> [ 91.820620] Code: a94153f3 a8c27bfd d50323bf d65f03c0 (d4210000)
>> [ 91.820629] SMP: stopping secondary CPUs
>> [ 91.830433] Starting crashdump kernel...
>> [ 91.833064] Bye!