Message-ID: <20171219175921.7db9b0e1@cakuba.netronome.com>
Date: Tue, 19 Dec 2017 17:59:21 -0800
From: Jakub Kicinski <kubakici@...pl>
To: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Jiri Pirko <jiri@...nulli.us>,
Cong Wang <xiyou.wangcong@...il.com>
Subject: RCU callback crashes
Hi!
If I run the netdevsim test long enough on a kernel with no debugging
options enabled, I get this:
[ 1400.450124] BUG: unable to handle kernel paging request at 000000046474e552
[ 1400.458005] IP: 0x46474e552
[ 1400.461231] PGD 0 P4D 0
[ 1400.464150] Oops: 0010 [#1] PREEMPT SMP
[ 1400.468525] Modules linked in: cls_bpf sch_ingress algif_hash af_alg netdevsim rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace f3
[ 1400.516951] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.15.0-rc3-perf-00918-g129c9981a55f #918
[ 1400.526678] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 11/08/2016
[ 1400.535150] RIP: 0010:0x46474e552
[ 1400.538941] RSP: 0018:ffff9f736f083f08 EFLAGS: 00010216
[ 1400.544870] RAX: ffff9f736b4771b8 RBX: ffff9f736f09b880 RCX: ffff9f736b4771b8
[ 1400.552935] RDX: 000000046474e552 RSI: ffff9f736f083f18 RDI: ffff9f736b4771b8
[ 1400.561001] RBP: ffffffff8bc4a740 R08: ffff9f736b4771b8 R09: 0000000000000000
[ 1400.569066] R10: ffff9f736f083d90 R11: 0000000000000000 R12: ffff9f736f09b8b8
[ 1400.577132] R13: 000000000000000a R14: 7fffffffffffffff R15: 0000000000000202
[ 1400.585197] FS: 0000000000000000(0000) GS:ffff9f736f080000(0000) knlGS:0000000000000000
[ 1400.594349] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1400.600859] CR2: 000000046474e552 CR3: 0000000839c09001 CR4: 00000000003606e0
[ 1400.608917] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1400.616982] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1400.625048] Call Trace:
[ 1400.627868] <IRQ>
[ 1400.630207] ? rcu_process_callbacks+0x1a0/0x4d0
[ 1400.635458] ? __do_softirq+0xd1/0x30a
[ 1400.639739] ? irq_exit+0xae/0xb0
[ 1400.643532] ? smp_apic_timer_interrupt+0x60/0x140
[ 1400.648977] ? apic_timer_interrupt+0x8c/0xa0
[ 1400.653934] </IRQ>
[ 1400.656370] ? cpuidle_enter_state+0xb0/0x2f0
[ 1400.661328] ? cpuidle_enter_state+0x8d/0x2f0
[ 1400.666287] ? do_idle+0x17b/0x1d0
[ 1400.670167] ? cpu_startup_entry+0x5f/0x70
[ 1400.674836] ? start_secondary+0x169/0x190
[ 1400.679504] ? secondary_startup_64+0xa5/0xb0
[ 1400.684466] Code: Bad RIP value.
[ 1400.688259] RIP: 0x46474e552 RSP: ffff9f736f083f08
[ 1400.693703] CR2: 000000046474e552
[ 1400.697501] ---[ end trace fab2c0fb826644df ]---
[ 1400.708442] Kernel panic - not syncing: Fatal exception in interrupt
[ 1400.715693] Kernel Offset: 0xa000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 1400.732994] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
Unfortunately, reproducing the crash on an instrumented kernel seems to
be difficult.
With object debugging enabled I managed to gather this:
[ 26.157415] ------------[ cut here ]------------
[ 26.162670] ODEBUG: free active (active state 1) object type: rcu_head hint: (null)
[ 26.172361] WARNING: CPU: 19 PID: 1352 at ../lib/debugobjects.c:291 debug_print_object+0x64/0x80
[ 26.182288] Modules linked in: cls_bpf sch_ingress algif_hash af_alg netdevsim rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace f3
[ 26.230728] CPU: 19 PID: 1352 Comm: tc Not tainted 4.15.0-rc3-perf-00918-g129c9981a55f #4
[ 26.239977] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 11/08/2016
[ 26.248453] RIP: 0010:debug_print_object+0x64/0x80
[ 26.253896] RSP: 0018:ffffb7340410fa00 EFLAGS: 00010086
[ 26.259825] RAX: 0000000000000051 RBX: ffff8f1f6b7cc5a0 RCX: 0000000000000006
[ 26.267892] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffff8f1f6f48cdd0
[ 26.275959] RBP: ffffffffb3c48600 R08: 0000000000000000 R09: 00000000000005f2
[ 26.284042] R10: 000000000000001e R11: ffffffffb41c35ad R12: ffffffffb3a1d101
[ 26.292125] R13: ffff8f1f6b7cc5a0 R14: ffffffffb423a8b8 R15: 0000000000000001
[ 26.300194] FS: 00007f64d4956700(0000) GS:ffff8f1f6f480000(0000) knlGS:0000000000000000
[ 26.309346] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 26.315859] CR2: 0000000001cbc498 CR3: 000000086a8a2004 CR4: 00000000003606e0
[ 26.323925] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 26.331994] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 26.331994] Call Trace:
[ 26.331998] debug_check_no_obj_freed+0x1e6/0x220
[ 26.332020] ? qdisc_graft+0x14f/0x450
[ 26.332025] kfree+0x14d/0x1b0
[ 26.332027] qdisc_graft+0x14f/0x450
[ 26.332029] tc_get_qdisc+0x12f/0x200
[ 26.332035] rtnetlink_rcv_msg+0x122/0x310
[ 26.332039] ? __skb_try_recv_datagram+0xef/0x150
[ 26.332040] ? __kmalloc_node_track_caller+0x205/0x2b0
[ 26.332042] ? rtnl_calcit.isra.12+0x100/0x100
[ 26.332044] netlink_rcv_skb+0x8d/0x130
[ 26.332046] netlink_unicast+0x16a/0x210
[ 26.332048] netlink_sendmsg+0x32a/0x370
[ 26.332054] sock_sendmsg+0x2d/0x40
[ 26.332056] ___sys_sendmsg+0x298/0x2e0
[ 26.332061] ? mem_cgroup_commit_charge+0x7a/0x540
[ 26.332062] ? mem_cgroup_try_charge+0x8e/0x1d0
[ 26.332066] ? __handle_mm_fault+0x3a1/0x1190
[ 26.332068] ? __sys_sendmsg+0x41/0x70
[ 26.332069] __sys_sendmsg+0x41/0x70
[ 26.332074] entry_SYSCALL_64_fastpath+0x1e/0x81
[ 26.332076] RIP: 0033:0x7f64d3b53450
[ 26.332076] RSP: 002b:00007fffb5ea4388 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 26.332077] RAX: ffffffffffffffda RBX: 00007f64d3e0fb20 RCX: 00007f64d3b53450
[ 26.332078] RDX: 0000000000000000 RSI: 00007fffb5ea43e0 RDI: 0000000000000003
[ 26.332078] RBP: 0000000000000a11 R08: 0000000000000000 R09: 000000000000000f
[ 26.332079] R10: 00000000000005e7 R11: 0000000000000246 R12: 00007f64d3e0fb78
[ 26.332079] R13: 00007f64d3e0fb78 R14: 000000000000270f R15: 00007f64d3e0fb78
[ 26.332081] Code: c1 83 c2 01 8b 4b 14 4c 8b 45 00 89 15 f6 d0 e5 00 8b 53 10 4c 89 e6 48 c7 c7 38 7c a3 b3 48 8b 14 d5 80 3d 85 b
[ 26.332097] ---[ end trace bd33b199ae76ad43 ]---
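For context, the ODEBUG line above reports that kfree() (called from qdisc_graft) released memory containing an rcu_head that the debug-objects tracker still considered active, which is what one would expect if call_rcu() had queued it and the callback had not yet run. The following is a minimal Python model of that kind of check, not kernel code. Every name in it (DebugObjects, ToyRcu, RcuHead, toy_kfree) is hypothetical, and clobbering the callback on free is only an illustration of why such a free is dangerous, not a diagnosis of the first oops.

```python
class RcuHead:
    """Stand-in for an embedded rcu_head: just holds the callback."""
    def __init__(self):
        self.func = None


class DebugObjects:
    """Toy tracker: remembers which objects are currently 'active'."""
    def __init__(self):
        self.active = {}     # id(object) -> object type name
        self.warnings = []   # collected warning strings

    def activate(self, obj, obj_type):
        self.active[id(obj)] = obj_type

    def deactivate(self, obj):
        self.active.pop(id(obj), None)

    def check_no_obj_freed(self, obj):
        # Warn if the object being freed is still tracked as active.
        obj_type = self.active.get(id(obj))
        if obj_type is not None:
            self.warnings.append(f"free active object type: {obj_type}")


class ToyRcu:
    """Toy deferred-callback queue (no real grace periods)."""
    def __init__(self, debug):
        self.debug = debug
        self.pending = []

    def call_rcu(self, head, func):
        head.func = func
        self.debug.activate(head, "rcu_head")
        self.pending.append(head)

    def process_callbacks(self):
        while self.pending:
            head = self.pending.pop(0)
            self.debug.deactivate(head)
            head.func(head)  # fails if the head was clobbered after a free


def toy_kfree(debug, head):
    """Model a free: run the debug check, then simulate memory reuse."""
    debug.check_no_obj_freed(head)
    head.func = None  # assumption: reused memory no longer holds a valid callback


if __name__ == "__main__":
    debug = DebugObjects()
    rcu = ToyRcu(debug)
    head = RcuHead()
    rcu.call_rcu(head, lambda h: None)
    toy_kfree(debug, head)       # freed while the callback is still queued
    print(debug.warnings)
```

In this model, freeing the object only after process_callbacks() has run produces no warning, while freeing it earlier both triggers the warning and leaves the queue holding a callback that can no longer be invoked safely.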