Message-ID: <20141103070608.GC1189@wfg-t540p.sh.intel.com>
Date: Mon, 3 Nov 2014 15:06:08 +0800
From: Fengguang Wu <fengguang.wu@...el.com>
To: Aaron Tomlin <atomlin@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>
Cc: Ingo Molnar <mingo@...nel.org>, LKP <lkp@...org>,
linux-kernel@...r.kernel.org
Subject: [tracer branch] kernel BUG at kernel/sched/core.c:2697!
Hi Aaron,
FYI, your patch triggered a BUG() on an old, pre-existing bug.
Hopefully it provides more information for debugging the problem.
commit 0d9e26329b0c9263d4d9e0422d80a0e73268c52f
Author: Aaron Tomlin <atomlin@...hat.com>
AuthorDate: Fri Sep 12 14:16:19 2014 +0100
Commit: Ingo Molnar <mingo@...nel.org>
CommitDate: Fri Sep 19 12:35:24 2014 +0200
sched: Add default-disabled option to BUG() when stack end location is overwritten
Currently in the event of a stack overrun a call to schedule()
does not check for this type of corruption. This corruption is
often silent and can go unnoticed. However once the corrupted
region is examined at a later stage, the outcome is undefined
and often results in a sporadic page fault which cannot be
handled.
This patch checks for a stack overrun and takes appropriate
action since the damage is already done, there is no point
in continuing.
Signed-off-by: Aaron Tomlin <atomlin@...hat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
Cc: aneesh.kumar@...ux.vnet.ibm.com
Cc: dzickus@...hat.com
Cc: bmr@...hat.com
Cc: jcastillo@...hat.com
Cc: oleg@...hat.com
Cc: riel@...hat.com
Cc: prarit@...hat.com
Cc: jgh@...hat.com
Cc: minchan@...nel.org
Cc: mpe@...erman.id.au
Cc: tglx@...utronix.de
Cc: rostedt@...dmis.org
Cc: hannes@...xchg.org
Cc: Alexei Starovoitov <ast@...mgrid.com>
Cc: Al Viro <viro@...iv.linux.org.uk>
Cc: Andi Kleen <ak@...ux.intel.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>
Cc: Dan Streetman <ddstreet@...e.org>
Cc: Davidlohr Bueso <davidlohr@...com>
Cc: David S. Miller <davem@...emloft.net>
Cc: Kees Cook <keescook@...omium.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Lubomir Rintel <lkundrak@...sk>
Cc: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1410527779-8133-4-git-send-email-atomlin@redhat.com
Signed-off-by: Ingo Molnar <mingo@...nel.org>
===================================================
PARENT COMMIT NOT CLEAN. LOOK OUT FOR WRONG BISECT!
===================================================
The dmesg for the parent commit is attached as well, to help confirm whether this is a noise error.
+---------------------------------------------------+------------+------------+-----------+
| | a70857e46d | 0d9e26329b | v3.18-rc2 |
+---------------------------------------------------+------------+------------+-----------+
| boot_successes | 0 | 0 | 0 |
| boot_failures | 312 | 78 | 42 |
| BUG:kernel_boot_hang | 85 | 0 | 8 |
| BUG:kernel_boot_crashed | 168 | 0 | 7 |
| kernel_BUG_at_arch/x86/mm/physaddr.c | 3 | | |
| invalid_opcode | 3 | 78 | 27 |
| EIP_is_at__phys_addr | 3 | | |
| Kernel_panic-not_syncing:Fatal_exception | 57 | 78 | 27 |
| BUG:unable_to_handle_kernel | 54 | | |
| Oops | 54 | | |
| EIP_is_at_dequeue_task_fair | 2 | | |
| backtrace:schedule | 53 | | |
| BUG:spinlock_bad_magic_on_CPU | 3 | | |
| WARNING:at_kernel/trace/trace.c:register_tracer() | 3 | | |
| backtrace:register_tracer | 2 | 78 | 27 |
| backtrace:init_branch_tracer | 2 | 78 | 27 |
| backtrace:kernel_init_freeable | 2 | 78 | 27 |
| backtrace:kobject_create_and_add | 1 | | |
| backtrace:debugfs_init | 1 | | |
| backtrace:securityfs_init | 1 | | |
| backtrace:bus_register | 1 | | |
| backtrace:virtio_init | 1 | | |
| backtrace:panic | 1 | | |
| EIP_is_at_parameqn | 1 | | |
| backtrace:parse_args | 1 | | |
| EIP_is_at_put_prev_task_fair | 51 | | |
| kernel_BUG_at_kernel/sched/core.c | 0 | 78 | 27 |
| EIP_is_at__schedule | 0 | 78 | 27 |
+---------------------------------------------------+------------+------------+-----------+
[ 0.537047] Testing ftrace regs(no arch support): PASSED
[ 0.740057] Testing tracer branch:
[ 0.830265] ------------[ cut here ]------------
[ 0.830889] kernel BUG at kernel/sched/core.c:2697!
[ 0.831656] invalid opcode: 0000 [#1] PREEMPT SMP
[ 0.832314] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.17.0-rc4-00046-g0d9e263 #2
[ 0.833195] task: 52024430 ti: 52034000 task.ti: 52034000
[ 0.833842] EIP: 0060:[<4cfaa8b6>] EFLAGS: 00010202 CPU: 0
[ 0.834500] EIP is at __schedule+0x8e/0x126d
[ 0.835019] EAX: 00000001 EBX: 00000001 ECX: 00000206 EDX: 52024430
[ 0.835766] ESI: 00000001 EDI: 00000000 EBP: 52035e80 ESP: 52035e0c
[ 0.836510] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 0.837147] CR0: 8005003b CR2: ffffffff CR3: 0df58000 CR4: 000006d0
[ 0.837890] Stack:
[ 0.838143] 4df4c940 4df4c940 4e6fa080 4da171d4 52035e30 4bf1b8fc 4e6fa080 52501940
[ 0.839290] 52024430 52035e44 4bf1b8fc 4e6fa080 4da18b24 00000000 52035e78 4bebb4a4
[ 0.840000] 52035e78 00000206 00000000 00000000 52502308 52042d98 00000206 00000000
[ 0.840000] Call Trace:
[ 0.840000] [<4bf1b8fc>] ? trace_buffer_lock_reserve+0xf/0x31
[ 0.840000] [<4bf1b8fc>] ? trace_buffer_lock_reserve+0xf/0x31
[ 0.840000] [<4bebb4a4>] ? trace_hardirqs_on+0xb/0xd
[ 0.840000] [<4cfabb36>] schedule+0xa1/0xa4
[ 0.840000] [<4cfb39ca>] schedule_timeout+0x34f/0x37e
[ 0.840000] [<4bee1f73>] ? migrate_timer_list+0x247/0x247
[ 0.840000] [<4cfb3a4b>] schedule_timeout_uninterruptible+0x1a/0x1c
[ 0.840000] [<4bee4673>] msleep+0x17/0x1b
[ 0.840000] [<4bf1e321>] trace_selftest_startup_branch+0x34/0x72
[ 0.840000] [<4bf1e69e>] register_tracer+0x113/0x204
[ 0.840000] [<4dea2668>] ? stack_trace_init+0x77/0x77
[ 0.840000] [<4dea2695>] init_branch_tracer+0x2d/0x2f
[ 0.840000] [<4de7f00c>] do_one_initcall+0x12a/0x27b
[ 0.840000] [<4c415512>] ? strlen+0x9/0x1c
[ 0.840000] [<4be92117>] ? parse_args+0x36a/0x467
[ 0.840000] [<4de7f245>] kernel_init_freeable+0xe8/0x1aa
[ 0.840000] [<4cf87336>] kernel_init+0xe/0x13c
[ 0.840000] [<4cfb59a1>] ret_from_kernel_thread+0x21/0x30
[ 0.840000] [<4cf87328>] ? rest_init+0x12e/0x12e
[ 0.840000] Code: 0f b6 f3 89 f2 e8 d1 78 f7 fe 31 c9 b8 84 71 a1 4d 89 f2 e8 c3 78 f7 fe 8b 04 b5 10 99 a9 4d 40 84 db 89 04 b5 10 99 a9 4d 74 02 <0f> 0b 64 a1 b0 66 f4 4d 25 ff ff df 7f 31 db 48 74 0d 8b 45 ac
[ 0.840000] EIP: [<4cfaa8b6>] __schedule+0x8e/0x126d SS:ESP 0068:52035e0c
[ 0.840025] ---[ end trace 0d216f9877d1d8ba ]---
[ 0.840581] Kernel panic - not syncing: Fatal exception
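A note on reading the oops above: in the Code: line, the byte wrapped in <> is the one at the faulting EIP. Here it is 0f, followed by 0b, which together encode the x86 ud2 instruction that BUG() emits. That is consistent with the "invalid opcode" report. A minimal sketch for pulling those bytes out of a saved Code: line follows; the function names are made up for illustration and are not part of any kernel tooling.

```sh
# Print the two opcode bytes starting at the faulting EIP, taken from an
# oops "Code:" line passed as $1. The byte inside <> is the one at EIP.
faulting_bytes() {
    printf '%s\n' "$1" |
        sed -n 's/.*<\([0-9a-f][0-9a-f]\)> \([0-9a-f][0-9a-f]\).*/\1 \2/p'
}

# Hypothetical helper: succeed if those bytes are ud2 (0f 0b).
is_ud2() {
    [ "$(faulting_bytes "$1")" = "0f 0b" ]
}
```

For example, is_ud2 "$(grep 'Code:' dmesg.txt)" && echo "BUG() trap" would flag this report, assuming dmesg.txt (a placeholder name) holds a single oops.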
git bisect start cac7f2429872d3733dc3f9915857b1691da2eb2f v3.17 --
git bisect bad bf10fa857f0604865006d9705e63415b9d4e0d62 # 00:42 0- 103 Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good b528392669415dc1e53a047215e5ad6c2de879fc # 00:54 78+ 78 Merge tag 'pm+acpi-3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
git bisect good 052db7ec86dff26f734031c3ef5c2c03a94af0af # 01:02 78+ 78 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
git bisect good 77c688ac87183537ed0fb84ec2cb8fa8ec97c458 # 01:10 78+ 78 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
git bisect good ebf546cc5391b9a8a17c1196b05b4357ef0138a2 # 01:23 78+ 78 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 197fe6b0e6843b6859c6a1436ff19e3c444c0502 # 01:28 0- 1 Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 13ead805c5a14b0e7ecd34f61404a5bfba655895 # 01:49 78+ 78 Merge branch 'perf-watchdog-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad faafcba3b5e15999cf75d5c5a513ac8e47e2545f # 01:54 0- 1 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 9c368b5b6eccce1cbd7f68142106b3b4ddb1c5b5 # 02:12 78+ 78 sched, time: Fix lock inversion in thread_group_cputime()
git bisect bad a5e7be3b28a235108c59561bea55eea1072b23b0 # 02:16 12- 68 sched/deadline: Clear dl_entity params when setscheduling to different class
git bisect bad 0d9e26329b0c9263d4d9e0422d80a0e73268c52f # 02:21 0- 78 sched: Add default-disabled option to BUG() when stack end location is overwritten
git bisect good f3f1768f89d601ad29f4701deef91caaa82b9f57 # 02:30 78+ 78 sched/rt: Remove useless if from cleanup pick_next_task_rt()
git bisect good a15b12ac36ad4e7b856a4ae54937ae26a51aebad # 02:38 78+ 78 sched: Do not stop cpu in set_cpus_allowed_ptr() if task is not running
git bisect good a70857e46dd13e87ae06bf0e64cb6a2d4f436265 # 02:48 78+ 78 sched: Add helper for task stack page overrun checking
# first bad commit: [0d9e26329b0c9263d4d9e0422d80a0e73268c52f] sched: Add default-disabled option to BUG() when stack end location is overwritten
git bisect good a70857e46dd13e87ae06bf0e64cb6a2d4f436265 # 03:06 234+ 312 sched: Add helper for task stack page overrun checking
git bisect bad 4fbe40970dc154aaeeda0584aab8913fc073127b # 03:08 190- 194 Add linux-next specific files for 20141031
git bisect bad 12d7aacab56e9ef185c3a5512e867bfd3a9504e4 # 03:14 0- 109 Merge tag 'staging-3.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
git bisect bad 4fbe40970dc154aaeeda0584aab8913fc073127b # 03:14 0- 236 Add linux-next specific files for 20141031
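When a log like the one above is saved to a file, the verdict can be picked out mechanically from the "# first bad commit:" line. A small sketch, assuming the log keeps the format shown here; first_bad_commit is an illustrative name:

```sh
# Read a bisect log on stdin and print the hash from its
# "# first bad commit: [<hash>] <subject>" line, if there is one.
first_bad_commit() {
    sed -n 's/^# first bad commit: \[\([0-9a-f]*\)\].*/\1/p'
}
```

Usage would be along the lines of first_bad_commit < bisect.log, where bisect.log is a placeholder file name.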
This script may reproduce the error.
----------------------------------------------------------------------------
#!/bin/bash
kernel=$1
kvm=(
qemu-system-x86_64
-cpu kvm64
-enable-kvm
-kernel "$kernel"
-m 320
-smp 2
-net nic,vlan=1,model=e1000
-net user,vlan=1
-boot order=nc
-no-reboot
-watchdog i6300esb
-rtc base=localtime
-serial stdio
-display none
-monitor null
)
append=(
hung_task_panic=1
earlyprintk=ttyS0,115200
debug
apic=debug
sysrq_always_enabled
rcupdate.rcu_cpu_stall_timeout=100
panic=-1
softlockup_panic=1
nmi_watchdog=panic
oops=panic
load_ramdisk=2
prompt_ramdisk=0
console=ttyS0,115200
console=tty0
vga=normal
root=/dev/ram0
rw
drbd.minor_count=8
)
"${kvm[@]}" --append "${append[*]}"
----------------------------------------------------------------------------
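Since the script above sends the guest serial console to stdout (-serial stdio), its output can be captured and checked for the BUG line. A rough sketch under that assumption; hit_sched_bug, reproduce.sh and console.log are placeholder names:

```sh
# Succeed if the console log on stdin contains the scheduler BUG line.
# -F matches the text literally, so the dots are not regex wildcards.
hit_sched_bug() {
    grep -Fq 'kernel BUG at kernel/sched/core.c'
}
```

One way to use it: ./reproduce.sh bzImage > console.log; hit_sched_bug < console.log && echo reproduced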
Thanks,
Fengguang
View attachment "dmesg-quantal-ivb42-110:20141103021947:i386-randconfig-ib1-11021418:3.17.0-rc4-00046-g0d9e263:2" of type "text/plain" (19121 bytes)
View attachment "dmesg-quantal-ivb42-100:20141103025355:i386-randconfig-ib1-11021418:3.17.0-rc4-00045-ga70857e:1" of type "text/plain" (16636 bytes)
View attachment "config-3.17.0-rc4-00046-g0d9e263" of type "text/plain" (88066 bytes)
_______________________________________________
LKP mailing list
LKP@...ux.intel.com