Message-ID: <20141103070608.GC1189@wfg-t540p.sh.intel.com>
Date:	Mon, 3 Nov 2014 15:06:08 +0800
From:	Fengguang Wu <fengguang.wu@...el.com>
To:	Aaron Tomlin <atomlin@...hat.com>,
	Steven Rostedt <rostedt@...dmis.org>
Cc:	Ingo Molnar <mingo@...nel.org>, LKP <lkp@...org>,
	linux-kernel@...r.kernel.org
Subject: [tracer branch] kernel BUG at kernel/sched/core.c:2697!

Hi Aaron,

FYI, your patch triggers a BUG() on what appears to be an existing old bug.
Hopefully it provides more information for debugging the problem.

commit 0d9e26329b0c9263d4d9e0422d80a0e73268c52f
Author:     Aaron Tomlin <atomlin@...hat.com>
AuthorDate: Fri Sep 12 14:16:19 2014 +0100
Commit:     Ingo Molnar <mingo@...nel.org>
CommitDate: Fri Sep 19 12:35:24 2014 +0200

    sched: Add default-disabled option to BUG() when stack end location is overwritten
    
    Currently in the event of a stack overrun a call to schedule()
    does not check for this type of corruption. This corruption is
    often silent and can go unnoticed. However once the corrupted
    region is examined at a later stage, the outcome is undefined
    and often results in a sporadic page fault which cannot be
    handled.
    
    This patch checks for a stack overrun and takes appropriate
    action since the damage is already done, there is no point
    in continuing.
    
    Signed-off-by: Aaron Tomlin <atomlin@...hat.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
    Cc: aneesh.kumar@...ux.vnet.ibm.com
    Cc: dzickus@...hat.com
    Cc: bmr@...hat.com
    Cc: jcastillo@...hat.com
    Cc: oleg@...hat.com
    Cc: riel@...hat.com
    Cc: prarit@...hat.com
    Cc: jgh@...hat.com
    Cc: minchan@...nel.org
    Cc: mpe@...erman.id.au
    Cc: tglx@...utronix.de
    Cc: rostedt@...dmis.org
    Cc: hannes@...xchg.org
    Cc: Alexei Starovoitov <ast@...mgrid.com>
    Cc: Al Viro <viro@...iv.linux.org.uk>
    Cc: Andi Kleen <ak@...ux.intel.com>
    Cc: Andrew Morton <akpm@...ux-foundation.org>
    Cc: Dan Streetman <ddstreet@...e.org>
    Cc: Davidlohr Bueso <davidlohr@...com>
    Cc: David S. Miller <davem@...emloft.net>
    Cc: Kees Cook <keescook@...omium.org>
    Cc: Linus Torvalds <torvalds@...ux-foundation.org>
    Cc: Lubomir Rintel <lkundrak@...sk>
    Cc: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
    Link: http://lkml.kernel.org/r/1410527779-8133-4-git-send-email-atomlin@redhat.com
    Signed-off-by: Ingo Molnar <mingo@...nel.org>
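
Since the option added by this commit is disabled by default, it can help to confirm that the tested config actually turns it on. A minimal sketch, assuming the option corresponds to the Kconfig symbol SCHED_STACK_END_CHECK (the symbol name is not stated in this mail) and that the helper name is free to choose:

```shell
#!/bin/bash
# Return 0 if the given kernel config enables the stack-end check,
# non-zero otherwise. The symbol name is an assumption, see above.
stack_end_check_enabled() {
	local config=$1
	# -x: match the whole line, -F: fixed string, -q: no output
	grep -qxF -e 'CONFIG_SCHED_STACK_END_CHECK=y' -- "$config"
}

# Example, using the config file attached to this report:
#   stack_end_check_enabled config-3.17.0-rc4-00046-g0d9e263 && echo enabled
```

A line of the form "# CONFIG_... is not set" does not match, so a config with the option disabled is reported as not enabled.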

===================================================
PARENT COMMIT NOT CLEAN. LOOK OUT FOR WRONG BISECT!
===================================================

The dmesg for the parent commit is attached as well, to help confirm whether this is a noise error.

+---------------------------------------------------+------------+------------+-----------+
|                                                   | a70857e46d | 0d9e26329b | v3.18-rc2 |
+---------------------------------------------------+------------+------------+-----------+
| boot_successes                                    | 0          | 0          | 0         |
| boot_failures                                     | 312        | 78         | 42        |
| BUG:kernel_boot_hang                              | 85         | 0          | 8         |
| BUG:kernel_boot_crashed                           | 168        | 0          | 7         |
| kernel_BUG_at_arch/x86/mm/physaddr.c              | 3          |            |           |
| invalid_opcode                                    | 3          | 78         | 27        |
| EIP_is_at__phys_addr                              | 3          |            |           |
| Kernel_panic-not_syncing:Fatal_exception          | 57         | 78         | 27        |
| BUG:unable_to_handle_kernel                       | 54         |            |           |
| Oops                                              | 54         |            |           |
| EIP_is_at_dequeue_task_fair                       | 2          |            |           |
| backtrace:schedule                                | 53         |            |           |
| BUG:spinlock_bad_magic_on_CPU                     | 3          |            |           |
| WARNING:at_kernel/trace/trace.c:register_tracer() | 3          |            |           |
| backtrace:register_tracer                         | 2          | 78         | 27        |
| backtrace:init_branch_tracer                      | 2          | 78         | 27        |
| backtrace:kernel_init_freeable                    | 2          | 78         | 27        |
| backtrace:kobject_create_and_add                  | 1          |            |           |
| backtrace:debugfs_init                            | 1          |            |           |
| backtrace:securityfs_init                         | 1          |            |           |
| backtrace:bus_register                            | 1          |            |           |
| backtrace:virtio_init                             | 1          |            |           |
| backtrace:panic                                   | 1          |            |           |
| EIP_is_at_parameqn                                | 1          |            |           |
| backtrace:parse_args                              | 1          |            |           |
| EIP_is_at_put_prev_task_fair                      | 51         |            |           |
| kernel_BUG_at_kernel/sched/core.c                 | 0          | 78         | 27        |
| EIP_is_at__schedule                               | 0          | 78         | 27        |
+---------------------------------------------------+------------+------------+-----------+
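
To compare how often one signature shows up per kernel, the table above can be reduced to "hits/boot_failures" pairs. This is only a sketch: the function name is made up, and it assumes the table layout shown above (a header row with an empty first cell, then one row per signature, empty cells meaning zero).

```shell
#!/bin/bash
# Read the summary table on stdin and print "<kernel> <hits>/<boot_failures>"
# for the signature named in $1. Returns non-zero if the row is not found.
signature_rates() {
	local sig=$1
	awk -F'|' -v sig="$sig" '
		function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
		NF < 5 { next }    # border lines contain no "|" separators
		trim($2) == "" { for (i = 3; i < NF; i++) name[i] = trim($i); next }
		trim($2) == "boot_failures" { for (i = 3; i < NF; i++) fail[i] = trim($i) }
		trim($2) == sig { for (i = 3; i < NF; i++) hit[i] = trim($i) + 0; found = 1 }
		END {
			if (!found) exit 1
			for (i = 3; i in name; i++)
				printf "%s %d/%d\n", name[i], hit[i], fail[i]
		}'
}

# Example: signature_rates 'kernel_BUG_at_kernel/sched/core.c' < table.txt
```

For the kernel_BUG_at_kernel/sched/core.c row this gives 0/312 for the parent commit, 78/78 for 0d9e26329b and 27/42 for v3.18-rc2.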

[    0.537047] Testing ftrace regs(no arch support): PASSED
[    0.740057] Testing tracer branch: 
[    0.830265] ------------[ cut here ]------------
[    0.830889] kernel BUG at kernel/sched/core.c:2697!
[    0.831656] invalid opcode: 0000 [#1] PREEMPT SMP 
[    0.832314] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.17.0-rc4-00046-g0d9e263 #2
[    0.833195] task: 52024430 ti: 52034000 task.ti: 52034000
[    0.833842] EIP: 0060:[<4cfaa8b6>] EFLAGS: 00010202 CPU: 0
[    0.834500] EIP is at __schedule+0x8e/0x126d
[    0.835019] EAX: 00000001 EBX: 00000001 ECX: 00000206 EDX: 52024430
[    0.835766] ESI: 00000001 EDI: 00000000 EBP: 52035e80 ESP: 52035e0c
[    0.836510]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[    0.837147] CR0: 8005003b CR2: ffffffff CR3: 0df58000 CR4: 000006d0
[    0.837890] Stack:
[    0.838143]  4df4c940 4df4c940 4e6fa080 4da171d4 52035e30 4bf1b8fc 4e6fa080 52501940
[    0.839290]  52024430 52035e44 4bf1b8fc 4e6fa080 4da18b24 00000000 52035e78 4bebb4a4
[    0.840000]  52035e78 00000206 00000000 00000000 52502308 52042d98 00000206 00000000
[    0.840000] Call Trace:
[    0.840000]  [<4bf1b8fc>] ? trace_buffer_lock_reserve+0xf/0x31
[    0.840000]  [<4bf1b8fc>] ? trace_buffer_lock_reserve+0xf/0x31
[    0.840000]  [<4bebb4a4>] ? trace_hardirqs_on+0xb/0xd
[    0.840000]  [<4cfabb36>] schedule+0xa1/0xa4
[    0.840000]  [<4cfb39ca>] schedule_timeout+0x34f/0x37e
[    0.840000]  [<4bee1f73>] ? migrate_timer_list+0x247/0x247
[    0.840000]  [<4cfb3a4b>] schedule_timeout_uninterruptible+0x1a/0x1c
[    0.840000]  [<4bee4673>] msleep+0x17/0x1b
[    0.840000]  [<4bf1e321>] trace_selftest_startup_branch+0x34/0x72
[    0.840000]  [<4bf1e69e>] register_tracer+0x113/0x204
[    0.840000]  [<4dea2668>] ? stack_trace_init+0x77/0x77
[    0.840000]  [<4dea2695>] init_branch_tracer+0x2d/0x2f
[    0.840000]  [<4de7f00c>] do_one_initcall+0x12a/0x27b
[    0.840000]  [<4c415512>] ? strlen+0x9/0x1c
[    0.840000]  [<4be92117>] ? parse_args+0x36a/0x467
[    0.840000]  [<4de7f245>] kernel_init_freeable+0xe8/0x1aa
[    0.840000]  [<4cf87336>] kernel_init+0xe/0x13c
[    0.840000]  [<4cfb59a1>] ret_from_kernel_thread+0x21/0x30
[    0.840000]  [<4cf87328>] ? rest_init+0x12e/0x12e
[    0.840000] Code: 0f b6 f3 89 f2 e8 d1 78 f7 fe 31 c9 b8 84 71 a1 4d 89 f2 e8 c3 78 f7 fe 8b 04 b5 10 99 a9 4d 40 84 db 89 04 b5 10 99 a9 4d 74 02 <0f> 0b 64 a1 b0 66 f4 4d 25 ff ff df 7f 31 db 48 74 0d 8b 45 ac
[    0.840000] EIP: [<4cfaa8b6>] __schedule+0x8e/0x126d SS:ESP 0068:52035e0c
[    0.840025] ---[ end trace 0d216f9877d1d8ba ]---
[    0.840581] Kernel panic - not syncing: Fatal exception
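
Two quick sanity checks on the oops above can be scripted. The sketch below pulls the marked byte pair out of the Code: line (0f 0b is the x86 ud2 instruction, which BUG() uses, consistent with the "invalid opcode" report) and estimates how much kernel stack was in use from the ti and ESP values. The function names are hypothetical, and the 8 KiB stack size is an assumption about this i386 build, not something the log states.

```shell
#!/bin/bash
# Print the byte marked <xx> in an oops "Code:" line (where EIP points)
# together with the byte that follows it.
faulting_bytes() {
	local code=$1
	local re='<([0-9a-f]{2})> ([0-9a-f]{2})'
	if [[ $code =~ $re ]]; then
		echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]}"
	else
		return 1
	fi
}

# Print the number of stack bytes in use, assuming the stack occupies
# [ti, ti + thread_size) and grows down. Arguments are hex without "0x".
stack_bytes_used() {
	local ti=$1 esp=$2 thread_size=${3:-8192}   # 8 KiB is an assumption
	echo $(( 0x$ti + thread_size - 0x$esp ))
}

code='4d 74 02 <0f> 0b 64 a1 b0'    # excerpt of the Code: line above
faulting_bytes "$code"               # prints: 0f 0b
stack_bytes_used 52034000 52035e0c   # prints: 500
```

Under that assumption the stack pointer sits only about 500 bytes below the top of the stack when the BUG fires. This is just a reading of the numbers, but it would fit the overwrite having happened earlier rather than the stack being deep at this moment.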

git bisect start cac7f2429872d3733dc3f9915857b1691da2eb2f v3.17 --
git bisect  bad bf10fa857f0604865006d9705e63415b9d4e0d62  # 00:42      0-    103  Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good b528392669415dc1e53a047215e5ad6c2de879fc  # 00:54     78+     78  Merge tag 'pm+acpi-3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
git bisect good 052db7ec86dff26f734031c3ef5c2c03a94af0af  # 01:02     78+     78  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
git bisect good 77c688ac87183537ed0fb84ec2cb8fa8ec97c458  # 01:10     78+     78  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
git bisect good ebf546cc5391b9a8a17c1196b05b4357ef0138a2  # 01:23     78+     78  Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect  bad 197fe6b0e6843b6859c6a1436ff19e3c444c0502  # 01:28      0-      1  Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 13ead805c5a14b0e7ecd34f61404a5bfba655895  # 01:49     78+     78  Merge branch 'perf-watchdog-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect  bad faafcba3b5e15999cf75d5c5a513ac8e47e2545f  # 01:54      0-      1  Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 9c368b5b6eccce1cbd7f68142106b3b4ddb1c5b5  # 02:12     78+     78  sched, time: Fix lock inversion in thread_group_cputime()
git bisect  bad a5e7be3b28a235108c59561bea55eea1072b23b0  # 02:16     12-     68  sched/deadline: Clear dl_entity params when setscheduling to different class
git bisect  bad 0d9e26329b0c9263d4d9e0422d80a0e73268c52f  # 02:21      0-     78  sched: Add default-disabled option to BUG() when stack end location is overwritten
git bisect good f3f1768f89d601ad29f4701deef91caaa82b9f57  # 02:30     78+     78  sched/rt: Remove useless if from cleanup pick_next_task_rt()
git bisect good a15b12ac36ad4e7b856a4ae54937ae26a51aebad  # 02:38     78+     78  sched: Do not stop cpu in set_cpus_allowed_ptr() if task is not running
git bisect good a70857e46dd13e87ae06bf0e64cb6a2d4f436265  # 02:48     78+     78  sched: Add helper for task stack page overrun checking
# first bad commit: [0d9e26329b0c9263d4d9e0422d80a0e73268c52f] sched: Add default-disabled option to BUG() when stack end location is overwritten
git bisect good a70857e46dd13e87ae06bf0e64cb6a2d4f436265  # 03:06    234+    312  sched: Add helper for task stack page overrun checking
git bisect  bad 4fbe40970dc154aaeeda0584aab8913fc073127b  # 03:08    190-    194  Add linux-next specific files for 20141031
git bisect  bad 12d7aacab56e9ef185c3a5512e867bfd3a9504e4  # 03:14      0-    109  Merge tag 'staging-3.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
git bisect  bad 4fbe40970dc154aaeeda0584aab8913fc073127b  # 03:14      0-    236  Add linux-next specific files for 20141031
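
The log above carries trailing "# ..." annotations (timestamps, counts and commit subjects) on each step. A small sketch for reducing it to bare git bisect commands follows; the function name is made up, and whether the result is directly usable with git bisect replay has not been tested here.

```shell
#!/bin/bash
# Read an annotated bisect log on stdin and print plain "git bisect" commands:
# drop comment-only and blank lines, strip trailing "# ..." annotations,
# and squeeze runs of whitespace to a single space.
plain_bisect_log() {
	sed -E \
		-e '/^#/d' \
		-e '/^[[:space:]]*$/d' \
		-e 's/[[:space:]]+#.*$//' \
		-e 's/[[:space:]]+/ /g'
}

# Example: plain_bisect_log < bisect.log > bisect-plain.log
```

Note that the log continues past the "first bad commit" line with extra verification runs, so only the part before that line describes the bisection itself.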


This script may reproduce the error.

----------------------------------------------------------------------------
#!/bin/bash

kernel=$1

kvm=(
	qemu-system-x86_64
	-cpu kvm64
	-enable-kvm
	-kernel "$kernel"
	-m 320
	-smp 2
	-net nic,vlan=1,model=e1000
	-net user,vlan=1
	-boot order=nc
	-no-reboot
	-watchdog i6300esb
	-rtc base=localtime
	-serial stdio
	-display none
	-monitor null 
)

append=(
	hung_task_panic=1
	earlyprintk=ttyS0,115200
	debug
	apic=debug
	sysrq_always_enabled
	rcupdate.rcu_cpu_stall_timeout=100
	panic=-1
	softlockup_panic=1
	nmi_watchdog=panic
	oops=panic
	load_ramdisk=2
	prompt_ramdisk=0
	console=ttyS0,115200
	console=tty0
	vga=normal
	root=/dev/ram0
	rw
	drbd.minor_count=8
)

"${kvm[@]}" --append "${append[*]}"
----------------------------------------------------------------------------
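
Because the failure does not show up on every boot of every kernel in the table, it may be worth running the script several times and counting the hits. A rough sketch, assuming the script above is saved as ./reproduce.sh (a hypothetical name) and that the console output reaches stdout or stderr:

```shell
#!/bin/bash
# Run a command N times and print "<hits>/<runs>", where a hit is a run
# whose output contains the BUG signature from this report.
count_bug_hits() {
	local runs=$1 hits=0 i
	shift
	for (( i = 0; i < runs; i++ )); do
		# grep -c reads all input, so the command is never cut short
		if "$@" 2>&1 | grep -cF -e 'kernel BUG at kernel/sched/core.c' >/dev/null; then
			hits=$(( hits + 1 ))
		fi
	done
	echo "$hits/$runs"
}

# Example: count_bug_hits 10 ./reproduce.sh /path/to/bzImage
```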

Thanks,
Fengguang

View attachment "dmesg-quantal-ivb42-110:20141103021947:i386-randconfig-ib1-11021418:3.17.0-rc4-00046-g0d9e263:2" of type "text/plain" (19121 bytes)

View attachment "dmesg-quantal-ivb42-100:20141103025355:i386-randconfig-ib1-11021418:3.17.0-rc4-00045-ga70857e:1" of type "text/plain" (16636 bytes)

View attachment "config-3.17.0-rc4-00046-g0d9e263" of type "text/plain" (88066 bytes)

_______________________________________________
LKP mailing list
LKP@...ux.intel.com
