[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20110530161014.GS25813@smp.if.uj.edu.pl>
Date: Mon, 30 May 2011 18:10:14 +0200
From: "Witold Baryluk" <baryluk@....if.uj.edu.pl>
To: Andy Lutomirski <luto@....EDU>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Andi Kleen <andi@...stfloor.org>,
Eric Dumazet <eric.dumazet@...il.com>,
Ingo Molnar <mingo@...hat.com>,
John Stultz <johnstul@...ibm.com>,
"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
linux-kernel@...r.kernel.org
Subject: Kernel OOPS in function_graph_tracer due to the
44259b1abfaa8bb819d25d41d71e8e33e25dd36a commit (x86-64: Move vread_tsc
into a new file with sensible options)
Hi,
I found yesterday a problem when booting system on Pentium-M, 32-bit.
I got approximetly this
[ 0.000000] Fast TSC calibration using PIT
[ 0.000000] Detected 3199.363 MHz processor.
[ 0.020197] Calibrating delay loop (skipped), value calculated using timer frequency.. 6398.72 BogoMIPS (lpj=31993630)
[ 0.020763] pid_max: default: 32768 minimum: 301
[ 0.022972] Mount-cache hash table entries: 512
[ 0.040440] CPU: Intel Pentium III (Katmai) stepping 03
[ 0.042789] ftrace: allocating 6263 entries in 13 pages
[ 0.062128] Performance Events:
[ 0.062505] no APIC, boot with the "lapic" boot parameter to force-enable it.
[ 0.062818] no hardware sampling interrupt available.
[ 0.063038] Broken PMU hardware detected, using software events only.
[ 0.072671] Testing tracer nop: PASSED
[ 0.073477] NMI watchdog disabled (cpu0): hardware events not enabled
[ 0.077682] devtmpfs: initialized
[ 0.090118] kworker/u:0 used greatest stack depth: 6840 bytes left
[ 0.105247] atomic64 test passed for i586+ platform with CX8 and with SSE
[ 0.112515] kworker/u:0 used greatest stack depth: 6700 bytes left
[ 0.198527] bio: create slab <bio-0> at 0
[ 0.212671] Switching to clocksource pit
[ 0.882202] kworker/u:0 used greatest stack depth: 6636 bytes left
[ 1.070749] platform rtc_cmos: registered platform RTC device (no PNP device found)
[ 1.114000] Initializing RT-Tester: OK
[ 1.119256] Testing tracer function: PASSED
[ 1.268289] Testing dynamic ftrace: PASSED
[ 1.517992] Testing dynamic ftrace ops #1: (1 0 1 1 0) (1 1 2 1 0) (2 1 3 1 35342) (2 2 4 1 35634) PASSED
[ 1.591847] Testing dynamic ftrace ops #2: (1 0 1 35341 0) (1 1 2 35612 0) (2 1 3 1 8) (2 2 4 249 256) PASSED
[ 1.707799] Testing tracer irqsoff: PASSED
[ 1.733435] Testing tracer preemptoff: PASSED
[ 1.752663] Testing tracer preemptirqsoff: PASSED
[ 1.769764] Testing tracer wakeup:
[ 2.068581] Switching to clocksource tsc
[ 2.108718] PASSED
[ 2.108928] Testing tracer wakeup_rt: PASSED
[ 2.459170] Testing tracer function_graph:
[ 2.466979] BUG: unable to handle kernel paging request at e421cc10
[ 2.466979] IP: [<c10850bc>] ftrace_push_return_trace+0x6c/0xd0
[ 2.466979] *pdpt = 00000000013b4001 *pde = 0000000000000000
[ 2.466979] Thread overran stack, or stack corrupted
[ 2.466979] Oops: 0002 [#1] PREEMPT
[ 2.466979]
[ 2.466979] Pid: 1, comm: swapper Not tainted 2.6.39-t43+ #40 Bochs Bochs
[ 2.466979] EIP: 0060:[<c10850bc>] EFLAGS: 00000006 CPU: 0
[ 2.466979] EIP is at ftrace_push_return_trace+0x6c/0xd0
[ 2.466979] EAX: c7833480 EBX: c7832020 ECX: c1070aac EDX: 1c9e9790
[ 2.466979] ESI: 986b5466 EDI: 00000000 EBP: c78335ac ESP: c7833590
[ 2.466979] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[ 2.466979] Process swapper (pid: 1, ti=c7832000 task=c7832020 task.ti=c7834000)
[ 2.466979] Stack:
[ 2.466979] c105a9fc c78335c0 c1007d26 c1070aac c78335f4 c1007d26 c1070aac c78335d0
[ 2.466979] c10150a2 c7833600 c7833600 c1007d26 00000000 c7832020 c1007d26 c7910800
[ 2.466979] c78335f0 c117e212 c1007d26 c7833638 c7832000 c1007d2b 00000046 c13551c0
[ 2.466979] Call Trace:
[ 2.466979] [<c105a9fc>] ? trace_hardirqs_off+0xc/0x10
[ 2.466979] [<c1007d26>] ? native_sched_clock+0x6/0x70
[ 2.466979] [<c1070aac>] ? trace_clock_local+0x1c/0x50
[ 2.466979] [<c1007d26>] ? native_sched_clock+0x6/0x70
[ 2.466979] [<c1070aac>] ? trace_clock_local+0x1c/0x50
[ 2.466979] [<c10150a2>] prepare_ftrace_return+0x72/0xa0
[ 2.466979] [<c1007d26>] ? native_sched_clock+0x6/0x70
[ 2.466979] [<c1007d26>] ? native_sched_clock+0x6/0x70
[ 2.466979] [<c117e212>] ftrace_graph_caller+0x22/0x26
[ 2.466979] [<c1007d26>] ? native_sched_clock+0x6/0x70
[ 2.466979] [<c1007d2b>] ? native_sched_clock+0xb/0x70
[ 2.466979] [<c117e216>] ftrace_graph_caller+0x26/0x26
[ 2.466979] BUG: unable to handle kernel NULL pointer dereference at 0000032d
[ 2.466979] IP: [<c1005529>] print_context_stack+0xd9/0x110
[ 2.466979] *pdpt = 0000000000000000 *pde = f000ff53f000ff53
[ 2.466979] Thread overran stack, or stack corrupted
[ 2.466979] Oops: 0000 [#2] PREEMPT
[ 2.466979]
[ 2.466979] Pid: 1, comm: swapper Not tainted 2.6.39-t43+ #40 Bochs Bochs
[ 2.466979] EIP: 0060:[<c1005529>] EFLAGS: 00000002 CPU: 0
[ 2.466979] EIP is at print_context_stack+0xd9/0x110
[ 2.466979] EAX: 0000001d EBX: c78335f4 ECX: 00000001 EDX: 00000310
[ 2.466979] ESI: 00000000 EDI: c11806d4 EBP: c7833380 ESP: c783335c
[ 2.466979] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[ 2.466979] Process swapper (pid: 1, ti=c7832000 task=c7832020 task.ti=c7834000)
[ 2.466979] Stack:
[ 2.466979] c117e216 c7833ffc c7833600 ffffe000 c7832000 c7833600 c7833590 c11806d4
[ 2.466979] c7832000 c78333ac c10040dd c11806d4 c11d6c66 00000000 c783339c 00000018
[ 2.466979] 00000000 00000000 c11806d4 00000018 c78333d0 c10052ec c78335ac c11806d4
[ 2.466979] Call Trace:
[ 2.466979] [<c117e216>] ? ftrace_graph_caller+0x26/0x26
[ 2.466979] BUG: unable to handle kernel NULL pointer dereference at 0000032d
[ 2.466979] IP: [<c1005529>] print_context_stack+0xd9/0x110
[ 2.466979] *pdpt = 0000000000000000 *pde = f000ff53f000ff53
[ 2.466979] Thread overran stack, or stack corrupted
[ 2.466979] Oops: 0000 [#3] PREEMPT
[ 2.466979]
[ 2.466979] Pid: 1, comm: swapper Not tainted 2.6.39-t43+ #40 Bochs Bochs
[ 2.466979] EIP: 0060:[<c1005529>] EFLAGS: 00000002 CPU: 0
[ 2.466979] EIP is at print_context_stack+0xd9/0x110
[ 2.466979] EAX: 0000001d EBX: c783335c ECX: 00000001 EDX: 00000310
[ 2.466979] ESI: 00000000 EDI: c11806d4 EBP: c783314c ESP: c7833128
[ 2.466979] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[ 2.466979] Process swapper (pid: 1, ti=c7832000 task=c7832020 task.ti=c7834000)
[ 2.466979] Stack:
[ 2.466979] c117e216 c7833ffc c7833380 ffffe000 c7832000 c7833380 c783335c c11806d4
[ 2.466979] c7832000 c7833178 c10040dd c11806d4 c11d6c66 00000000 c7833168 00000018
[ 2.466979] 00000000 00000000 c11806d4 00000018 c783319c c10052ec c7833380 c11806d4
[ 2.466979] Call Trace:
[ 2.466979] [<c117e216>] ? ftrace_graph_caller+0x26/0x26
[ 2.466979] BUG: unable to handle kernel NULL pointer dereference at 0000032d
[ 2.466979] IP: [<c1005529>] print_context_stack+0xd9/0x110
[ 2.466979] *pdpt = 0000000000000000 *pde = f000ff53f000ff53
[ 2.466979] Thread overran stack, or stack corrupted
[ 2.466979] Oops: 0000 [#4] PREEMPT
[ 2.466979]
[ 2.466979] Pid: 1, comm: swapper Not tainted 2.6.39-t43+ #40 Bochs Bochs
[ 2.466979] EIP: 0060:[<c1005529>] EFLAGS: 00000002 CPU: 0
[ 2.466979] EIP is at print_context_stack+0xd9/0x110
[ 2.466979] EAX: 0000001d EBX: c7833128 ECX: 00000001 EDX: 00000310
[ 2.466979] ESI: 00000000 EDI: c11806d4 EBP: c7832f18 ESP: c7832ef4
[ 2.466979] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[ 2.466979] Process swapper (pid: 1, ti=c7832000 task=c7832020 task.ti=c7834000)
[ 2.466979] Stack:
[ 2.466979] c117e216 c7833ffc c783314c ffffe000 c7832000 c783314c c7833128 c11806d4
[ 2.466979] c7832000 c7832f44 c10040dd c11806d4 c11d6c66 00000000 c7832f34 00000018
full kernel log in attachment.
By bisection I got
commit 44259b1abfaa8bb819d25d41d71e8e33e25dd36a
Author: Andy Lutomirski <luto@....EDU>
Date: Mon May 23 09:31:28 2011 -0400
x86-64: Move vread_tsc into a new file with sensible options
I tested on i386, 32-bit architecture. Both gcc 4.4 and 4.6 result
in same problem. Also -O2 vs -O2 doesn't change anything.
Sometimes bug doesn't appear, or appear differently, for example
[ 1.114660] Initializing RT-Tester: OK
[ 1.118077] Testing tracer function: PASSED
[ 1.272665] Testing dynamic ftrace: PASSED
[ 1.524057] Testing dynamic ftrace ops #1: (1 0 1 1 0) (1 1 2 1 0) (2 1 3 1 44876) (2 2 4 1 45184) PASSED
[ 1.600066] Testing dynamic ftrace ops #2: (1 0 1 44944 0) (1 1 2 45228 0) (2 1 3 1 8) (2 2 4 265 272) PASSED
[ 1.717279] Testing tracer irqsoff: PASSED
[ 1.737368] Testing tracer preemptoff: PASSED
[ 1.759463] Testing tracer preemptirqsoff:
[ 1.773645] ------------[ cut here ]------------
[ 1.774220] WARNING: at kernel/lockdep.c:3351 check_flags+0x126/0x157()
[ 1.774723] Hardware name: Bochs
[ 1.775020] Pid: 1, comm: swapper Not tainted 2.6.39-t43+ #42
[ 1.775332] Call Trace:
[ 1.775560] [<c102213e>] warn_slowpath_common+0x77/0x8c
[ 1.775862] [<c104728d>] ? check_flags+0x126/0x157
[ 1.776138] [<c104728d>] ? check_flags+0x126/0x157
[ 1.776411] [<c1064f5a>] ? T.482+0x114/0x16e
[ 1.776664] [<c1022175>] warn_slowpath_null+0x22/0x24
[ 1.776950] [<c104728d>] check_flags+0x126/0x157
[ 1.777236] [<c104b0a6>] lock_acquire+0x34/0xe7
[ 1.777541] [<c1126672>] _raw_spin_lock_irqsave+0x4e/0x7e
[ 1.777839] [<c1064f5a>] ? T.482+0x114/0x16e
[ 1.778092] [<c1064f5a>] T.482+0x114/0x16e
[ 1.778341] [<c105d62c>] ? ring_buffer_unlock_commit+0x21/0x2e
[ 1.778654] [<c1026ff6>] ? __do_softirq+0x17e/0x191
[ 1.778931] [<c1026ff6>] ? __do_softirq+0x17e/0x191
[ 1.779208] [<c106502a>] stop_critical_timing+0x76/0x87
[ 1.779497] [<c1026d47>] ? __local_bh_enable+0x82/0x85
[ 1.779784] [<c106505a>] trace_preempt_on+0x1f/0x21
[ 1.780061] [<c101c17d>] sub_preempt_count+0x8b/0x98
[ 1.780341] [<c1026d47>] __local_bh_enable+0x82/0x85
[ 1.780621] [<c1026ff6>] __do_softirq+0x17e/0x191
[ 1.780892] [<c1026e78>] ? local_bh_enable+0x12/0x12
[ 1.781160] <IRQ> [<c1026c06>] ? irq_exit+0x44/0x66
[ 1.781536] [<c1003589>] ? do_IRQ+0x75/0x88
[ 1.781789] [<c1127b6e>] ? common_interrupt+0x2e/0x34
[ 1.782076] [<c106151c>] ? trace_selftest_startup_preemptirqsoff+0xae/0x1a9
[ 1.782435] [<c1061914>] ? register_tracer+0x12d/0x1fe
[ 1.782724] [<c1310344>] ? setup_command_line+0x73/0x73
[ 1.783066] [<c131c1b4>] ? init_irqsoff_tracer+0x21/0x25
[ 1.783390] [<c13101fa>] ? do_one_initcall+0x86/0x138
[ 1.783390] [<c131c193>] ? init_function_trace+0x35/0x35
[ 1.783390] [<c1310344>] ? setup_command_line+0x73/0x73
[ 1.783390] [<c13103cb>] ? kernel_init+0x87/0x11b
[ 1.783390] [<c1127b7a>] ? kernel_thread_helper+0x6/0x10
[ 1.783390] ---[ end trace 4eaa2a86a8e2da22 ]---
[ 1.783390] irq event stamp: 133534
[ 1.783390] hardirqs last enabled at (133532): [<c1126cbd>] _raw_spin_unlock_irq+0x27/0x4a
[ 1.783390] hardirqs last disabled at (133533): [<c1026fbf>] __do_softirq+0x147/0x191
[ 1.783390] softirqs last enabled at (133534): [<c1026ff6>] __do_softirq+0x17e/0x191
[ 1.783390] softirqs last disabled at (133529): [<c10037b7>] do_softirq+0x55/0xa6
[ 1.798767] PASSED
[ 1.798767] Testing tracer wakeup:
[ 2.063646] Switching to clocksource tsc
[ 2.133963] PASSED
On recent kernels it looks like
[ 1.159875] Switching to clocksource tsc
[ 1.439760] PASSED
[ 1.440212] Testing tracer function_graph:
[ 1.449049] BUG: unable to handle kernel paging request at e42d01bc
[ 1.449049] IP: [<c107586c>] ftrace_push_return_trace+0x6c/0xd0
[ 1.449049] *pdpt = 0000000001221001 *pde = 0000000000000000
[ 1.449049] Thread overran stack, or stack corrupted
[ 1.449049] Oops: 0002 [#1] PREEMPT
[ 1.449049]
[ 1.449049] Pid: -947754028, comm:
[ 1.449049] ������k������ Not tainted 3.0.0-rc1-t43 #45 Bochs Bochs
[ 1.449049] EIP: 0060:[<c107586c>] EFLAGS: 00000016 CPU: 0
[ 1.449049] EIP is at ftrace_push_return_trace+0x6c/0xd0
[ 1.449049] EAX: c7826e50 EBX: c7826ae0 ECX: c106178c EDX: 1caa936c
[ 1.449049] ESI: 5656002d EDI: 00000000 EBP: c7826f84 ESP: c7826f68
[ 1.449049] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[ 1.449049] Process � (pid: -947753988, ti=c7826000 task=c7826ae0 task.ti=c7828000)
[ 1.449049] Stack:
[ 1.449049] c106178d c7826f98 c1007a26 c106178c c7826fcc c1007a26 c106178c c7826fa8
[ 1.449049] c1014982 c7826fd8 c7826fd8 c1007a26 00000000 c7826ae0 c1007a26 c7941800
[ 1.449049] c7826fc8 c1162192 c1007a26 c7827010 c7826000 c1007a2b c783e95c 00000021
[ 1.449049] Call Trace:
[ 1.449049] [<c106178d>] ? trace_clock_local+0x1d/0x50
[ 1.449049] [<c1007a26>] ? native_sched_clock+0x6/0x70
[ 1.449049] [<c106178c>] ? trace_clock_local+0x1c/0x50
[ 1.449049] [<c1007a26>] ? native_sched_clock+0x6/0x70
[ 1.449049] [<c106178c>] ? trace_clock_local+0x1c/0x50
[ 1.449049] [<c1014982>] prepare_ftrace_return+0x72/0xa0
[ 1.449049] [<c1007a26>] ? native_sched_clock+0x6/0x70
[ 1.449049] [<c1007a26>] ? native_sched_clock+0x6/0x70
...
or
[ 0.126297] Testing tracer function: PASSED
[ 0.260366] Testing dynamic ftrace: PASSED
[ 0.498381] Testing dynamic ftrace ops #1: (1 0 1 1 0) (1 1 2 1 0) (2 1 3 1 6015) (2 2 4 1 6190) PASSED
[ 0.557775] Testing dynamic ftrace ops #2: (1 0 1 6069 0) (1 1 2 6234 0) (2 1 3 1 2) (2 2 4 152 153) PASSED
[ 0.637461] Testing tracer irqsoff: PASSED
[ 0.650073] Testing tracer wakeup: PASSED
[ 0.979100] Testing tracer wakeup_rt:
[ 1.118393] Switching to clocksource tsc
[ 1.308726] PASSED
[ 1.309162] Testing tracer function_graph:
[ 1.317873] Tracing recursion: depth[576]:HC[1023]:SC[255]:NMI[67108864]
[ 1.317873] ------------[ cut here ]------------
[ 1.317873] WARNING: at kernel/trace/ring_buffer.c:2224 trace_recursive_fail+0x30/0x90()
[ 1.317873] Hardware name: Bochs
[ 1.317873] Pid: 1, comm: Not tainted 3.0.0-rc1-t43 #46
[ 1.317873] Call Trace:
[ 1.317873] [<c1022fbd>] warn_slowpath_common+0x6d/0xa0
[ 1.317873] [<c105b3c0>] ? trace_recursive_fail+0x30/0x90
[ 1.317873] [<c105b3c0>] ? trace_recursive_fail+0x30/0x90
[ 1.317873] [<c1023010>] warn_slowpath_null+0x20/0x30
[ 1.317873] [<c105b3c0>] trace_recursive_fail+0x30/0x90
[ 1.317873] [<c105ce8c>] ring_buffer_lock_reserve+0x8c/0xc0
[ 1.317873] [<c106177a>] trace_buffer_lock_reserve+0x1a/0x50
[ 1.317873] [<c1069357>] __trace_graph_return+0x37/0x90
[ 1.317873] <IRQ>
[ 1.317873] BUG: unable to handle kernel NULL pointer dereference at 00000040
[ 1.317873] IP: [<c1004f95>] print_context_stack+0x55/0x110
[ 1.317873] *pde = 00000000
[ 1.317873] BUG: unable to handle kernel NULL pointer dereference at 0000004a
[ 1.317873] IP: [<c1015e0e>] no_context+0x6e/0x160
[ 1.317873] *pde = 00000000
[ 1.317873] Thread overran stack, or stack corrupted
[ 1.317873] Oops: 0000 [#1]
[ 1.317873]
[ 1.317873] Pid: -947741524, comm: Not tainted 3.0.0-rc1-t43 #46 Bochs Bochs
[ 1.317873] EIP: 0060:[<c1015e0e>] EFLAGS: 00000006 CPU: 0
[ 1.317873] EIP is at no_context+0x6e/0x160
[ 1.317873] EAX: 00000002 EBX: c7829e78 ECX: c1023e56 EDX: c7829bc0
[ 1.317873] ESI: 00000040 EDI: 00000000 EBP: c7829da8 ESP: c7829d88
[ 1.317873] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[ 1.317873] Process (pid: -947741524, ti=c7828000 task=c7829bc0 task.ti=c10019a0)
[ 1.317873] Stack:
[ 1.317873] c117be6d 00000040 00000046 c7829bc0 c7829de8 00000000 00000040 c7829bc0
[ 1.317873] c7829de8 c1015f98 c1143805 c7829dd8 c1013c3d 00000000 0000004a 0000000e
[ 1.317873] 00000000 c1408242 00000040 c140820e c7829e78 c1408242 00000040 c140820e
[ 1.317873] Call Trace:
[ 1.317873] [<c1015f98>] __bad_area_nosemaphore+0x98/0x140
[ 1.317873] [<c1143805>] ? error_code+0x5d/0x64
[ 1.317873] [<c1013c3d>] ? prepare_ftrace_return+0x4d/0xa0
[ 1.317873] [<c1016057>] bad_area_nosemaphore+0x17/0x20
[ 1.317873] [<c10164ae>] do_page_fault+0x27e/0x410
[ 1.317873] [<c1023928>] ? console_unlock+0x1c8/0x1f0
[ 1.317873] [<c1016230>] ? mm_fault_error+0x130/0x130
[ 1.317873] [<c1016230>] ? mm_fault_error+0x130/0x130
...
Depending on the luck and boot time. All above was registered using qemu.
But I have this (or slightly different - depdnding on boot,
or kernel configuration - messages, just after trace function_graph testing
started) alsoon my laptop.
Reverting commit 44259b1abfaa8bb819d25d41d71e8e33e25dd36a on top of current
kernel make bug disapear.
Disabling CONFIG_FUNCTION_GRAPH_TRACER also make bug dispear.
Bug is preserved at multiple kernel configs.
I send one which is pretty minimized.
Regards,
Witek
--
Witold Baryluk
View attachment "config-minimized.txt" of type "text/plain" (20886 bytes)
Download attachment "oops-at-44259b1abfaa8bb819d25d41d71e8e33e25dd36a.txt.gz" of type "application/octet-stream" (5364 bytes)
Download attachment "signature.asc" of type "application/pgp-signature" (199 bytes)
Powered by blists - more mailing lists