Message-ID: <20150125043428.GA6109@wfg-t540p.sh.intel.com>
Date:	Sat, 24 Jan 2015 20:34:28 -0800
From:	Fengguang Wu <fengguang.wu@...el.com>
To:	Mark Rutland <mark.rutland@....com>
Cc:	Peter Zijlstra <peterz@...radead.org>, LKP <lkp@...org>,
	linux-kernel@...r.kernel.org
Subject: [perf] WARNING: CPU: 0 PID: 1457 at kernel/events/core.c:890
 add_event_to_ctx()
Greetings,
The 0day kernel testing robot got the dmesg below, and the first bad commit is
git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/core
commit d26bb7f73a2881f2412c340a27438b185f0cc3dc
Author:     Mark Rutland <mark.rutland@....com>
AuthorDate: Wed Jan 7 15:01:54 2015 +0000
Commit:     Peter Zijlstra <peterz@...radead.org>
CommitDate: Fri Jan 23 15:17:56 2015 +0100
    perf: decouple unthrottling and rotating
    
    Currently the adjustments made as part of perf_event_task_tick use the
    percpu rotation lists to iterate over any active PMU contexts, but these
    are not used by the context rotation code, having been replaced by
    separate (per-context) hrtimer callbacks. However, some manipulation of
    the rotation lists (i.e. removal of contexts) has remained in
    perf_rotate_context. This leads to the following issues:
    
    * Contexts are not always removed from the rotation lists. Removal of
      PMUs which have been placed in rotation lists, but have not been
      removed by a hrtimer callback can result in corruption of the rotation
      lists (when memory backing the context is freed).
    
      This has been observed to result in hangs when PMU drivers built as
      modules are inserted and removed around the creation of events for
      said PMUs.
    
    * Contexts which do not require rotation may be removed from the
      rotation lists as a result of a hrtimer, and will not be considered by
      the unthrottling code in perf_event_task_tick.
    
    This patch solves these issues by moving any and all removal of contexts
    from rotation lists to only occur when the final event is removed from a
    context, mirroring the addition which only occurs when the first event
    is added to a context. The vestigial manipulation of the rotation lists
    is removed from perf_event_rotate_context.
    
    As the rotation_list variables are not used for rotation, these are
    renamed to active_ctx_list, which better matches their current function.
    perf_pmu_rotate_{start,stop} are renamed to
    perf_pmu_ctx_{activate,deactivate}.
    
    Cc: Will Deacon <will.deacon@....com>
    Cc: Paul Mackerras <paulus@...ba.org>
    Cc: Ingo Molnar <mingo@...hat.com>
    Cc: Arnaldo Carvalho de Melo <acme@...nel.org>
    Cc: Mark Rutland <mark.rutland@....com>
    Signed-off-by: Mark Rutland <mark.rutland@....com>
    Reported-by: Johannes Jensen <johannes.jensen@....com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
    Link: http://lkml.kernel.org/r/1420642914-22760-1-git-send-email-mark.rutland@arm.com
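
For readers following along, the add/remove symmetry described in the
commit message above can be modelled in user space roughly as below. This
is only a toy sketch, not the kernel code: every toy_* name is made up for
illustration, the list is a plain hand-rolled doubly linked list rather
than the kernel's list_head, and locking and per-CPU handling are left out
entirely. It shows a context joining the active list when its first event
is added and leaving it when its last event is removed.

```c
/* Toy user-space model; every toy_* name is hypothetical, not kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_ctx {
	struct toy_ctx *prev, *next;	/* links in the active-context list */
	int nr_events;			/* events currently attached */
};

/* List head: an empty list points at itself. */
static struct toy_ctx active_head = { &active_head, &active_head, 0 };

/* A context that is not on the list points at itself. */
static bool toy_ctx_on_list(const struct toy_ctx *ctx)
{
	return ctx->next != ctx;
}

static void toy_ctx_init(struct toy_ctx *ctx)
{
	ctx->prev = ctx->next = ctx;
	ctx->nr_events = 0;
}

/* Put the context on the active list; it must not already be there. */
static void toy_ctx_activate(struct toy_ctx *ctx)
{
	assert(!toy_ctx_on_list(ctx));
	ctx->next = active_head.next;
	ctx->prev = &active_head;
	active_head.next->prev = ctx;
	active_head.next = ctx;
}

/* Take the context off the active list; it must currently be there. */
static void toy_ctx_deactivate(struct toy_ctx *ctx)
{
	assert(toy_ctx_on_list(ctx));
	ctx->prev->next = ctx->next;
	ctx->next->prev = ctx->prev;
	ctx->prev = ctx->next = ctx;
}

/* Only the first event added to a context activates it. */
static void toy_add_event(struct toy_ctx *ctx)
{
	if (ctx->nr_events++ == 0)
		toy_ctx_activate(ctx);
}

/* Only the removal of the final event deactivates the context. */
static void toy_del_event(struct toy_ctx *ctx)
{
	assert(ctx->nr_events > 0);
	if (--ctx->nr_events == 0)
		toy_ctx_deactivate(ctx);
}

/* Count the contexts currently on the active list. */
static int toy_active_count(void)
{
	int n = 0;

	for (const struct toy_ctx *c = active_head.next; c != &active_head; c = c->next)
		n++;
	return n;
}
```

In this model a context with two events stays on the list after one of
them is removed and only leaves when the second one goes. Nothing else
(such as a timer callback) touches the list, which is the property the
commit message is aiming for.
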
===================================================
PARENT COMMIT NOT CLEAN. LOOK OUT FOR WRONG BISECT!
===================================================
The dmesg for the parent commit is attached too, to help confirm whether this is a noise error.
Fengguang: the old OOM errors look like independent noise.
+------------------------------------------------------------------+------------+------------+------------+
|                                                                  | 2e67200461 | d26bb7f73a | b0f9997908 |
+------------------------------------------------------------------+------------+------------+------------+
| boot_successes                                                   | 0          | 0          | 0          |
| boot_failures                                                    | 1900       | 900        | 22         |
| page_allocation_failure:order:#,mode                             | 1040       | 567        | 10         |
| Kernel_panic-not_syncing:Out_of_memory_and_no_killable_processes | 1040       | 567        | 10         |
| backtrace:ring_buffer_consumer_thread                            | 1040       | 567        | 10         |
| backtrace:lock_torture_stats                                     | 1040       | 567        | 10         |
| WARNING:at_net/netlink/genetlink.c:#genl_unbind()                | 860        | 54         |            |
| backtrace:do_group_exit                                          | 860        | 5          |            |
| backtrace:SyS_exit_group                                         | 860        | 5          |            |
| backtrace:netlink_setsockopt                                     | 236        | 49         |            |
| backtrace:SyS_setsockopt                                         | 236        | 49         |            |
| backtrace:SyS_socketcall                                         | 236        | 49         |            |
| WARNING:at_kernel/events/core.c:#add_event_to_ctx()              | 0          | 333        | 12         |
| BUG:kernel_test_hang                                             | 0          | 333        | 12         |
| backtrace:inherit_group                                          | 0          | 328        | 12         |
| backtrace:perf_event_init_task                                   | 0          | 328        | 12         |
| backtrace:do_fork                                                | 0          | 328        | 12         |
| backtrace:SyS_clone                                              | 0          | 328        | 12         |
| backtrace:perf_install_in_context                                | 0          | 5          |            |
| backtrace:SyS_perf_event_open                                    | 0          | 5          |            |
+------------------------------------------------------------------+------------+------------+------------+
[main] Setsockopt(1 8 80d1000 4) on fd 86 [1:1:1]
[main] Setsockopt(1 2a 80d1000 4) on fd 87 [1:1:1]
[   34.700861] ------------[ cut here ]------------
[   34.701372] WARNING: CPU: 0 PID: 1457 at kernel/events/core.c:890 add_event_to_ctx+0x253/0x270()
[   34.702515] CPU: 0 PID: 1457 Comm: trinity-main Not tainted 3.19.0-rc4-gd26bb7f #2
[   34.702931]  00000000 00000000 c0911e2c cd8a61df c0911e48 cd052cfa 0000037a cd0d32f3
[   34.702931]  c0c206d0 c0c20590 d3c9e0a0 c0911e58 cd052dd4 00000009 00000000 c0911e78
[   34.702931]  cd0d32f3 d3c9e214 00000000 00000000 c0c20598 00000246 c0c50990 c0911e90
[   34.702931] Call Trace:
[   34.702931]  [<cd8a61df>] dump_stack+0x16/0x18
[   34.702931]  [<cd052cfa>] warn_slowpath_common+0x6a/0xa0
[   34.702931]  [<cd0d32f3>] ? add_event_to_ctx+0x253/0x270
[   34.702931]  [<cd052dd4>] warn_slowpath_null+0x14/0x20
[   34.702931]  [<cd0d32f3>] add_event_to_ctx+0x253/0x270
[   34.702931]  [<cd0da60f>] inherit_event+0xef/0x240
[   34.702931]  [<cd0da778>] inherit_group+0x18/0x70
[   34.702931]  [<cd0d2884>] ? alloc_perf_context+0x24/0x50
[   34.702931]  [<cd0db927>] perf_event_init_task+0x117/0x310
[   34.702931]  [<cd050c67>] copy_process+0x477/0x14f0
[   34.702931]  [<cd052063>] do_fork+0xb3/0x430
[   34.702931]  [<cd0923fd>] ? do_setitimer+0x13d/0x220
[   34.702931]  [<cd09251a>] ? alarm_setitimer+0x3a/0x60
[   34.702931]  [<cd05246b>] SyS_clone+0x1b/0x20
[   34.702931]  [<cd8ad3bd>] syscall_call+0x7/0x7
[   34.702931]  [<cd8a0000>] ? xen_chk_extra_mem+0x10/0x70
[   34.702931] ---[ end trace 19d6cac21f26a758 ]---
git bisect start b0f99979082f6aafe6f2d4342e44907a4bb6b710 ec6f34e5b552fb0a52e6aae1a5afbbb1605cc6cc --
git bisect  bad 7c4e3ef2ae4f008776d1d2d13c862179146bbb07  # 08:44      0-     28  Merge 'arm-platforms/irq/die-gic-arch-extn-die-die-die' into devel-roam-rand-201501240027
git bisect  bad 5dcbd81bc253c6fb786a3c4d0c2304d00353cc83  # 08:45      0-    928  Merge 'peterz-queue/perf/urgent' into devel-roam-rand-201501240027
git bisect good bd0e15d797d00b7115e1950ee13fec7ce001f064  # 09:09    900+    900  Merge 'peterz-queue/locking/core' into devel-roam-rand-201501240027
git bisect  bad d8c008a82490f75ca16101567e167213486288aa  # 10:14    351-    358  Merge 'peterz-queue/perf/core' into devel-roam-rand-201501240027
git bisect  bad 18966e0b34261132be50b8624be368db80b529cf  # 11:19    291-    293  perf, x86: use context switch callback to flush LBR stack
git bisect good 44b4c3b252ffefe36900df247d528e9550ee20c4  # 12:31    900+    900  perf: Add pmu callbacks to track event mapping and unmapping
git bisect  bad d26bb7f73a2881f2412c340a27438b185f0cc3dc  # 13:34    509-    510  perf: decouple unthrottling and rotating
git bisect good e8923a02fab8e3a2e74cebace2ae73cbf1f0dd09  # 14:00    900+    900  x86, perf: Only allow rdpmc if a perf_event is mapped
git bisect good 2e67200461d1eec17062de4947d07f3e6afd0848  # 14:26    900+    900  x86, perf: Add /sys/devices/cpu/rdpmc=2 to allow rdpmc for all tasks
# first bad commit: [d26bb7f73a2881f2412c340a27438b185f0cc3dc] perf: decouple unthrottling and rotating
git bisect good 2e67200461d1eec17062de4947d07f3e6afd0848  # 14:44   1000+   1900  x86, perf: Add /sys/devices/cpu/rdpmc=2 to allow rdpmc for all tasks
# extra tests with DEBUG_INFO
# extra tests on HEAD of linux-devel/devel-roam-rand-201501240027
git bisect  bad b0f99979082f6aafe6f2d4342e44907a4bb6b710  # 14:48      0-     22  0day head guard for 'devel-roam-rand-201501240027'
# extra tests on tree/branch peterz-queue/perf/core
git bisect  bad 6f637dfc22bc3e963c6936cdf1bb6550a9d3e955  # 16:02    274-    278  perf,powerpc: Fix up flush_branch_stack users
# extra tests on tree/branch linus/master
git bisect good c4e00f1d31c4c83d15162782491689229bd92527  # 17:12   1000+    644  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
# extra tests on tree/branch next/master
git bisect good de3d2c5b941c632685ab58613f981bf14a42676f  # 17:23   1000+    528  Add linux-next specific files for 20150123
This script may reproduce the error.
----------------------------------------------------------------------------
#!/bin/bash
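kernel=$1	# path to the kernel image to boot
initrd=yocto-minimal-i386.cgz
# fetch the initrd once; --no-clobber skips the download if the file exists
wget --no-clobber "https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd"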
kvm=(
	qemu-system-x86_64
	-cpu kvm64
	-enable-kvm
	-kernel "$kernel"
	-initrd "$initrd"
	-m 320
	-smp 1
	-net nic,vlan=1,model=e1000
	-net user,vlan=1
	-boot order=nc
	-no-reboot
	-watchdog i6300esb
	-rtc base=localtime
	-serial stdio
	-display none
	-monitor null 
)
append=(
	hung_task_panic=1
	earlyprintk=ttyS0,115200
	debug
	apic=debug
	sysrq_always_enabled
	rcupdate.rcu_cpu_stall_timeout=100
	panic=-1
	softlockup_panic=1
	nmi_watchdog=panic
	oops=panic
	load_ramdisk=2
	prompt_ramdisk=0
	console=ttyS0,115200
	console=tty0
	vga=normal
	root=/dev/ram0
	rw
	drbd.minor_count=8
)
"${kvm[@]}" --append "${append[*]}"
----------------------------------------------------------------------------
Thanks,
Fengguang
View attachment "dmesg-yocto-ivb41-39:20150124134027:i386-randconfig-r2-0121:3.19.0-rc4-gd26bb7f:2" of type "text/plain" (316200 bytes)
View attachment "dmesg-quantal-client1-10:20150124141037:i386-randconfig-r2-0121:3.19.0-rc4-g2e67200:4" of type "text/plain" (147475 bytes)
View attachment "config-3.19.0-rc4-gd26bb7f" of type "text/plain" (88469 bytes)
_______________________________________________
LKP mailing list
LKP@...ux.intel.com