Message-ID: <20131115145944.GA3694@twins.programming.kicks-ass.net>
Date: Fri, 15 Nov 2013 15:59:44 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Vince Weaver <vincent.weaver@...ne.edu>
Cc: LKML <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...nel.org>,
Paul Mackerras <paulus@...ba.org>,
Arnaldo Carvalho de Melo <acme@...stprotocols.net>
Subject: Re: perf sw_event related lockup
On Wed, Nov 13, 2013 at 05:45:59PM -0500, Vince Weaver wrote:
> Hello
>
> so with the perf_fuzzer modified to avoid the tracepoint issues, I've
> triggered this software-event related soft lockup.
>
> From what I can gather from the backtraces, they all map to the loop
> in do_perf_sw_event() in kernel/events/core.c
>
> 	hlist_for_each_entry_rcu(event, head, hlist_entry) {
> 		if (perf_swevent_match(event, type, event_id, data, regs))
> 			perf_swevent_event(event, nr, data, regs);
> 	}
>
> is it possible to get stuck in that as an infinite loop?
>
> below is the dmesg from the lockup, I eventually had to reboot to clear
> the problem:
> [ 416.755310] NOHZ: local_softirq_pending 100
> [ 452.232000] BUG: soft lockup - CPU#1 stuck for 23s! [perf_fuzzer:7211]
> [ 452.232000] RIP: 0010:[<ffffffff810caa4c>] [<ffffffff810caa4c>] __perf_sw_event+0x9a/0x1a5
> [ 452.232000] Call Trace:
> [ 452.232000] [<ffffffff81520fa3>] ? __do_page_fault+0x191/0x3f5
> [ 452.232000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
> [ 452.232000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
> [ 452.232000] [<ffffffff8151e232>] ? page_fault+0x22/0x30
> [ 452.232000] [<ffffffff8129cfe0>] ? __put_user_4+0x20/0x30
> [ 452.232000] [<ffffffff81066629>] ? schedule_tail+0x5c/0x60
> [ 452.232000] [<ffffffff81524b7f>] ? ret_from_fork+0xf/0xb0
> [ 480.232000] RIP: 0010:[<ffffffff810cab0c>] [<ffffffff810cab0c>] __perf_sw_event+0x15a/0x1a5
> [ 480.232000] Call Trace:
> [ 480.232000] [<ffffffff81520fa3>] ? __do_page_fault+0x191/0x3f5
> [ 480.232000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
> [ 480.232000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
> [ 480.232000] [<ffffffff8151e232>] ? page_fault+0x22/0x30
> [ 480.232000] [<ffffffff8129cfe0>] ? __put_user_4+0x20/0x30
> [ 480.232000] [<ffffffff81066629>] ? schedule_tail+0x5c/0x60
> [ 480.232000] [<ffffffff81524b7f>] ? ret_from_fork+0xf/0xb0
> [ 486.528000] RIP: 0010:[<ffffffff8129c063>] [<ffffffff8129c063>] delay_tsc+0x23/0x50
> [ 486.528000] Call Trace:
> [ 486.528000] <EOI> [<ffffffff810cab38>] ? __perf_sw_event+0x186/0x1a5
> [ 486.528000] [<ffffffff810cab40>] ? __perf_sw_event+0x18e/0x1a5
> [ 486.528000] [<ffffffff81520fa3>] ? __do_page_fault+0x191/0x3f5
> [ 486.528000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
> [ 486.528000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
> [ 486.528000] [<ffffffff8151e232>] ? page_fault+0x22/0x30
> [ 486.528000] [<ffffffff8129cfe0>] ? __put_user_4+0x20/0x30
> [ 486.528000] [<ffffffff81066629>] ? schedule_tail+0x5c/0x60
> [ 486.528000] [<ffffffff81524b7f>] ? ret_from_fork+0xf/0xb0
> [ 486.528000] Call Trace:
> [ 486.528000] [<ffffffff810cab38>] ? __perf_sw_event+0x186/0x1a5
> [ 486.528000] [<ffffffff810cab40>] ? __perf_sw_event+0x18e/0x1a5
> [ 486.528000] [<ffffffff81520fa3>] ? __do_page_fault+0x191/0x3f5
> [ 486.528000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
> [ 486.528000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
> [ 486.528000] [<ffffffff8151e232>] ? page_fault+0x22/0x30
> [ 486.528000] [<ffffffff8129cfe0>] ? __put_user_4+0x20/0x30
> [ 486.528000] [<ffffffff81066629>] ? schedule_tail+0x5c/0x60
> [ 486.528000] [<ffffffff81524b7f>] ? ret_from_fork+0xf/0xb0
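On the question of whether that loop can spin forever: hlist_for_each_entry_rcu()
only terminates when it reaches a NULL ->next pointer, so the walk in
do_perf_sw_event() would indeed never finish if a node on that hash list ever
ended up pointing back at itself (or at an earlier node). Purely as an
illustration of that failure mode, here is a minimal userspace sketch whose
structs merely mimic the kernel's hlist; it is not a claim that list corruption
is what actually happened here:

/*
 * Userspace illustration only; the struct names just mimic the kernel's
 * hlist.  Shows how an hlist-style walk never terminates once a node's
 * ->next pointer is corrupted to point back at itself.
 */
#include <stdio.h>
#include <stddef.h>

struct hlist_node { struct hlist_node *next; };
struct hlist_head { struct hlist_node *first; };

struct fake_event {
	int id;
	struct hlist_node hlist_entry;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

int main(void)
{
	struct fake_event a = { .id = 1 }, b = { .id = 2 };
	struct hlist_head head = { .first = &a.hlist_entry };
	struct hlist_node *pos;
	long visits = 0;

	a.hlist_entry.next = &b.hlist_entry;
	b.hlist_entry.next = &b.hlist_entry;	/* corruption: self link, never NULL */

	for (pos = head.first; pos; pos = pos->next) {
		struct fake_event *ev = container_of(pos, struct fake_event, hlist_entry);

		if (++visits > 10) {	/* artificial cap so the demo returns */
			printf("still on event %d after %ld steps; without the cap this never ends\n",
			       ev->id, visits);
			break;
		}
	}
	return 0;
}

Compiled stand-alone, the walk keeps revisiting the self-linked node until the
artificial cap breaks out, which is the kind of behaviour the soft lockup
watchdog would flag.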
Please enable CONFIG_FRAME_POINTER to get better backtraces; that said, the
traces above suggest the page-fault swevent. I'll have a look.
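
For reference, the page-fault software event in question is the one userspace
opens via perf_event_open() with PERF_TYPE_SOFTWARE / PERF_COUNT_SW_PAGE_FAULTS;
every fault then goes through __do_page_fault() -> __perf_sw_event(), which is
where the traces above show the CPU stuck. A minimal sketch of opening and
reading such an event (assuming only the standard syscall interface; this is
not the perf_fuzzer code):

/*
 * Illustration only, not the actual fuzzer: count this task's page
 * faults through the PERF_COUNT_SW_PAGE_FAULTS software event.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <linux/perf_event.h>

static int perf_event_open(struct perf_event_attr *attr, pid_t pid,
			   int cpu, int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

int main(void)
{
	struct perf_event_attr attr;
	long long count;
	char *p;
	int fd, i;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_SOFTWARE;
	attr.config = PERF_COUNT_SW_PAGE_FAULTS;
	attr.disabled = 1;

	fd = perf_event_open(&attr, 0, -1, -1, 0);	/* this task, any CPU */
	if (fd < 0) {
		perror("perf_event_open");
		return 1;
	}
	ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

	/* Touch fresh anonymous pages so __do_page_fault() fires. */
	p = mmap(NULL, 16 * 4096, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	for (i = 0; i < 16; i++)
		p[i * 4096] = 1;

	read(fd, &count, sizeof(count));
	printf("page faults counted: %lld\n", count);
	close(fd);
	return 0;
}

Run as-is it just prints a small page-fault count; the fuzzer presumably hits
the same kernel path with far more hostile attr settings.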