Message-ID: <20201123142321.GP3021@hirez.programming.kicks-ass.net>
Date: Mon, 23 Nov 2020 15:23:21 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Ingo Molnar <mingo@...nel.org>, Borislav Petkov <bp@...en8.de>,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
LKML <linux-kernel@...r.kernel.org>,
Stephane Eranian <eranian@...gle.com>,
Kan Liang <kan.liang@...ux.intel.com>,
John Sperbeck <jsperbeck@...gle.com>,
"Lendacky, Thomas" <Thomas.Lendacky@....com>
Subject: Re: [RFC] perf/x86: Fix a warning on x86_pmu_stop()

On Sat, Nov 21, 2020 at 11:50:11AM +0900, Namhyung Kim wrote:
> When large PEBS is enabled, the below warning is triggered:
>
> [6070379.453697] WARNING: CPU: 23 PID: 42379 at arch/x86/events/core.c:1466 x86_pmu_stop+0x95/0xa0
> ...
> [6070379.453831] Call Trace:
> [6070379.453840] x86_pmu_del+0x50/0x150
> [6070379.453845] event_sched_out.isra.0+0x95/0x200
> [6070379.453848] group_sched_out.part.0+0x53/0xd0
> [6070379.453851] __perf_event_disable+0xee/0x1e0
> [6070379.453854] event_function+0x89/0xd0
> [6070379.453859] remote_function+0x3e/0x50
> [6070379.453866] generic_exec_single+0x91/0xd0
> [6070379.453870] smp_call_function_single+0xd1/0x110
> [6070379.453874] event_function_call+0x11c/0x130
> [6070379.453877] ? task_ctx_sched_out+0x20/0x20
> [6070379.453880] ? perf_mux_hrtimer_handler+0x370/0x370
> [6070379.453882] ? event_function_call+0x130/0x130
> [6070379.453886] perf_event_for_each_child+0x34/0x80
> [6070379.453889] ? event_function_call+0x130/0x130
> [6070379.453891] _perf_ioctl+0x24b/0x6a0
> [6070379.453898] ? sched_setaffinity+0x1ad/0x2a0
> [6070379.453904] ? _cond_resched+0x15/0x30
> [6070379.453906] perf_ioctl+0x3d/0x60
> [6070379.453912] ksys_ioctl+0x87/0xc0
> [6070379.453917] __x64_sys_ioctl+0x16/0x20
> [6070379.453923] do_syscall_64+0x52/0x180
> [6070379.453928] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> The commit 3966c3feca3f ("x86/perf/amd: Remove need to check "running"
> bit in NMI handler") introduced this. It seems x86_pmu_stop can be
> called recursively (for example, when it loses some samples), as below:
>
>   x86_pmu_stop
>     intel_pmu_disable_event  (x86_pmu_disable)
>       intel_pmu_pebs_disable
>         intel_pmu_drain_pebs_buffer
>           x86_pmu_stop
>
This shouldn't be possible; intel_pmu_drain_pebs_buffer() calls
drain_pebs(.iregs=NULL), which means that __intel_pmu_pebs_event()
should not end up calling x86_pmu_stop().

Are you running some old kernel?
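
To make the argument above concrete, here is a small userspace model of the
call chain. It is only a sketch, not the kernel source: every name with a
model_ prefix, the struct fields, and the "throttle" flag are hypothetical
stand-ins. The one assumption carried over from the reasoning above is that
the drain path passes iregs == NULL and that the record handler only reaches
the stop routine when iregs is non-NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pt_regs; contents are irrelevant here. */
struct model_regs { int unused; };

/* Hypothetical stand-in for the per-event state. */
struct model_event {
	int  stop_calls;   /* how many times the stop routine was entered */
	int  samples_out;  /* records written to the output buffer */
	bool stopped;      /* set once the stop routine has finished */
	bool throttle;     /* pretend overflow handling asks for a stop */
};

static void model_pmu_stop(struct model_event *ev, int pending);

/* Models the record handler: only the interrupt path (iregs != NULL)
 * runs overflow handling and may therefore stop the event. */
static void model_pebs_event(struct model_event *ev,
			     const struct model_regs *iregs, int count)
{
	assert(ev != NULL);
	ev->samples_out += count;

	if (!iregs)
		return;		/* drain path: output only, never stops */

	if (ev->throttle)
		model_pmu_stop(ev, 0);
}

/* Models the drain helper: it always passes iregs == NULL. */
static void model_drain_pebs_buffer(struct model_event *ev, int pending)
{
	model_pebs_event(ev, NULL, pending);
}

/* Models stop -> disable -> drain. A nested entry would push
 * stop_calls to 2, which corresponds to the reported warning. */
static void model_pmu_stop(struct model_event *ev, int pending)
{
	assert(ev != NULL);
	ev->stop_calls++;
	model_drain_pebs_buffer(ev, pending);
	ev->stopped = true;
}
```

In this model, calling model_pmu_stop() with pending records flushes them
and leaves stop_calls at 1, even with throttle set, because the drain path
never takes the overflow branch. If a kernel behaves differently, the first
thing to check is whether its drain path really passes a NULL iregs.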