Message-ID: <CALCETrWoB9UNThOdbMjquA0uBhU74DqF2RB9ERX1Xi_ujyOwEA@mail.gmail.com>
Date: Mon, 13 Mar 2017 09:44:02 -0700
From: Andy Lutomirski <luto@...nel.org>
To: Vince Weaver <vincent.weaver@...ne.edu>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>
Subject: Re: perf: race with automatic rdpmc() disabling
On Mon, Mar 13, 2017 at 6:58 AM, Vince Weaver <vincent.weaver@...ne.edu> wrote:
> Hello
>
> I've been trying to track this issue down for a few days and haven't been
> able to isolate it. So maybe someone who understands low-level perf mmap
> reference counting can help here.
>
> As you might recall, 7911d3f7af14a614617e38245fedf98a724e46a9
> introduced automatic disabling of userspace rdpmc when no perf_events
> were running.
>
> I've run into a problem with PAPI when using rdpmc. If you have PAPI
> measuring events in multiple pthread threads, sometimes (but not always)
> the program will GPF because CR4/rdpmc gets turned off while events are
> still active.
>
> I've been trying to put together a reproducible test case but haven't been
> able to manage. I have a PAPI test that will show the problem about
> 50% of the time but I can't seem to isolate the problem.
>
> Any ideas?
>
> If you really want to try to reproduce it, get the current git version of
> PAPI
> git clone https://bitbucket.org/icl/papi.git
> edit src/components/perf_event/perf_event.c
> so that #define PERF_USE_RDPMC 1
> in src run ./configure , make
> then run the ./ctests/zero_pthreads test a few times. It will GPF and I'm
> relatively (though not entirely) sure it's not a PAPI issue.
> The problem does go away if you set /sys/devices/cpu/rdpmc to 2
Hmm
static void x86_pmu_event_mapped(struct perf_event *event)
{
	if (!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED))
		return;

	if (atomic_inc_return(&current->mm->context.perf_rdpmc_allowed) == 1)
	<-- thread 1 stalls here
		on_each_cpu_mask(mm_cpumask(current->mm), refresh_pce, NULL, 1);
}
Suppose you start with perf_rdpmc_allowed == 0. Thread 1 runs
x86_pmu_event_mapped, bumps the count to 1, and gets preempted (or just
runs slowly) where I marked, before the on_each_cpu_mask() call. Then
thread 2 runs the whole function, sees the count go from 1 to 2, so it
does *not* update CR4, returns to userspace, and GPFs on rdpmc.
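
To make the interleaving concrete, here is a small userspace model that replays it step by step in a single thread. It is only a sketch: cr4_pce, mapped_increment(), mapped_refresh() and replay_race() are hypothetical stand-ins invented for illustration, not kernel symbols, and the real refresh is an IPI rather than a plain store.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-ins (hypothetical, not kernel code). */
static atomic_int perf_rdpmc_allowed;	/* models mm->context.perf_rdpmc_allowed */
static bool cr4_pce;			/* models the CR4.PCE bit seen by the mm's threads */

/* First half of x86_pmu_event_mapped(): bump the count and report
 * whether this caller is the one responsible for refreshing CR4. */
static bool mapped_increment(void)
{
	return atomic_fetch_add(&perf_rdpmc_allowed, 1) + 1 == 1;
}

/* Second half: models on_each_cpu_mask(..., refresh_pce, ...). */
static void mapped_refresh(void)
{
	cr4_pce = atomic_load(&perf_rdpmc_allowed) > 0;
}

/* Replays the interleaving; returns true if thread 2 would take a GPF. */
static bool replay_race(void)
{
	atomic_store(&perf_rdpmc_allowed, 0);
	cr4_pce = false;

	bool t1_refreshes = mapped_increment();	/* thread 1: 0 -> 1, then stalls */
	bool t2_refreshes = mapped_increment();	/* thread 2: 1 -> 2 */
	assert(t1_refreshes && !t2_refreshes);
	if (t2_refreshes)
		mapped_refresh();		/* never taken: thread 2 saw 2 */

	bool t2_faults = !cr4_pce;		/* thread 2 runs rdpmc in userspace */

	if (t1_refreshes)
		mapped_refresh();		/* thread 1 resumes, too late for thread 2 */
	return t2_faults;
}
```

In this model replay_race() returns true: thread 2 reaches userspace while the modeled PCE bit is still clear, and the bit only gets set once thread 1 resumes.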
The big hammer solution is to stick a per-mm mutex around it. Let me
ponder whether a smaller hammer is available.
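
For reference, the big hammer could look roughly like the userspace sketch below, using pthreads. Everything here is an assumption made for illustration: rdpmc_lock stands in for a per-mm mutex that does not exist in the kernel today, cr4_pce models CR4.PCE, and the store to it stands in for the refresh_pce IPI. The point is only that the increment and the refresh happen in one critical section, so no thread can return before PCE is set.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NTHREADS 8

static pthread_mutex_t rdpmc_lock = PTHREAD_MUTEX_INITIALIZER; /* hypothetical per-mm mutex */
static int perf_rdpmc_allowed;	/* protected by rdpmc_lock */
static atomic_bool cr4_pce;	/* models CR4.PCE */
static atomic_int faults;	/* threads that would have taken a GPF */

/* Increment and refresh under one lock, so a second caller cannot
 * return before the first caller has enabled PCE. */
static void event_mapped_locked(void)
{
	pthread_mutex_lock(&rdpmc_lock);
	if (++perf_rdpmc_allowed == 1)
		atomic_store(&cr4_pce, true);	/* stands in for the refresh_pce IPI */
	pthread_mutex_unlock(&rdpmc_lock);
}

static void *worker(void *arg)
{
	(void)arg;
	event_mapped_locked();
	if (!atomic_load(&cr4_pce))		/* userspace rdpmc would GPF here */
		atomic_fetch_add(&faults, 1);
	return NULL;
}

/* Runs NTHREADS mappers concurrently; returns how many would have faulted. */
static int run_locked_model(void)
{
	pthread_t tids[NTHREADS];

	perf_rdpmc_allowed = 0;
	atomic_store(&cr4_pce, false);
	atomic_store(&faults, 0);

	for (int i = 0; i < NTHREADS; i++) {
		int rc = pthread_create(&tids[i], NULL, worker, NULL);
		assert(rc == 0);
		(void)rc;
	}
	for (int i = 0; i < NTHREADS; i++)
		pthread_join(tids[i], NULL);

	return atomic_load(&faults);
}
```

Any thread that gets past the lock either set the bit itself or took the lock after the thread that did, so run_locked_model() reports no faults in this model. It says nothing about the unmapped path or about what the lock would cost in the kernel.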