Message-ID: <20131119153142.GA10913@gmail.com>
Date: Tue, 19 Nov 2013 16:31:42 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: David Ahern <dsahern@...il.com>,
Namhyung Kim <namhyung@...nel.org>,
Arnaldo Carvalho de Melo <acme@...stprotocols.net>,
linux-kernel@...r.kernel.org, jolsa@...hat.com,
Frederic Weisbecker <fweisbec@...il.com>,
Mike Galbraith <efault@....de>,
Stephane Eranian <eranian@...gle.com>
Subject: Re: [PATCH 4/5] perf record: mmap output file - v5

* Peter Zijlstra <peterz@...radead.org> wrote:

> On Tue, Nov 19, 2013 at 02:13:04PM +0100, Ingo Molnar wrote:
> >
> > * Peter Zijlstra <peterz@...radead.org> wrote:
> >
> > > On Tue, Nov 19, 2013 at 12:48:10PM +0100, Peter Zijlstra wrote:
> > > > And that does indeed seem to side-step the perf sw pagefault event, but
> > > > that is arguably a perf bug.
> > >
> > > To clarify; mm/memory.c:handle_mm_fault() is where the VM counts its
> > > generic PGFAULT event, but our perf sw event is in the arch fault
> > > handler.
> > >
> > > So they count different but related things.
> >
> > I think that asymmetry was intended: we didn't want to count
> > 'synchronous' pagefaults like get_user_pages() or mlock() bringing
> > in pages, only asynchronous/real ones, or so.
>
> OK, I couldn't remember.
>
> Anyway, I don't want to hold up this patch set, and the speedup in
> the 'normal' case is nice.
>
> The only reason I reacted was because the changelog mentioned
> avoiding a feedback loop -- so I obviously had to point out that it
> didn't do such a thing, it only changed the details of the loop.

So with MAP_POPULATE the 'feedback window' is moved entirely into the
kernel (to within a single syscall) and is also reduced significantly,
compared to a write() loop.

You are right that this does not eliminate the problem, but it is
still a significant reduction of its cross section in practice.
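
To make the comparison concrete, here is a minimal sketch of the two
output paths, not the code from the patch. The helper names
write_via_populated_mmap() and write_via_loop() are hypothetical, and
the sketch assumes Linux with glibc. With MAP_POPULATE the page tables
for the mapping are prefaulted inside the single mmap() call, so the
later memcpy() should take few or no page faults of its own, while the
write() path enters the kernel once per chunk.

```c
#define _GNU_SOURCE             /* assumption: glibc; exposes MAP_POPULATE */
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: size the file, map it with MAP_POPULATE so the
 * kernel prefaults the pages inside the mmap() call, then copy the data
 * with a plain memcpy(). Returns 0 on success, -1 on error. */
static int write_via_populated_mmap(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    if (len == 0)
        return 0;

    /* The file must be at least as large as the mapping we store into. */
    if (ftruncate(fd, (off_t)len) != 0)
        return -1;

    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_POPULATE, fd, 0);
    if (map == MAP_FAILED)
        return -1;

    memcpy(map, buf, len);

    /* Flush the dirty pages to the file before dropping the mapping. */
    if (msync(map, len, MS_SYNC) != 0) {
        munmap(map, len);
        return -1;
    }
    return munmap(map, len);
}

/* Hypothetical helper for contrast: the classic write() loop, one
 * syscall per chunk, retrying on EINTR and on short writes. */
static int write_via_loop(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

Both helpers take an open, writable file descriptor and leave the same
bytes in the file; only where and how often the kernel is entered
differs, which is the 'feedback window' referred to above.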
> I'm fairly certain this particular problem is unavoidable, no matter
> what the mechanism used, you can always create feedback.

Well, we could exclude the profiling task itself from profiling events
(just like ftrace and core bits of perf do out of necessity), but
I intentionally wanted to avoid that, to make sure we are honest and
to make sure people don't tolerate profiling overhead that disturbs
other workloads.

Thanks,
Ingo