[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1270762974.20295.3185.camel@laptop>
Date: Thu, 08 Apr 2010 23:42:54 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Stephane Eranian <eranian@...gle.com>
Cc: linux-kernel@...r.kernel.org, mingo@...e.hu, paulus@...ba.org,
davem@...emloft.net, fweisbec@...il.com, robert.richter@....com,
perfmon2-devel@...ts.sf.net, eranian@...il.com
Subject: Re: [PATCH] perf_events: fix bogus warn_on(_once) in
perf_prepare_sample()
On Thu, 2010-04-08 at 23:29 +0200, Stephane Eranian wrote:
> On Thu, Apr 8, 2010 at 11:15 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> > On Thu, 2010-04-08 at 23:08 +0200, Stephane Eranian wrote:
> >> On Thu, Apr 8, 2010 at 10:55 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> >> > On Thu, 2010-04-08 at 22:45 +0200, Stephane Eranian wrote:
> >> >> There is a warn_on_once() check for PERF_SAMPLE_RAW which trips
> >> >> when using PEBS on both Core and Nehalem. Core PEBS sample size is 144
> >> >> bytes and 176 bytes for Nehalem. Both are multiples of 8, but the size
> >> >> field is encoded as int, thus the total is never a multiple of 8 which
> >> >> trips the check. I think the size should have been u64, but now it is
> >> >> too late to change given it is ABI.
> >> >
> >> > PEBS hasn't seen -linus yet, so we can fix that.
> >> >
> >> Are you suggesting you add some padding the PEBS raw sample you
> >> return as PERF_SAMPLE_RAW? Then you need to define what RAW
> >> actually means? Seems here, it would mean more than what the
> >> HW returns.
> >
> > The best fix here is to simply remove the PERF_SAMPLE_RAW support and
> > implement PERF_SAMPLE_REGS. Its just that we need to come up with a way
> > to deal with compat pt_regs muck.
> >
> >
> But isn't RAW already part of the ABI?
The existance of it, yes, the content less so.
> It is true that you need to also provide the interrupted state (or
> part of it) to the user, i.e., beyond just the IP.
Why?
The only thing I would do is maybe use the interrupt state to fill out
the PEBS reg data, its mostly things like segment regs that go missing,
and for those the interrupt state ought to be good enough.
> But the interrupt state and PEBS state are distinct and should not be
> swapped. In fact, both may be useful at the same time to evaluate the skid.
That doesn't seem like something a regular person would be interested
in, and for those who do they can easily hack the kernel.
I don't think that is a use-case worth expanding the ABI over.
> So yes, you need PERF_SAMPLE_REGS. But just like your proposal with the two IPs.
> I think you need to export two versions of REGS: IREGS and PREGS for
> instance. The issue is that PEBS does not record the whole state but only a very small subset.
>
> Then, on Nehalem, there is PEBS more where you collect more that just
> machine state, you collect cache miss information. That's not regs anymore.
Yes, the whole load-latency train-wreck is something we need to come up
with a sensible interface for, that's definitely not something I would
do through RAW.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists