[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f66f6738-c612-1535-3661-dcc654750fa8@linux.intel.com>
Date: Tue, 19 Jun 2018 09:39:02 +0800
From: "Jin, Yao" <yao.jin@...ux.intel.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Mark Rutland <mark.rutland@....com>, acme@...nel.org,
jolsa@...nel.org, mingo@...hat.com,
alexander.shishkin@...ux.intel.com, me@...ehuey.com,
Linux-kernel@...r.kernel.org, vincent.weaver@...ne.edu,
will.deacon@....com, eranian@...gle.com, namhyung@...nel.org,
ak@...ux.intel.com, kan.liang@...el.com, yao.jin@...el.com
Subject: Re: [PATCH v1 1/2] perf/core: Use sysctl to turn on/off dropping
leaked kernel samples
On 6/18/2018 6:45 PM, Peter Zijlstra wrote:
> On Mon, Jun 18, 2018 at 02:55:32PM +0800, Jin, Yao wrote:
>> Thanks for providing the patch. I understand this approach.
>>
>> In my opinion, the skid window is from counter overflow to interrupt
>> delivered. While if the skid window is too *big* (e.g. user -> kernel), it
>> should be not very useful. So personally, I'd prefer to drop the samples.
>
> I really don't get your insitence on dropping the sample. Dropping
> samples is bad. Furthermore, doing what Mark suggests actually improves
> the result by reducing the skid, if the event happened before we entered
> (as it damn well should) then the user regs, which point at the entry
> site, are a better approximation than our in-kernel set.
>
> So not only do you not loose the sample, you actually get a better
> sample.
>
OK, that's fine, thanks!
I guess Mark will post this patch, right?
Anyway looks we don't need following patch (0-stuffing sample->ip to
indicate perf tool that it is a leak sample), right?
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 80cca2b..628b515 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6361,6 +6361,21 @@ perf_callchain(struct perf_event *event, struct
pt_regs *regs)
return callchain ?: &__empty_callchain;
}
+static bool sample_is_leaked(struct perf_event *event, struct pt_regs
*regs)
+{
+ /*
+ * Due to interrupt latency (AKA "skid"), we may enter the
+ * kernel before taking an overflow, even if the PMU is only
+ * counting user events.
+ * To avoid leaking information to userspace, we must always
+ * reject kernel samples when exclude_kernel is set.
+ */
+ if (event->attr.exclude_kernel && !user_mode(regs))
+ return true;
+
+ return false;
+}
+
void perf_prepare_sample(struct perf_event_header *header,
struct perf_sample_data *data,
struct perf_event *event,
@@ -6480,6 +6495,9 @@ void perf_prepare_sample(struct perf_event_header
*header,
if (sample_type & PERF_SAMPLE_PHYS_ADDR)
data->phys_addr = perf_virt_to_phys(data->addr);
+
+ if (sample_is_leaked(event, regs))
+ data->ip = 0;
}
static void __always_inline
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index bfa60bc..1bfb697 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -404,6 +404,7 @@ struct events_stats {
u64 total_aux_lost;
u64 total_aux_partial;
u64 total_invalid_chains;
+ u64 total_dropped_samples;
u32 nr_events[PERF_RECORD_HEADER_MAX];
u32 nr_non_filtered_samples;
u32 nr_lost_warned;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 8b93693..ec923f1 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1269,6 +1269,12 @@ static int machines__deliver_event(struct
machines *machines,
++evlist->stats.nr_unprocessable_samples;
return 0;
}
+
+ if (sample->ip == 0) {
+ /* Drop the leaked kernel samples */
+ ++evlist->stats.total_dropped_samples;
+ return 0;
+ }
return perf_evlist__deliver_sample(evlist, tool, event,
sample, evsel, machine);
case PERF_RECORD_MMAP:
return tool->mmap(tool, event, sample, machine);
Thanks
Jin Yao
Powered by blists - more mailing lists