Message-ID: <37D7C6CF3E00A74B8858931C1DB2F07753698272@SHSMSX103.ccr.corp.intel.com>
Date: Wed, 11 Jan 2017 20:31:11 +0000
From: "Liang, Kan" <kan.liang@...el.com>
To: Mark Rutland <mark.rutland@....com>,
David Carrillo-Cisneros <davidcc@...gle.com>,
Peter Zijlstra <peterz@...radead.org>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>, Ingo Molnar <mingo@...hat.com>,
"Thomas Gleixner" <tglx@...utronix.de>,
Andi Kleen <ak@...ux.intel.com>,
"Borislav Petkov" <bp@...e.de>,
Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Vikas Shivappa <vikas.shivappa@...ux.intel.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Vince Weaver <vince@...ter.net>, Paul Turner <pjt@...gle.com>,
"Stephane Eranian" <eranian@...gle.com>
Subject: RE: [RFC 3/6] perf/core: use rb-tree to sched in event groups
>
> Kan, in your per-cpu event list patch you mentioned that you saw a large
> overhead in perf_iterate_ctx() when skipping events for other CPUs.
> Which callers of perf_iterate_ctx() specifically was that problematic for? Do
> those callers only care about the *active* events, for example?
>
Based on my tests, the large overhead was observed in perf_iterate_sb().
Yes, it only cares about the *active* events.
> Maybe the overhead of skipping !current_cpu events is ok at sched_in time
> in most cases. If the overhead of skipping those only matters for a subset of
> perf_iterate_ctx() callers, then maybe we can optimise them in another
> fashion (e.g. use the active events lists, or a new list specific to that iterate
> user, depending on what they actually need).
> That way we can drop cpu from the sort.
>
> > The rb-tree allows us to find events with minimum and maximum
> > timestamp for a given CPU/cgroup + flexible type. The list
> > ctx->inactive_groups is sorted by timestamp.
> >
> > We could find a list position for the first event of each CPU/cgroup
> > that is to be scheduled and iterate over all of them, selecting events
> > from the list's head with the smallest timestamp, but it's too complicated.
> >
> > A simpler alternative is to find the smallest subinterval of
> > ctx->inactive_groups that contains all eligible events. Let's call this
> > minimum subinterval S.
> >
> > S is formed of smaller, not necessarily exclusive, subintervals.
> > Each one has all the events that are eligible for a given CPU or cgroup.
> > We find S by searching for the start/end of each one of these
> > CPU/cgroup subintervals and combining them. The drawback is that there
> > may be events in S that are not eligible (since ctx->inactive_groups is
> > in stamp order).
>
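To make the subinterval idea above concrete, here is a rough userspace sketch,
not code from the RFC. It assumes the inactive list is a plain array sorted by
stamp, and it only handles the CPU key (no cgroups). struct ev, struct span,
span_for_cpu(), find_s() and ineligible_in_s() are all hypothetical names. The
per-CPU start/end lookup is a linear scan here; the RFC would presumably get
those from the rb-tree instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an event group on ctx->inactive_groups. */
struct ev {
	int cpu;                  /* -1 means "any CPU" */
	unsigned long long stamp; /* the array is sorted by this field */
};

/* Closed index range [first, last]; valid == 0 means empty. */
struct span {
	size_t first;
	size_t last;
	int valid;
};

/* Range covering every event with the given cpu (linear scan for the sketch). */
static struct span span_for_cpu(const struct ev *list, size_t n, int cpu)
{
	struct span s = { 0, 0, 0 };

	for (size_t i = 0; i < n; i++) {
		if (list[i].cpu != cpu)
			continue;
		if (!s.valid) {
			s.first = i;
			s.valid = 1;
		}
		s.last = i;
	}
	return s;
}

/* Smallest range that contains both a and b. */
static struct span span_union(struct span a, struct span b)
{
	struct span s;

	if (!a.valid)
		return b;
	if (!b.valid)
		return a;
	s.first = a.first < b.first ? a.first : b.first;
	s.last = a.last > b.last ? a.last : b.last;
	s.valid = 1;
	return s;
}

/* S for this_cpu: combine the cpu == -1 and cpu == this_cpu ranges. */
static struct span find_s(const struct ev *list, size_t n, int this_cpu)
{
	assert(this_cpu >= 0);
	return span_union(span_for_cpu(list, n, -1),
			  span_for_cpu(list, n, this_cpu));
}

/* The drawback: events inside S that still have to be skipped. */
static size_t ineligible_in_s(const struct ev *list, struct span s, int this_cpu)
{
	size_t skipped = 0;

	if (!s.valid)
		return 0;
	for (size_t i = s.first; i <= s.last; i++)
		if (list[i].cpu != -1 && list[i].cpu != this_cpu)
			skipped++;
	return skipped;
}
```

For a list whose cpu fields are {1, 0, 1, -1, 2, 0, 1} in stamp order, S for
CPU 0 spans indices 1..5, and two of the events inside it (cpu 1 and cpu 2)
are not eligible and have to be skipped during the walk.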
> The other drawback is that this is not fair, since CPU comes before runtime
> in the sort order. You'll always try some events before others (e.g. cpu == -1
> before cpu == current), before considering runtime. I believe this means
> that events can be permanently starved.
>
> So either we need to fold those together somehow, or drop CPU from the
> sort order (assuming that we can, as above).
>
> Thanks,
> Mark.
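
The fairness concern above can be illustrated with a toy model. This is a
sketch under strong assumptions, not the perf core scheduler: a fixed number
of counter "slots", every event fits in one slot, and each round simply runs
the first events in sort order for one tick. struct sched_ev, simulate() and
both comparators are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical event: bound to a CPU (-1 = any) with accumulated runtime. */
struct sched_ev {
	int cpu;
	unsigned long long runtime;
};

/* Sort key as discussed above: CPU first, runtime second. */
static int cmp_cpu_then_runtime(const void *a, const void *b)
{
	const struct sched_ev *x = a, *y = b;

	if (x->cpu != y->cpu)
		return x->cpu < y->cpu ? -1 : 1;
	if (x->runtime != y->runtime)
		return x->runtime < y->runtime ? -1 : 1;
	return 0;
}

/* Alternative: CPU dropped from the sort, runtime only. */
static int cmp_runtime_only(const void *a, const void *b)
{
	const struct sched_ev *x = a, *y = b;

	if (x->runtime != y->runtime)
		return x->runtime < y->runtime ? -1 : 1;
	return 0;
}

/* Each round: re-sort, then run the first `slots` events for one tick. */
static void simulate(struct sched_ev *evs, size_t n, size_t slots, int rounds,
		     int (*cmp)(const void *, const void *))
{
	for (int r = 0; r < rounds; r++) {
		qsort(evs, n, sizeof(*evs), cmp);
		for (size_t i = 0; i < slots && i < n; i++)
			evs[i].runtime++;
	}
}

/* Total runtime accumulated by events bound to `cpu`. */
static unsigned long long runtime_on_cpu(const struct sched_ev *evs, size_t n,
					 int cpu)
{
	unsigned long long total = 0;

	for (size_t i = 0; i < n; i++)
		if (evs[i].cpu == cpu)
			total += evs[i].runtime;
	return total;
}
```

With two cpu == -1 events, one cpu == 0 event and two slots, the CPU-first
comparator never lets the cpu == 0 event run in this model, no matter how
many rounds pass. With the runtime-only comparator it gets a turn by the
second round at the latest.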