linux-kernel - RE: [RFC 3/6] perf/core: use rb-tree to sched in event groups


Message-ID: <37D7C6CF3E00A74B8858931C1DB2F07753698272@SHSMSX103.ccr.corp.intel.com>
Date:   Wed, 11 Jan 2017 20:31:11 +0000
From:   "Liang, Kan" <kan.liang@...el.com>
To:     Mark Rutland <mark.rutland@....com>,
        David Carrillo-Cisneros <davidcc@...gle.com>,
        Peter Zijlstra <peterz@...radead.org>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "x86@...nel.org" <x86@...nel.org>, Ingo Molnar <mingo@...hat.com>,
        "Thomas Gleixner" <tglx@...utronix.de>,
        Andi Kleen <ak@...ux.intel.com>,
        "Borislav Petkov" <bp@...e.de>,
        Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Vikas Shivappa <vikas.shivappa@...ux.intel.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Vince Weaver <vince@...ter.net>, Paul Turner <pjt@...gle.com>,
        "Stephane Eranian" <eranian@...gle.com>
Subject: RE: [RFC 3/6] perf/core: use rb-tree to sched in event groups



> 
> Kan, in your per-cpu event list patch you mentioned that you saw a large
> overhead in perf_iterate_ctx() when skipping events for other CPUs.
> Which callers of perf_iterate_ctx() specifically was that problematic for? Do
> those callers only care about the *active* events, for example?
> 

In my tests, the large overhead showed up in perf_iterate_sb().
Yes, that caller only cares about the *active* events.
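
To make the cost concrete, here is a minimal sketch of that kind of walk. It is
only an illustration under assumptions: struct toy_event, struct toy_walk and
toy_iterate() are hypothetical names, and the filter is a simplified stand-in,
not the kernel's actual matching logic. The point is that every node is
visited even when few of them match.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a perf event; not the kernel's struct. */
struct toy_event {
    int cpu;                 /* -1 means "any CPU" */
    int active;              /* non-zero if the event is currently active */
    struct toy_event *next;  /* single context-wide list */
};

/* How many nodes the walk touched versus how many it actually wanted. */
struct toy_walk {
    size_t visited;
    size_t matched;
};

/* Walk the whole list, skipping inactive events and events bound to other CPUs. */
static struct toy_walk toy_iterate(const struct toy_event *head, int this_cpu)
{
    struct toy_walk w = { 0, 0 };

    assert(this_cpu >= 0);  /* the caller passes a concrete CPU number */

    for (const struct toy_event *e = head; e != NULL; e = e->next) {
        w.visited++;
        if (!e->active)
            continue;
        if (e->cpu != -1 && e->cpu != this_cpu)
            continue;
        w.matched++;
    }
    return w;
}
```

With many events bound to other CPUs, visited grows with the total number of
events while matched stays small, which is the overhead a per-CPU or
active-only list would avoid.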

> Maybe the overhead of skipping !current_cpu events is ok at sched_in time
> in most cases. If the overhead of skipping those only matters for a subset of
> perf_iterate_ctx() callers, then maybe we can optimise them in another
> fashion (e.g. use the active events lists, or a new list specific to that iterate
> user, depending on what they actually need).
> That way we can drop cpu from the sort.
> 
> > The rb-tree allows us to find events with minimum and maximum
> > timestamp for a given CPU/cgroup + flexible type. The list
> > ctx->inactive_groups is sorted by timestamp.
> >
> > We could find a list position for the first event of each CPU/cgroup
> > that is to be scheduled and iterate over all of them, selecting events
> > from the list's head with the smallest timestamp, but it's too complicated.
> >
> > A simpler alternative is to find the smallest subinterval of
> > ctx->inactive_groups that contains all eligible events. Let's call this
> > minimum subinterval S.
> >
> > S is formed of smaller, not necessarily exclusive, subintervals.
> > Each one has all the events that are eligible for a given CPU or cgroup.
> > We find S by searching for the start/end of each one of these
> > CPU/cgroup subintervals and combining them. The drawback is that there
> > may be events in S that are not eligible (since ctx->inactive_groups is
> > in stamp order).
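
The subinterval idea quoted above can be sketched roughly as follows. This is a
toy model under stated assumptions, not the RFC's code: struct toy_group,
struct toy_span and all function names are hypothetical, CPU and cgroup are
collapsed into a single integer key, and a linear scan over an array stands in
for the rb-tree lookup of each key's first and last entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inactive event group; not the kernel's struct. */
struct toy_group {
    int key;              /* CPU or cgroup id, collapsed into one int here */
    unsigned long stamp;  /* the array is assumed to be sorted by this field */
};

/* Inclusive index range; 'empty' is set when no entry matched. */
struct toy_span {
    size_t first, last;
    bool empty;
};

static bool key_is_eligible(int key, const int *keys, size_t nkeys)
{
    for (size_t k = 0; k < nkeys; k++)
        if (keys[k] == key)
            return true;
    return false;
}

/* First and last position of one key (linear scan in place of the rb-tree). */
static struct toy_span key_span(const struct toy_group *g, size_t n, int key)
{
    struct toy_span s = { 0, 0, true };

    for (size_t i = 0; i < n; i++) {
        if (g[i].key != key)
            continue;
        if (s.empty) {
            s.first = i;
            s.empty = false;
        }
        s.last = i;
    }
    return s;
}

/* S: the smallest range that covers every eligible key's span. */
static struct toy_span covering_span(const struct toy_group *g, size_t n,
                                     const int *keys, size_t nkeys)
{
    struct toy_span S = { 0, 0, true };

    assert(g != NULL && keys != NULL);

    for (size_t k = 0; k < nkeys; k++) {
        struct toy_span s = key_span(g, n, keys[k]);

        if (s.empty)
            continue;
        if (S.empty) {
            S = s;
            continue;
        }
        if (s.first < S.first)
            S.first = s.first;
        if (s.last > S.last)
            S.last = s.last;
    }
    return S;
}

/* Entries inside S that are not eligible and would have to be skipped. */
static size_t ineligible_in_span(const struct toy_group *g, struct toy_span S,
                                 const int *keys, size_t nkeys)
{
    size_t skipped = 0;

    if (S.empty)
        return 0;
    for (size_t i = S.first; i <= S.last; i++)
        if (!key_is_eligible(g[i].key, keys, nkeys))
            skipped++;
    return skipped;
}
```

ineligible_in_span() counts exactly the drawback named above: because the list
is in stamp order, entries for other keys can sit between the first and last
eligible entry and still have to be walked over.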
> 
> The other drawback is that this is not fair, since CPU comes before runtime
> in the sort order. You'll always try some events before others (e.g. cpu == -1
> before cpu == current), before considering runtime. I believe this means
> that events can be permanently starved.
> 
> So either we need to fold those together somehow, or drop CPU from the
> sort order (assuming that we can, as above).
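
The fairness concern can be illustrated with a small sketch. The assumption
here is a plain lexicographic sort on (cpu, runtime); struct toy_sched and both
comparators are hypothetical and only model the ordering being discussed, not
the RFC's actual key.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a schedulable event group; not the kernel's struct. */
struct toy_sched {
    int cpu;                /* -1 means "any CPU" */
    unsigned long runtime;  /* time the group has already run */
};

/* CPU first, runtime second: the ordering that raises the fairness concern. */
static int cmp_cpu_then_runtime(const void *a, const void *b)
{
    const struct toy_sched *x = a, *y = b;

    if (x->cpu != y->cpu)
        return x->cpu < y->cpu ? -1 : 1;
    if (x->runtime != y->runtime)
        return x->runtime < y->runtime ? -1 : 1;
    return 0;
}

/* Runtime only: what the order looks like once CPU is dropped from the key. */
static int cmp_runtime_only(const void *a, const void *b)
{
    const struct toy_sched *x = a, *y = b;

    if (x->runtime != y->runtime)
        return x->runtime < y->runtime ? -1 : 1;
    return 0;
}

static void sort_cpu_first(struct toy_sched *ev, size_t n)
{
    assert(ev != NULL);
    qsort(ev, n, sizeof(*ev), cmp_cpu_then_runtime);
}

static void sort_runtime_only(struct toy_sched *ev, size_t n)
{
    assert(ev != NULL);
    qsort(ev, n, sizeof(*ev), cmp_runtime_only);
}
```

With cpu first in the key, every cpu == -1 group sorts ahead of a group bound
to the current CPU no matter how little the latter has run, so if the counters
fill up early the bound group may never get a turn. Sorting on runtime alone
puts the least-run group first regardless of its CPU.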
> 
> Thanks,
> Mark.