lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <E4F81A5E-C371-429D-AFEC-6F4B32971F09@fb.com>
Date:   Sat, 13 Oct 2018 08:31:37 +0000
From:   Song Liu <songliubraving@...com>
To:     Peter Zijlstra <peterz@...radead.org>
CC:     Ingo Molnar <mingo@...nel.org>,
        lkml <linux-kernel@...r.kernel.org>,
        "acme@...nel.org" <acme@...nel.org>,
        "alexander.shishkin@...ux.intel.com" 
        <alexander.shishkin@...ux.intel.com>,
        "jolsa@...hat.com" <jolsa@...hat.com>,
        "eranian@...gle.com" <eranian@...gle.com>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "alexey.budankov@...ux.intel.com" <alexey.budankov@...ux.intel.com>,
        "mark.rutland@....com" <mark.rutland@....com>,
        "megha.dey@...el.com" <megha.dey@...el.com>,
        "frederic@...nel.org" <frederic@...nel.org>
Subject: Re: [RFC][PATCH] perf: Rewrite core context handling



> On Oct 12, 2018, at 2:50 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> 
> 
> Can we please not top-post?
> 
> On Thu, Oct 11, 2018 at 10:37:14PM +0000, Song Liu wrote:
>> Thanks Peter! These are really really helpful. 
>> 
>> I am trying to think through the case of a group of two events on two 
>> separate hardware PMUs. In current implementation, this will not trigger
>> move_group,
> 
> Right, currently this is disallowed (or should be, I'll need to double
> check the code).
> 
>> so they will not be able to rotate together? And actually, 
>> they may not be able to run at all? Maybe this case is never supported? 
> 
> Indeed, we do not allow mixing events of different PMUs, with the
> explicit exception of software events. Since software events must always
> schedule, they're allowed to be fitted into any group.
> 
>> On the other hand, would something like this work:
>> 
>>    perf_cpu_context <-[1:2]-> perf_event_context <-[1:n]-> perf_event 
>>              |                                                |
>>              `----[1:n]---->     pmu    <----- [1:n]----------' 
>> 

After reading the code more, I think my idea in the figure above is 
similar to this patch. The "pmu" in the figure is actually  
perf_cpu_pmu_context. And I was thinking about something similar to 
current pmu (not in the figure above). 

I spent about two hours right here try to explain my idea. I ended 
up delete everything I typed and agree almost all your design 
decisions. 

I just realized that, if we don't allow group of events on two 
different hardware PMUs, the design of this patch works very well. 
Rotation of multiple PMUs at the same time is not necessary. 

The only suggestion I have right now is on which struct owns which
data:

1. perf_cpu_context owns two perf_event_context: ctx and *task_ctx. 
   This is the same as right now. 
2. perf_event_context owns multiple perf_event_pmu_context: 
   One perf_event_pmu_context for software groups;
   One perf_event_pmu_context for each hardware PMU.
3. perf_event_pmu_context owns RB tree of events. Since we don't 
   need rotation across multiple hardware PMUs, the rotation is 
   within same perf_event_pmu_context.  
4. perf_cpu_context owns multiple perf_cpu_pmu_context:
   One perf_cpu_pmu_context for each hardware PMU.
   perf_cpu_pmu_context is tot needed for software only groups(?).
5. perf_cpu_pmu_context has two pointers of perf_event_pmu_context.


The following diff (on top of this patch) shows the idea above. 
I don't think it changes any mechanism. But it feels simpler to me. 

Thanks,
Song

diff --git i/include/linux/perf_event.h w/include/linux/perf_event.h
index 462315239f8f..b15e679d4802 100644
--- i/include/linux/perf_event.h
+++ w/include/linux/perf_event.h
@@ -762,10 +762,7 @@ struct perf_event_context {
        struct mutex                    mutex;

        struct list_head                pmu_ctx_list;
-
-       struct perf_event_groups        pinned_groups;
-       struct perf_event_groups        flexible_groups;
-       struct list_head                event_list;
+       struct perf_event_pmu_context   sw_ctx;

        int                             nr_events;
        int                             nr_active;
@@ -806,7 +803,7 @@ struct perf_event_context {
 #define PERF_NR_CONTEXTS       4

 struct perf_cpu_pmu_context {
-       struct perf_event_pmu_context   epc;
+       struct perf_event_pmu_context   *epc;  /* I am still debating this one */
        struct perf_event_pmu_context   *task_epc;

        struct list_head                sched_cb_entry;
@@ -827,6 +824,7 @@ struct perf_cpu_pmu_context {
 struct perf_cpu_context {
        struct perf_event_context       ctx;
        struct perf_event_context       *task_ctx;
+       struct list_head                list_of_perf_cpu_pmu_context; /* may be removed? */

 #ifdef CONFIG_CGROUP_PERF
        struct perf_cgroup              *cgrp;
@@ -834,6 +832,10 @@ struct perf_cpu_context {
 #endif

        int                             online;
+
+       struct perf_event_groups        pinned_groups;
+       struct perf_event_groups        flexible_groups;
+       struct list_head                event_list;
 };

 struct perf_output_handle {


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ