Message-ID: <98A6264C-B833-4930-95A0-2A3186519D87@fb.com>
Date:   Tue, 5 Nov 2019 17:11:08 +0000
From:   Song Liu <songliubraving@...com>
To:     Peter Zijlstra <peterz@...radead.org>
CC:     open list <linux-kernel@...r.kernel.org>,
        Kernel Team <Kernel-team@...com>,
        "acme@...nel.org" <acme@...nel.org>,
        "Arnaldo Carvalho de Melo" <acme@...hat.com>,
        Jiri Olsa <jolsa@...nel.org>,
        Alexey Budankov <alexey.budankov@...ux.intel.com>,
        Namhyung Kim <namhyung@...nel.org>, "Tejun Heo" <tj@...nel.org>
Subject: Re: [PATCH v6] perf: Sharing PMU counters across compatible events


Hi Peter, 

> On Oct 31, 2019, at 9:29 AM, Song Liu <songliubraving@...com> wrote:
> 
>> On Oct 31, 2019, at 5:43 AM, Peter Zijlstra <peterz@...radead.org> wrote:
>> 
>> On Wed, Sep 18, 2019 at 10:23:14PM -0700, Song Liu wrote:
>>> This patch tries to enable PMU sharing. To make perf event scheduling
>>> fast, we use special data structures.
>>> 
>>> An array of "struct perf_event_dup" is added to the perf_event_context,
>>> to remember all the duplicated events under this ctx. Each event
>>> under this ctx has a "dup_id" pointing to its perf_event_dup. Compatible
>>> events under the same ctx share the same perf_event_dup. The following
>>> figure shows a simplified version of the data structure.
>>> 
>>>     ctx ->  perf_event_dup -> master
>>>                    ^
>>>                    |
>>>        perf_event /|
>>>                    |
>>>        perf_event /
>>> 
>>> Connections between perf_event and perf_event_dup are built when events
>>> are added to or removed from the ctx. So these are not on the critical path of
>>> schedule or perf_rotate_context().
>>> 
>>> On the critical paths (add, del, read), sharing PMU counters doesn't
>>> increase the complexity. Helper functions event_pmu_[add|del|read]() are
>>> introduced to cover these cases. All these functions have O(1) time
>>> complexity.
>>> 
>>> We allocate a separate perf_event for perf_event_dup->master. This needs
>>> extra attention, because perf_event_alloc() may sleep. To allocate the
>>> master event properly, a new pointer, tmp_master, is added to perf_event.
>>> tmp_master carries a separate perf_event into list_[add|del]_event().
>>> The master event has valid ->ctx and holds ctx->refcount.
>> 
>> That is really nasty and expensive; it basically means every !sampling
>> event carries a double allocation.
>> 
>> Why can't we use one of the actual events as master?
> 
> I think we can use one of the events as master. We need to be careful when
> the master event is removed, but it should be doable. Let me try.

Actually, there is a bigger issue when we use one event as the master: what
shall we do if the master event is not running? Say it is a cgroup event,
and the cgroup is not running on this cpu. An extra master (and all these
array hacks) helps us keep O(1) complexity in such a scenario.

Do you have suggestions on how to solve this problem? Maybe we can keep the
extra master, and try to get rid of the double alloc?
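
To make the trade-off concrete, here is a rough userspace sketch in C of why
a dedicated master keeps add/del/read at O(1) even when one of the sharing
events (say, a cgroup event) is not running. It is not code from the patch:
struct hw_counter, struct event_dup, struct event, the sketch_*() helpers and
hw_tick() are hypothetical stand-ins, and the real
event_pmu_[add|del|read]() helpers operate on perf_event and the PMU, not on
plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a hardware counter owned by the master. */
struct hw_counter {
	uint64_t value;   /* what the "PMU" has counted so far */
	bool     running; /* is the counter currently programmed? */
};

/* Hypothetical stand-in for perf_event_dup with a dedicated master. */
struct event_dup {
	struct hw_counter master; /* never one of the user-visible events */
	int active;               /* number of events currently sharing it */
};

/* Hypothetical stand-in for a perf_event that shares a counter. */
struct event {
	struct event_dup *dup; /* reached in O(1), like dup_id in the patch */
	bool     active;
	uint64_t base;         /* master value at the last add/read */
	uint64_t count;        /* count accumulated by this event */
};

/* Simulate the hardware counting n occurrences. */
static void hw_tick(struct event_dup *d, uint64_t n)
{
	if (d->master.running)
		d->master.value += n;
}

/* O(1): start the master if needed, then snapshot its current value. */
static void sketch_add(struct event *e)
{
	struct event_dup *d = e->dup;

	assert(!e->active);
	if (d->active++ == 0)
		d->master.running = true;
	e->base = d->master.value;
	e->active = true;
}

/* O(1): fold the delta since the last snapshot into this event. */
static void sketch_read(struct event *e)
{
	if (e->active) {
		uint64_t now = e->dup->master.value;

		e->count += now - e->base;
		e->base = now;
	}
}

/* O(1): take a final reading, stop the master when nobody is left. */
static void sketch_del(struct event *e)
{
	struct event_dup *d = e->dup;

	assert(e->active);
	sketch_read(e);
	e->active = false;
	if (--d->active == 0)
		d->master.running = false;
}
```

In this sketch no event ever has to take over the master role: an event whose
cgroup is not running is simply never passed to sketch_add(), while the other
sharers keep counting against the same master. The cost is the extra
allocation for the master itself, which is the double alloc discussed above.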

Thanks,
Song

