linux-kernel - Re: [PATCH v2] perf/core: set cgroup in cpu contexts for new cgroup events

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CALcN6mhb0S+R5G1A5F4cqHeZK0cQGq7RDAdEM-J-xHvTF9xMBg@mail.gmail.com>
Date:	Wed, 3 Aug 2016 11:51:01 -0700
From:	David Carrillo-Cisneros <davidcc@...gle.com>
To:	Vegard Nossum <vegard.nossum@...il.com>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	"x86@...nel.org" <x86@...nel.org>, Ingo Molnar <mingo@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andi Kleen <ak@...ux.intel.com>,
	Kan Liang <kan.liang@...el.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Paul Turner <pjt@...gle.com>,
	Stephane Eranian <eranian@...gle.com>
Subject: Re: [PATCH v2] perf/core: set cgroup in cpu contexts for new cgroup events

Hi Vegard,

I don't think this patch fixes your bug, but it touches some code that
may be related.

David

On Wed, Aug 3, 2016 at 8:46 AM, Vegard Nossum <vegard.nossum@...il.com> wrote:
> Hi,
>
> On 2 August 2016 at 09:48, David Carrillo-Cisneros <davidcc@...gle.com> wrote:
>> There is an optimization in perf_cgroup_sched_{in,out} that skips the
>> switch of cgroup events if the old and new cgroups in a task switch are
>> the same. This optimization interacts with the current code in two ways
>> that cause a cpu context's cgroup (cpuctx->cgrp) to be NULL even if a
>> cgroup event matches the current task. These are:
>>
>>   1. On creation of the first cgroup event in a CPU: In current code,
>>   cpuctx->cpu is only set in perf_cgroup_sched_in, but due to the
>>   aforesaid optimization, perf_cgroup_sched_in will run until the next
>>   cgroup switches in that cpu. This may happen late or never happen,
>>   depending on system's number of cgroups, cpu load, etc.
>>
>>   2. On deletion of the last cgroup event in a cpuctx: In list_del_event,
>>   cpuctx->cgrp is set NULL. Any new cgroup event will not be sched in
>>   because cpuctx->cgrp == NULL until a cgroup switch occurs and
>>   perf_cgroup_sched_in is executed (updating cpuctx->cgrp).
>>
>> This patch fixes both problems by setting cpuctx->cgrp in list_add_event,
>> mirroring what list_del_event does when removing a cgroup event from CPU
>> context, as introduced in:
>> commit 68cacd29167b ("perf_events: Fix stale ->cgrp pointer in
>> update_cgrp_time_from_cpuctx()")
>>
>> With this patch, cpuctx->cgrp is always set/clear when installing/removing
>> the first/last cgroup event in/from the cpu context. With cpuctx->cgrp
>> correctly set, event_filter_match works as intended when events are
>> sched in/out.
>>
>> The problem is easy to observe in a machine with only one cgroup:
>>
>>   $ perf stat -e cycles -I 1000 -C 0 -G /
>>   #          time             counts unit events
>>       1.000161699      <not counted>      cycles                    /
>>       2.000355591      <not counted>      cycles                    /
>>       3.000565154      <not counted>      cycles                    /
>>       4.000951350      <not counted>      cycles                    /
>>
>> After the fix, the output is as expected:
>>
>>   $ perf stat -e cycles -I 1000 -a -G /
>>   #         time             counts unit events
>>      1.004699159          627342882      cycles                    /
>>      2.007397156          615272690      cycles                    /
>>      3.010019057          616726074      cycles                    /
>>
>> Rebased at peterz/queue/perf/core.
>>
>> Changes in v2:
>>   - Fix build error when no CONFIG_CGROUP_PERF.
>>   - Unify add and del cases into list_update_cgroup_event.
>>   - Remove cgroup exclusive variables from builds
>>   without CONFIG_CGROUP_PERF.
>>
>> Signed-off-by: David Carrillo-Cisneros <davidcc@...gle.com>
>
> Is this supposed to fix the bug I reported here?
>
> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1197757.html
>
> If so, you may want to add:
>
> 1) Fixes: f2fb6bef92514 ("perf/core: Optimize side-band event delivery")
> 2) Reported-by: Vegard Nossum <vegard.nossum@...cle.com>
> 3) a link to the thread
>
> and I can give it a test to see if it fixes the problem I was running into.
>
> If not, please ignore :-)
>
> Thanks,
>
>
> Vegard