lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <1459215203-12885-1-git-send-email-eranian@google.com>
Date:	Tue, 29 Mar 2016 03:33:23 +0200
From:	Stephane Eranian <eranian@...gle.com>
To:	linux-kernel@...r.kernel.org
Cc:	acme@...hat.com, peterz@...radead.org, mingo@...e.hu,
	ak@...ux.intel.com, kan.liang@...el.com, jolsa@...hat.com,
	namhyung@...nel.org, vincent.weaver@...ne.edu
Subject: [PATCH] perf/core: Fix time tracking bug with multiplexing

This patch fixes a bug introduced by:

commit 3cbaa59069677920186dcf502632ca1df4329f80
Author: Peter Zijlstra <peterz@...radead.org>
Date:   Wed Feb 24 18:45:47 2016 +0100

    perf: Fix ctx time tracking by introducing EVENT_TIME

This patch introduce a bug in the time tracking of events when multiplexing is used.

The issue is easily reproducible with the following perf run:

 $ perf stat -a -C 0 -e branches,branches,branches,branches,branches -I 1000
     1.000730239            652,394      branches   (66.41%)
     1.000730239            597,809      branches   (66.41%)
     1.000730239            593,870      branches   (66.63%)
     1.000730239            651,440      branches   (67.03%)
     1.000730239            656,725      branches   (66.96%)
     1.000730239      <not counted>      branches

One branches event is shown as not having run. Yet, with multiplexing, all events
should run especially with a 1s (-I 1000) interval. The delta for time_running
comes out to 0. Yet, the event has run because the kernel is actually multiplexing
the events. The problem is that the time tracking is the kernel and especially in
ctx_sched_out() is wrong now.

The problem is that in case that the kernel enters ctx_sched_out() with the
following state:
   ctx->is_active=0x7 event_type=0x1
   Call Trace:
    [<ffffffff813ddd41>] dump_stack+0x63/0x82
    [<ffffffff81182bdc>] ctx_sched_out+0x2bc/0x2d0
    [<ffffffff81183896>] perf_mux_hrtimer_handler+0xf6/0x2c0
    [<ffffffff811837a0>] ? __perf_install_in_context+0x130/0x130
    [<ffffffff810f5818>] __hrtimer_run_queues+0xf8/0x2f0
    [<ffffffff810f6097>] hrtimer_interrupt+0xb7/0x1d0
    [<ffffffff810509a8>] local_apic_timer_interrupt+0x38/0x60
    [<ffffffff8175ca9d>] smp_apic_timer_interrupt+0x3d/0x50
    [<ffffffff8175ac7c>] apic_timer_interrupt+0x8c/0xa0

In that case, the test:
	if (is_active & EVENT_TIME)

will be false and the time will not be updated. Time must always be updated on
sched out. This patch fixes the problem.

Patch is relative to PeterZ queue.tip perf/core branch.

Signed-off-by: Stephane Eranian <eranian@...gle.com>
---
 kernel/events/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 339aa46..4ba41f6 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2447,7 +2447,7 @@ static void ctx_sched_out(struct perf_event_context *ctx,
 
 	is_active ^= ctx->is_active; /* changed bits */
 
-	if (is_active & EVENT_TIME) {
+	if (ctx->is_active & EVENT_TIME) {
 		/* update (and stop) ctx time */
 		update_context_time(ctx);
 		update_cgrp_time_from_cpuctx(cpuctx);
-- 
2.5.0

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ