Message-ID: <20090520171207.GA16706@elte.hu>
Date:	Wed, 20 May 2009 19:12:07 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Paul Mackerras <paulus@...ba.org>
Cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	linux-kernel@...r.kernel.org,
	Corey Ashford <cjashfor@...ux.vnet.ibm.com>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [RFC PATCH] perf_counter: dynamically allocate tasks'
	perf_counter_context struct


* Paul Mackerras <paulus@...ba.org> wrote:

> This replaces the struct perf_counter_context in the task_struct 
> with a pointer to a dynamically allocated perf_counter_context 
> struct.  The main reason for doing this is to allow us to 
> transfer a perf_counter_context from one task to another when we 
> do lazy PMU switching in a later patch.

Hm, i'm not sure how far this gets us towards lazy PMU switching.

In fact i'd say that the term "lazy PMU switching" is probably 
misleading; we should use "equivalent PMU context switching" 
instead.

The difference is really crucial. We cannot really detach a PMU 
context from a task, because the task might migrate to another CPU 
and run there. Any laziness in the switching of the PMU context 
would create the need to send IPIs and other overhead. For similar 
reasons, lazy FPU switching methods are generally not workable on 
SMP.

Instead, the right abstraction is to define 'equivalency' between 
tasks' PMU contexts, created by inheritance. When two tasks that 
both have the same parent counter(s) context-switch, we don't need 
to do _any_ physical PMU switching. The counts (and events) from 
one of the tasks can be freely transferred to the other task. They 
are going to get summed up in the parent anyway, so the result is 
invariant across the context switch.
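
A tiny user-space illustration of why the transfer is harmless, 
assuming the parent only ever looks at the sum of its children's 
counts. The names child_count, swap_counts() and parent_total() are 
made up for this sketch and are not kernel code:

```c
#include <stdint.h>

/* Hypothetical per-child count; stands in for an inherited counter. */
struct child_count {
	uint64_t count;
};

/* Hand the counts of one child to the other, as if their PMU
 * contexts had been exchanged instead of reprogrammed. */
static void swap_counts(struct child_count *a, struct child_count *b)
{
	uint64_t tmp = a->count;

	a->count = b->count;
	b->count = tmp;
}

/* What the parent ends up seeing: the sum over all children. */
static uint64_t parent_total(const struct child_count *kids, int n)
{
	uint64_t sum = 0;
	int i;

	for (i = 0; i < n; i++)
		sum += kids[i].count;
	return sum;
}
```

Whichever child holds which count, parent_total() returns the same 
value before and after swap_counts().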

To implement this, we need something like an 'ID', cookie or 
generation counter for the context, which changes to another unique 
number (or pointer) the moment a context is modified: a counter is 
added, removed or a counter attribute is changed. When counters are 
inherited the cookie gets carried over too. The context-switch code 
can then do this optimization:

	if (prev->ctx.cookie != next->ctx.cookie)
		switch_pmu_ctx(prev, next);

... which will be _very_ fast for the inherited counters (perf stat) 
case.
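
A minimal user-space model of the cookie scheme, to make the rules 
concrete. Everything here is an assumption made for illustration: 
struct pmu_ctx, ctx_add_counter(), ctx_inherit() and the 
nr_hw_switches statistic are invented names, and switch_pmu_ctx() is 
only a stand-in that counts how often the expensive path would run:

```c
#include <stdint.h>

/* Hypothetical model of a per-task PMU context. */
struct pmu_ctx {
	uint64_t cookie;	/* identifies the counter configuration */
	int nr_counters;
};

static uint64_t next_cookie = 1;	/* source of fresh cookies */
static int nr_hw_switches;		/* simulated physical PMU switches */

/* Any modification of the context gives it a fresh, unique cookie. */
static void ctx_add_counter(struct pmu_ctx *ctx)
{
	ctx->nr_counters++;
	ctx->cookie = next_cookie++;
}

/* Inheritance copies the configuration and carries the cookie over. */
static void ctx_inherit(struct pmu_ctx *child, const struct pmu_ctx *parent)
{
	*child = *parent;
}

/* Stand-in for the expensive physical PMU switch. */
static void switch_pmu_ctx(struct pmu_ctx *prev, struct pmu_ctx *next)
{
	(void)prev;
	(void)next;
	nr_hw_switches++;
}

/* Equal cookies mean equivalent contexts: skip the switch. */
static void context_switch(struct pmu_ctx *prev, struct pmu_ctx *next)
{
	if (prev->cookie != next->cookie)
		switch_pmu_ctx(prev, next);
}
```

In this model two children inherited from the same parent switch 
between each other for free, and the first modification of either 
child's context makes the cookies differ and forces a real switch 
again.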

Note, this does put a few requirements on the architecture code, 
and it requires a few changes to the sched-in/sched-out code and to 
the handling of tasks migrating to other CPUs.

For example the x86 code currently demuxes counter events back to 
counter pointers, using a per-cpu structure:

 struct cpu_hw_counters {
        struct perf_counter     *counters[X86_PMC_IDX_MAX];
        unsigned long           used_mask[BITS_TO_LONGS(X86_PMC_IDX_MAX)];
        unsigned long           active_mask[BITS_TO_LONGS(X86_PMC_IDX_MAX)];
        unsigned long           interrupts;
        int                     enabled;
 };

The counter pointers are per task - so this bit of cpu_hw_counters 
needs to move into the ctx structure, so that if an overflow IRQ 
comes in, we always only deal with local counters (not with some 
previous task's counter pointers).
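
One possible shape for that split, as a user-space sketch only. The 
names task_pmu_ctx, cpu_hw_state, sched_in() and handle_overflow(), 
as well as the PMC_IDX_MAX value, are hypothetical and not taken 
from the x86 code; the point is just that the overflow path looks 
counters up through whichever context is currently installed on the 
CPU:

```c
#include <stddef.h>

#define PMC_IDX_MAX 4	/* hypothetical; stands in for X86_PMC_IDX_MAX */

struct counter {
	unsigned long overflows;
};

/* Per-task: the counter pointers live in the context. */
struct task_pmu_ctx {
	struct counter *counters[PMC_IDX_MAX];
};

/* Per-cpu: only CPU-local state plus a pointer to the current ctx. */
struct cpu_hw_state {
	struct task_pmu_ctx *cur_ctx;
	unsigned long interrupts;
};

/* Install the incoming task's context on this CPU. */
static void sched_in(struct cpu_hw_state *cpu, struct task_pmu_ctx *ctx)
{
	cpu->cur_ctx = ctx;
}

/* Demux an overflow on hardware index idx to the current task's
 * counter; returns NULL if no counter is bound to that index. */
static struct counter *handle_overflow(struct cpu_hw_state *cpu, int idx)
{
	struct counter *c;

	cpu->interrupts++;
	if (!cpu->cur_ctx || idx < 0 || idx >= PMC_IDX_MAX)
		return NULL;

	c = cpu->cur_ctx->counters[idx];
	if (c)
		c->overflows++;
	return c;
}
```

With this layout an overflow that arrives after sched_in() of a new 
task can only ever reach that task's counters, even when the 
physical PMU state was left untouched by the switch.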

	Ingo