Message-ID: <20090205142229.GB28443@elte.hu>
Date: Thu, 5 Feb 2009 15:22:29 +0100
From: Ingo Molnar <mingo@...e.hu>
To: Paul Mackerras <paulus@...ba.org>
Cc: "Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] perf_counter: Prevent oopses from per-cpu software
counters
* Paul Mackerras <paulus@...ba.org> wrote:
> Impact: oops fix
>
> Yanmin Zhang reported that using a PERF_COUNT_TASK_CLOCK software
> counter as a per-cpu counter would reliably crash the system, because
> it calls __task_delta_exec with a null pointer. And indeed, a "task
> clock" counter only makes sense as a per-task counter. Similarly,
> counting page faults, context switches or cpu migrations only makes
> sense for a per-task counter.
>
> This fixes the problem by disallowing the use of the task clock,
> page fault, context switch and cpu migration software counters as
> per-cpu counters, since they all require a task context to obtain their
> data. The only software counter that can be used as a per-cpu counter
> is the cpu clock counter (PERF_COUNT_CPU_CLOCK).
>
> In order for sw_perf_counter_init to be able to tell whether we are
> setting up a per-task or a per-cpu counter, this arranges for counter->ctx
> to be initialized earlier, in perf_counter_alloc.
>
> The other minor change this makes is to ensure that if sw_perf_counter_init
> fails, we don't try to initialize the counter as a hardware counter.
> Since the user has passed a negative event type (and it isn't raw), they
> clearly don't intend it to be interpreted as a hardware event. This
> matters now that sw_perf_counter_init can fail for valid software event
> types (because of the check that the counter is a per-task counter).
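For reference, the check described in the quoted patch can be sketched in
userspace C roughly as below. This is only an illustrative model, not the
kernel code: the struct layouts, the SW_* names and their values, and the
counter_alloc()/sw_counter_init() signatures are all assumptions made for
the example. It shows the two points from the description: the context is
attached before the software init runs, and a negative (software) event type
never falls through to hardware initialization.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical software event types; negative, as in the quoted text.
 * The names and exact values are assumptions for this sketch. */
enum sw_event {
    SW_CPU_CLOCK        = -1,
    SW_TASK_CLOCK       = -2,
    SW_PAGE_FAULTS      = -3,
    SW_CONTEXT_SWITCHES = -4,
    SW_CPU_MIGRATIONS   = -5,
};

struct task { int pid; };

/* A context with task == NULL models a per-cpu counter context. */
struct counter_ctx { struct task *task; };

struct counter {
    int type;
    struct counter_ctx *ctx;
};

/* Returns 0 on success, -1 if the event cannot be set up in this context. */
static int sw_counter_init(const struct counter *c)
{
    assert(c->ctx != NULL);   /* ctx must already be attached */
    int per_cpu = (c->ctx->task == NULL);

    switch (c->type) {
    case SW_CPU_CLOCK:
        return 0;             /* valid per-task and per-cpu */
    case SW_TASK_CLOCK:
    case SW_PAGE_FAULTS:
    case SW_CONTEXT_SWITCHES:
    case SW_CPU_MIGRATIONS:
        return per_cpu ? -1 : 0;   /* these need a task context */
    default:
        return -1;            /* unknown software event */
    }
}

static int counter_alloc(int type, struct counter_ctx *ctx, struct counter *out)
{
    out->type = type;
    out->ctx = ctx;           /* set early so the init can inspect it */

    if (type < 0)
        return sw_counter_init(out);   /* no fall-through to hardware init */

    return 0;                 /* hardware path elided in this sketch */
}
```

With a context whose task pointer is NULL, counter_alloc(SW_TASK_CLOCK, ...)
is rejected in this model, while SW_CPU_CLOCK is still accepted.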
Hm, i don't really think the notion that it should not be possible to
use sw counters on a per CPU basis is valid.

You are right that "pagefaults" and "context switches" do get generated by
tasks - but there is a per cpu and system wide notion of 'number of
pagefaults', and people might be interested in monitoring that.

The existence and widespread use of "vmstat", and its display of the
system-wide count of "context switches" (and administrators' reliance on
judging a workload based on those counts) is i think ample proof that it
makes sense to have those counters on a per CPU basis too.

So how about fixing these sw counts to properly work as percpu counters too?
Or am i missing something subtle that makes that impossible?
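
To illustrate the per cpu and system wide notion in userspace C: the sketch
below attributes each context switch to the CPU it happens on, so no task
pointer is needed to read a per-cpu count, and the system-wide figure (the
kind of number vmstat displays) is the sum over CPUs. NR_CPUS, the
cpu_stats struct and the function names are hypothetical and chosen only for
this example; this is not how the kernel implements its counters.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4   /* assumed CPU count for the sketch */

/* Hypothetical per-cpu event tallies. */
struct cpu_stats {
    unsigned long context_switches;
};

static struct cpu_stats per_cpu[NR_CPUS];

/* Charge one context switch to the CPU on which it occurred. */
static void account_context_switch(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    per_cpu[cpu].context_switches++;
}

/* Per-cpu view: readable without any task context. */
static unsigned long cpu_context_switches(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return per_cpu[cpu].context_switches;
}

/* System-wide view: sum of all per-cpu counts. */
static unsigned long system_context_switches(void)
{
    unsigned long total = 0;

    for (size_t i = 0; i < NR_CPUS; i++)
        total += per_cpu[i].context_switches;
    return total;
}
```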
Ingo