linux-kernel - Re: [PATCH] perf_counter: Prevent oopses from per-cpu software counters

Message-ID: <20090205142229.GB28443@elte.hu>
Date:	Thu, 5 Feb 2009 15:22:29 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Paul Mackerras <paulus@...ba.org>
Cc:	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] perf_counter: Prevent oopses from per-cpu software
	counters


* Paul Mackerras <paulus@...ba.org> wrote:

> Impact: oops fix
> 
> Yanmin Zhang reported that using a PERF_COUNT_TASK_CLOCK software
> counter as a per-cpu counter would reliably crash the system, because
> it calls __task_delta_exec with a null pointer.  And indeed, a "task
> clock" counter only makes sense as a per-task counter.  Similarly,
> counting page faults, context switches or cpu migrations only makes
> sense for a per-task counter.
> 
> This fixes the problem by disallowing the use of the task clock,
> page fault, context switch and cpu migration software counters as
> per-cpu counters, since they all require a task context to obtain their
> data.  The only software counter that can be used as a per-cpu counter
> is the cpu clock counter (PERF_COUNT_CPU_CLOCK).
> 
> In order for sw_perf_counter_init to be able to tell whether we are
> setting up a per-task or a per-cpu counter, this arranges for counter->ctx
> to be initialized earlier, in perf_counter_alloc.
> 
> The other minor change this makes is to ensure that if sw_perf_counter_init
> fails, we don't try to initialize the counter as a hardware counter.
> Since the user has passed a negative event type (and it isn't raw), they
> clearly don't intend it to be interpreted as a hardware event.  This
> matters now that sw_perf_counter_init can fail for valid software event
> types (because of the check that the counter is a per-task counter).
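For illustration only, the check described in the quoted text could be sketched roughly as below. This is not the patch itself: every sketch_* name is a hypothetical stand-in, and the sketch assumes that a per-cpu context is recognizable by a NULL task pointer and that counter->ctx is already set when the init function runs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the software event types named in the text. */
enum sketch_sw_event {
	SKETCH_CPU_CLOCK,
	SKETCH_TASK_CLOCK,
	SKETCH_PAGE_FAULTS,
	SKETCH_CONTEXT_SWITCHES,
	SKETCH_CPU_MIGRATIONS,
};

struct sketch_task {
	int pid;
};

struct sketch_ctx {
	struct sketch_task *task;	/* assumption: NULL means per-cpu context */
};

struct sketch_counter {
	enum sketch_sw_event type;
	struct sketch_ctx *ctx;		/* assumed to be set before init is called */
};

/*
 * Returns 0 if the counter may be set up, -EINVAL if a task-only
 * software event is requested for a per-cpu context.
 */
int sketch_sw_counter_init(const struct sketch_counter *counter)
{
	int per_cpu;

	assert(counter != NULL && counter->ctx != NULL);
	per_cpu = (counter->ctx->task == NULL);

	switch (counter->type) {
	case SKETCH_CPU_CLOCK:
		/* Needs no task context, so it is fine either way. */
		return 0;
	case SKETCH_TASK_CLOCK:
	case SKETCH_PAGE_FAULTS:
	case SKETCH_CONTEXT_SWITCHES:
	case SKETCH_CPU_MIGRATIONS:
		/* These read their data from a task, so refuse per-cpu use. */
		return per_cpu ? -EINVAL : 0;
	}
	return -EINVAL;	/* unknown event type */
}
```

With this shape, a failed init is reported to the caller as an error instead of falling through to a hardware-counter setup path.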

Hm, I don't really think that the notion that it should not be possible to 
use sw counters on a per CPU basis is valid.

You are right that "pagefaults" and "context switches" do get generated by 
tasks - but there is a per cpu and system wide notion of 'number of 
pagefaults', and people might be interested in monitoring that.

The existence and widespread use of "vmstat", and its display of a 
system-wide count of "context switches" (and administrators' reliance on 
those counts to judge a workload), is I think ample proof that it makes 
sense to have those counters on a per CPU basis too.

So how about fixing these sw counts to properly work as percpu counters too? 
Or am I missing something subtle that makes that impossible?

	Ingo
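As a rough user-space sketch of the idea raised in the reply (not kernel code, and not anything proposed verbatim in this thread): if each event is accounted both to the task and to the CPU it occurred on, the same counter type can be read per task, per CPU, or summed system-wide in the way vmstat reports it. All sketch_* names and the fixed CPU count are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 4	/* assumption: small fixed CPU count */

struct sketch_task_stats {
	uint64_t nr_switches;	/* context switches charged to this task */
};

/* One slot per CPU; a kernel would use real per-cpu storage instead. */
static uint64_t sketch_cpu_switches[SKETCH_NR_CPUS];

/* Charge one context switch to both the task and the CPU it ran on. */
void sketch_account_switch(struct sketch_task_stats *task, int cpu)
{
	assert(task != NULL);
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	task->nr_switches++;
	sketch_cpu_switches[cpu]++;
}

/*
 * Read the counter. A non-NULL task means a per-task counter; a NULL
 * task means a per-cpu counter, so no task pointer is ever dereferenced.
 */
uint64_t sketch_read_switches(const struct sketch_task_stats *task, int cpu)
{
	if (task != NULL)
		return task->nr_switches;
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return sketch_cpu_switches[cpu];
}

/* System-wide view: the sum over all CPUs. */
uint64_t sketch_system_switches(void)
{
	uint64_t sum = 0;
	int cpu;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sum += sketch_cpu_switches[cpu];
	return sum;
}
```

The point of the split is that the per-cpu read path never needs a task context, which is the condition that caused the reported oops.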