linux-kernel - Re: [RFC PATCH] perf_core: provide a kernel-internal interface to get to performance counters


Message-ID: <20091001085330.GC15345@elte.hu>
Date:	Thu, 1 Oct 2009 10:53:30 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	"K.Prasad" <prasad@...ux.vnet.ibm.com>
Cc:	Arjan van de Ven <arjan@...radead.org>,
	"Frank Ch. Eigler" <fche@...hat.com>, peterz@...radead.org,
	linux-kernel@...r.kernel.org,
	Frederic Weisbecker <fweisbec@...il.com>
Subject: Re: [RFC PATCH] perf_core: provide a kernel-internal interface to
	get to performance counters


* K.Prasad <prasad@...ux.vnet.ibm.com> wrote:

> On Thu, Oct 01, 2009 at 09:25:18AM +0200, Ingo Molnar wrote:
> > 
> > * Arjan van de Ven <arjan@...radead.org> wrote:
> > 
> > > On Sun, 27 Sep 2009 00:02:46 +0530
> > > "K.Prasad" <prasad@...ux.vnet.ibm.com> wrote:
> > > 
> > > > On Sat, Sep 26, 2009 at 12:03:28PM -0400, Frank Ch. Eigler wrote:
> > > 
> > > > > For what it's worth, this sort of thing also looks useful from 
> > > > > systemtap's point of view.
> > > > 
> > > > Wouldn't SystemTap be another user that desires support for 
> > > > multiple/all CPU perf-counters (apart from hw-breakpoints as a 
> > > > potential user)? As Arjan pointed out, perf's present design would 
> > > > support only a per-CPU or per-task counter; not both.
> > > 
> > > I'm sorry but I think I am missing your point. "all cpu counters" 
> > > would be one small helper wrapper away, a helper I'm sure the 
> > > SystemTap people are happy to submit as part of their patch series 
> > > when they submit SystemTap to the kernel.
> > 
> > Yes, and Frederic wrote that wrapper already for the hw-breakpoints 
> > patches. It's a non-issue and does not affect the design - we can always 
> > gang up an array of per cpu perf events, it's a straightforward use of 
> > the existing design.
> > 
> 
> Such a design (iteratively invoking a per-CPU perf event for all 
> desired CPUs) isn't without issues, some of which are noted here 
> (in addition to http://lkml.org/lkml/2009/9/14/298):
> 
> - It breaks the abstraction that a user of the exported interfaces would
>   enjoy w.r.t. having all CPU (or a cpumask of CPU) breakpoints.

CPU offlining/onlining support would be interesting to add.

> - (Un)Availability of debug registers on every requested CPU is not
>   known until request for that CPU fails. A failed request should be 
>   followed by a rollback of the partially successful requests.

Yes.
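
To illustrate the rollback described in the quoted point, here is a minimal
user-space sketch in C. It is not the hw-breakpoint or perf code: NR_CPUS,
NR_DEBUG_REGS, regs_in_use[], register_bp_on_cpu(), unregister_bp_on_cpu()
and register_bp_all_cpus() are all hypothetical names made up for this
example, and the per-CPU "request" is only a counter.

```c
#define NR_CPUS        4   /* hypothetical fixed CPU count for the sketch */
#define NR_DEBUG_REGS  4   /* assumed number of debug registers per CPU */

/* Hypothetical per-CPU bookkeeping: debug registers currently in use. */
static int regs_in_use[NR_CPUS];

/* Hypothetical per-CPU request: fails when no debug register is free. */
static int register_bp_on_cpu(int cpu)
{
	if (regs_in_use[cpu] >= NR_DEBUG_REGS)
		return -1;
	regs_in_use[cpu]++;
	return 0;
}

/* Undo one successful per-CPU request. */
static void unregister_bp_on_cpu(int cpu)
{
	regs_in_use[cpu]--;
}

/*
 * Request the breakpoint on every CPU in turn. Availability is only
 * known once a request fails, so on the first failure the CPUs that
 * already succeeded are rolled back in reverse order.
 */
static int register_bp_all_cpus(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (register_bp_on_cpu(cpu) < 0)
			goto rollback;
	}
	return 0;

rollback:
	while (--cpu >= 0)
		unregister_bp_on_cpu(cpu);
	return -1;
}
```

The caller sees either a breakpoint on all CPUs or none, which is the
all-or-nothing behaviour a wrapper would have to provide.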

> - Any breakpoint exceptions generated due to partially successful
>   requests (before a failed request is encountered) must be treated as 
>   'stray' and be ignored (by the end-user? or the wrapper code?).

Such non-atomicity is inherent in using more than one CPU and a disjoint 
set of hw-breakpoints. If the calling code cares, callbacks that trigger 
before the registration has returned can be ignored.
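
One way to ignore such early callbacks is a state flag that the handler
checks. The following is only a sketch under assumptions: struct wide_bp,
the BP_REGISTERING/BP_ACTIVE states and the three functions are hypothetical
names, and the hit counters are plain ints that a real multi-CPU handler
would need to protect.

```c
#include <stdatomic.h>

/* Hypothetical lifecycle states for a multi-CPU breakpoint. */
enum bp_state { BP_REGISTERING, BP_ACTIVE };

struct wide_bp {
	atomic_int state;  /* set to BP_ACTIVE once registration returned */
	int hits;          /* callbacks delivered to the user */
	int ignored;       /* stray callbacks seen during registration */
};

static void wide_bp_init(struct wide_bp *bp)
{
	atomic_init(&bp->state, BP_REGISTERING);
	bp->hits = 0;
	bp->ignored = 0;
}

/* Called after the request has succeeded on every CPU. */
static void wide_bp_activate(struct wide_bp *bp)
{
	atomic_store(&bp->state, BP_ACTIVE);
}

/* Per-CPU callback: drop anything that fires before activation. */
static void wide_bp_handler(struct wide_bp *bp)
{
	if (atomic_load(&bp->state) != BP_ACTIVE) {
		bp->ignored++;
		return;
	}
	bp->hits++;
}
```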

> - Any CPUs that become online eventually have to be trapped and
>   populated with the appropriate debug register value (not something 
>   that the end-user of breakpoints should be bothered with).
> 
> - Modifying the characteristics of a kernel breakpoint (including the
>   valid CPUs) will be equally painful.
> 
> - Races between the requests (also leading to temporary failure of
>   all CPU requests) presenting an unclear picture about free debug
>   registers (making it difficult to predict the need for a retry).
> 
> So we either have a perf event infrastructure that is cognisant of 
> many/all CPU counters, or make perf a user of the hw-breakpoints layer, 
> which already handles such requests in a deft manner (through 
> appropriate book-keeping).

Given that these are all still in the add-on category and do not affect 
the design, while the problems solved by perf events are definitely in 
the non-trivial category, I'd suggest you extend perf events with a 
'system wide' event abstraction, which:

 - Enumerates such registered events (via a list)

 - Adds a CPU hotplug handler (which clones those events over to a new
   CPU and directs it back to the ring-buffer of the existing event(s)
   [if any])

 - Plus a state field that allows the filtering out of stray/premature
   events.

Such an add-on layer/abstraction would surely be useful in other cases 
as well. It might make sense to expose it to user-space and make perf top 
use it by default.
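
A rough user-space sketch of the first two items in the list above (the
registry of system-wide events and the hotplug handler that clones them onto
a new CPU) might look like this. Every name here is an assumption made for
illustration: struct sys_event, struct ring_buffer, sys_event_register() and
sys_event_cpu_online() are not existing perf interfaces, and the per-CPU
"clone" is reduced to a flag. The state field for filtering stray events is
left out to keep the sketch short.

```c
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS 8                     /* hypothetical CPU count */

struct ring_buffer { int unused; };   /* placeholder for the output buffer */

/* Hypothetical system-wide event: one logical event, one clone per CPU. */
struct sys_event {
	int config;                   /* what to count; opaque here */
	struct ring_buffer *rb;       /* shared by all per-CPU clones */
	int active[NR_CPUS];          /* 1 if a clone exists on that CPU */
	struct sys_event *next;       /* link in the global registry */
};

/* Enumerates all registered system-wide events. */
static struct sys_event *sys_events;

/* Register an event on the CPUs that are online right now. */
static struct sys_event *sys_event_register(int config,
					    struct ring_buffer *rb,
					    const int *online)
{
	struct sys_event *ev = calloc(1, sizeof(*ev));
	int cpu;

	if (!ev)
		return NULL;
	ev->config = config;
	ev->rb = rb;
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		ev->active[cpu] = online[cpu] ? 1 : 0;
	ev->next = sys_events;
	sys_events = ev;
	return ev;
}

/*
 * Hotplug handler: when a CPU comes online, walk the registry and clone
 * every event onto it. The clone keeps pointing at the existing ring
 * buffer. Returns the number of events cloned.
 */
static int sys_event_cpu_online(int cpu)
{
	struct sys_event *ev;
	int cloned = 0;

	if (cpu < 0 || cpu >= NR_CPUS)
		return 0;
	for (ev = sys_events; ev; ev = ev->next) {
		if (!ev->active[cpu]) {
			ev->active[cpu] = 1;
			cloned++;
		}
	}
	return cloned;
}
```

The user of such a layer registers once and never sees the hotplug handling,
which is the abstraction asked for earlier in the thread.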

Thanks,

	Ingo