Message-ID: <7c86c4470812072318n33e27045k50490f180ecfd8c0@mail.gmail.com>
Date:	Mon, 8 Dec 2008 08:18:51 +0100
From:	"stephane eranian" <eranian@...glemail.com>
To:	"Paul Mackerras" <paulus@...ba.org>
Cc:	"Peter Zijlstra" <a.p.zijlstra@...llo.nl>,
	"Ingo Molnar" <mingo@...e.hu>,
	"Thomas Gleixner" <tglx@...utronix.de>,
	LKML <linux-kernel@...r.kernel.org>, linux-arch@...r.kernel.org,
	"Andrew Morton" <akpm@...ux-foundation.org>,
	"Eric Dumazet" <dada1@...mosbay.com>,
	"Robert Richter" <robert.richter@....com>,
	"Arjan van de Veen" <arjan@...radead.org>,
	"Peter Anvin" <hpa@...or.com>,
	"Steven Rostedt" <rostedt@...dmis.org>,
	"David Miller" <davem@...emloft.net>
Subject: Re: [patch 0/3] [Announcement] Performance Counters for Linux

Hi,

On Sun, Dec 7, 2008 at 6:15 AM, Paul Mackerras <paulus@...ba.org> wrote:
> Peter Zijlstra writes:
>
>> On Sat, 2008-12-06 at 11:05 +1100, Paul Mackerras wrote:
>> > Now, the tables in perfmon's user-land libpfm that describe the
>> > mapping from abstract events to event-selector values and the
>> > constraints on what events can be counted together come to nearly
>> > 29,000 lines of code just for the IBM 64-bit powerpc processors.
>> >
>> > Your API condemns us to adding all that bloat to the kernel, plus the
>> > code to use those tables.
>>
>> Since you need those tables and that code anyway, and in a solid
>> reliable way, what is the objection of carrying it in the kernel?
>
> Because it's about 320kB of unpageable kernel memory, and it doesn't
> need to be in the kernel.
>

That inevitably pulls in large amounts of data: the event table for each PMU
model and the description of the constraints between events. New processors
have hundreds of events. On top of that, there is the complexity of the
assignment algorithm that maps events to counters such that they actually
measure what you asked for. I described some of those constraints in my
previous message. They are not trivial and are often multi-dimensional.
Getting the algorithms right is difficult. Event tables are also often
incomplete or wrong when first published by HW vendors.
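
To make the assignment problem concrete, here is a minimal sketch in Python.
It assumes the simplest kind of constraint only: each event may be counted
on a fixed subset of counters. The event names, the 4-counter PMU and the
function assign_counters() are hypothetical and are not taken from libpfm
or from any real event table. Real constraints are multi-dimensional, so
this covers one dimension at best.

```python
from typing import Dict, List, Optional, Set


def assign_counters(events: List[str],
                    allowed: Dict[str, Set[int]]) -> Optional[Dict[str, int]]:
    """Backtracking search: give every event a distinct counter it may use.

    Returns a mapping event -> counter index, or None if no valid
    assignment exists.
    """
    # Place the most constrained events first so dead ends show up early.
    order = sorted(events, key=lambda e: len(allowed[e]))
    assignment: Dict[str, int] = {}
    used: Set[int] = set()

    def place(i: int) -> bool:
        if i == len(order):
            return True
        event = order[i]
        for counter in sorted(allowed[event]):
            if counter in used:
                continue
            assignment[event] = counter
            used.add(counter)
            if place(i + 1):
                return True
            # Undo and try the next counter for this event.
            del assignment[event]
            used.discard(counter)
        return False

    return dict(assignment) if place(0) else None


# Hypothetical constraint table for a made-up PMU with counters 0..3.
ALLOWED = {
    "CYCLES": {0, 1, 2, 3},
    "INSTRUCTIONS": {0, 1, 2, 3},
    "L2_MISS": {0, 1},
    "FP_OPS": {0},
}

if __name__ == "__main__":
    print(assign_counters(["CYCLES", "INSTRUCTIONS", "L2_MISS", "FP_OPS"],
                          ALLOWED))
```

A naive first-fit pass in request order would put CYCLES on counter 0 and
then fail to place FP_OPS, even though a valid assignment exists. That is
the kind of detail that makes these routines hard to get right.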

It does not make sense to have this kind of data + code in the kernel. It would
make developing them much more difficult. Maintenance would also be more
difficult. And clearly you don't want to have to re-run the assignment routine
each time you context switch.
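
The context switch point can be sketched as follows, again in Python and
purely as an illustration. The assumption is that the assignment has been
computed once, ahead of time, and reduced to plain register values, so the
switch path only saves and restores them. FakePmu, PmuContext, switch_in()
and switch_out() are hypothetical names, and the event-select values are
made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PmuContext:
    """Per-thread PMU state, computed once before the thread is monitored."""
    config: Dict[int, int]  # counter index -> event-select value
    counts: Dict[int, int] = field(default_factory=dict)  # saved counts


class FakePmu:
    """Stand-in for the hardware registers (hypothetical)."""

    def __init__(self, num_counters: int) -> None:
        self.select: List[int] = [0] * num_counters
        self.count: List[int] = [0] * num_counters


def switch_out(pmu: FakePmu, ctx: PmuContext) -> None:
    """Save the outgoing thread's counts and stop its counters."""
    for idx in ctx.config:
        ctx.counts[idx] = pmu.count[idx]
        pmu.select[idx] = 0


def switch_in(pmu: FakePmu, ctx: PmuContext) -> None:
    """Reload precomputed register values; no assignment logic runs here."""
    for idx, sel in ctx.config.items():
        pmu.select[idx] = sel
        pmu.count[idx] = ctx.counts.get(idx, 0)
```

The point of the split is that the expensive, table-driven work happens
once, and the per-switch cost is a handful of register writes.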

The kernel is not the only place for rock-solid code. You can have solid/stable
code in libraries as well.

> The fundamental problem with Ingo and Thomas's proposal is that the
> abstraction is at the wrong level.  It makes individual counters the
> central idea, when the central idea should be a set of counters that
> all start and stop counting at the same times.  People doing
> performance analysis want to be able to compare counts of different
> events and get ratios, and you can't do that meaningfully if the
> counts correspond to different stretches of code.
>
> Once you make the abstraction a set of counters, then you also make it
> possible to have a counter-set that is the whole PMU.  Then you don't
> have to have the kernel knowing all the possible settings for the PMU;
> it only needs to know the simple ones, and if you want to do something
> more sophisticated, you can have userspace specifying the bits to
> select the more sophisticated setting.
>
>> Furthermore, is there a good technical reason these cpus are so
>> complicated to use?
>
> That question is a bit ambiguous.  If you mean, why did the hardware
> designers make it so complex? then I don't really know, but it doesn't
> matter because the CPU hardware is what it is.  At best I might be
> able to influence future designs to be a bit simpler.
>

Let me explain the HW complexity a bit. It's all a matter of tradeoffs.
I have regular discussions with the PMU design architects about this.
If you talk to them, you understand the environment they have to live in
and why those constraints are there. The key point is that the PMU is never
critical to the chip. The chip can work well without it. Real estate on the
chip is always very tight. The PMU is a second-class citizen, and thus low on
the priority list. For certain PMU features the tradeoff is: do you want the
feature with constraints on programming, or no feature at all? The common HW
limitation is wires. For instance, I was once asked: would you rather have a
PMU cache event that can be programmed on any counter but with an increased
cache latency for all accesses, or a faster cache and a constraint on the
event? The answer is obvious.

I think you now understand why there are constraints and also why they will
never go away, quite the contrary. I'd rather have a PMU with constraints than
no PMU. Hardware designers put a lot of effort into giving us what we already
have today, and we should be thankful.

> If you mean, could the software description of the hardware be
> simpler? then maybe - I'm just reading up on the details of the
> hardware, and it is pretty complex, with multiple layers of
> multiplexers and event buses.
>
