Date:	Fri, 10 Jul 2015 11:59:51 -0700
From:	Stephane Eranian <eranian@...gle.com>
To:	Ingo Molnar <mingo@...nel.org>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Andy Lutomirski <luto@...capital.net>,
	Vince Weaver <vincent.weaver@...ne.edu>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...ux.intel.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Jiri Olsa <jolsa@...hat.com>, Borislav Petkov <bp@...e.de>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Andi Kleen <ak@...ux.intel.com>
Subject: Re: [RFC PATCH] perf: Provide status of known PMUs

Hi,

On Fri, Jul 10, 2015 at 1:35 AM, Ingo Molnar <mingo@...nel.org> wrote:
>
> * Peter Zijlstra <peterz@...radead.org> wrote:
>
>> On Thu, Jul 09, 2015 at 02:32:05PM +0200, Ingo Molnar wrote:
>> >
>> >    perf record error: The 'bts' PMU is not available, because the CPU does not support it
>>
>> This one makes sense.
>>
>> >    perf record error: The 'bts' PMU is not available, because this architecture does not support it
>> >    perf record error: The 'bts' PMU is not available, because its driver is not built into the kernel
>> >
>> > Because if it's the wrong architecture or CPU, I look for a box with the right
>> > one, if it's simply the kernel not having the necessary PMU driver then I'll boot
>> > a kernel with it enabled.
>>
>> These not so much; why won't a generic: "Unknown PMU, check arch/kernel" do?
>
> Yeah, I mean why not make the user's job harder if we can? We really don't want to
> solve this problem technically and we _really_ want tooling to be fundamentally
> unhelpful, right? ;-)
>
> I realize that the 'Error: there was a bug, aborting' style of sado-masochistic
> error messages are the current Linux tooling status quo, which opaque error
> feedback comes from an early technological mistake of Unix system calls screwing
> up error handling, and I also see that after decades of abuse people are showing
> signs of the Stockholm Syndrome related to this problem, but it _really_ does not
> have to be so ...
>
> Whenever we can we should change such bad patterns.
>
>> The thing is, I hate that hard-coded list, its pain I don't need.
>
> Absolutely! I pointed this out during review as well.
>
> It does not impact the core concept though: we should have a single numeric error,
> and free form error strings provided by the place that first triggers some
> problem. That should be both programmatically easy to handle and maximally
> informative to the users.
>
> At least half of a tool's usability comes not from how it behaves when it works,
> but how it behaves when it does not. (SystemD, I'm looking at you.)
>
This patch looks useful, but it does not address a related issue. Here you are
reporting on the status of specific PMU support, i.e., the PMU is not supported
by the hardware. But there is another problem that I have run into very often on
ARM (on Tegra, for example), and it really annoys me. The PMU hardware is
present, but the instance of the PMU on a given CPU is not, simply because the
CPU is hotpluggable and it is offline at the time the tool (perf) starts. I am
not talking about explicit hotplugging by the user, but about hotplugging done
by the kernel. Then, during the run, the kernel plugs the CPU back in to handle
the load. Perf misses monitoring that CPU completely, so it does not measure
what is really going on.

I understand that reporting that a PMU instance is supported but offline does
not solve the entire problem. There needs to be some other kernel support. But I
think it would be good to have the tool at least issue a warning saying: "some
CPUs are offline, not monitoring all CPUs, results may be partial".
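
To make the suggested warning concrete, here is a rough user-space sketch of the
check, not perf code. It assumes the standard sysfs cpulist files
/sys/devices/system/cpu/present and /sys/devices/system/cpu/online; the function
names (parse_cpulist, offline_warning, read_cpulist) are hypothetical and chosen
only for illustration.

```python
import sys
from pathlib import Path

# Assumption: the usual sysfs location of the CPU topology files on Linux.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def parse_cpulist(text):
    """Parse a kernel cpulist string such as "0-3,6" into a set of CPU numbers."""
    cpus = set()
    for chunk in text.strip().split(","):
        if not chunk:
            continue  # an empty file means "no CPUs in this set"
        first, _, last = chunk.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus


def offline_warning(present, online):
    """Return a warning string if some present CPUs are offline, else None."""
    offline = sorted(present - online)
    if not offline:
        return None
    listed = ",".join(str(cpu) for cpu in offline)
    return ("warning: some CPUs are offline (%s), not monitoring all CPUs, "
            "results may be partial" % listed)


def read_cpulist(name):
    """Read one cpulist file from sysfs; return an empty set if unreadable."""
    try:
        return parse_cpulist((SYSFS_CPU / name).read_text())
    except OSError:
        return set()


if __name__ == "__main__":
    # Hypothetical usage: run the check once, before monitoring starts.
    message = offline_warning(read_cpulist("present"), read_cpulist("online"))
    if message:
        print(message, file=sys.stderr)
```

This is only a snapshot taken at startup. It tells the user that the results
may be partial, but it does nothing about a CPU that the kernel brings online
later in the run, which is the part that still needs kernel support.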