linux-kernel - Re: [numbers] perfmon/pfmon overhead of 17%-94%

Date:	Fri, 3 Jul 2009 09:58:22 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Vince Weaver <vince@...ter.net>
Cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Paul Mackerras <paulus@...ba.org>,
	linux-kernel@...r.kernel.org, Mike Galbraith <efault@....de>
Subject: Re: [numbers] perfmon/pfmon overhead of 17%-94%

* Vince Weaver <vince@...ter.net> wrote:

> On Mon, 29 Jun 2009, Ingo Molnar wrote:
>>
>> * Vince Weaver <vince@...ter.net> wrote:
>>
>>>> If the 5 thousand cycles measurement overhead _still_ matters to
>>>> you under such circumstances then by all means please submit the
>>>> patches to improve it. Despite your claims this is totally
>>>> fixable with the current perfcounters design, Peter outlined the
>>>> steps of how to solve it, you can utilize ptrace if you want to.
>>>
>>> Is it really "totally" fixable?  I don't just mean getting the
>>> overhead from ~3000 down to ~100, I mean down to zero.
>>
>> The thing is, not even pfmon gets it down to zero:
>>
>>  pfmon -e INSTRUCTIONS_RETIRED --follow-fork --aggregate-results ~/million
>>  1000001 INSTRUCTIONS_RETIRED
>>
>> So ... do you take the hardliner purist view and consider it crap
>> due to that imprecision, or do you take the pragmatist view of also
>> considering the relative relevance of any imperfection? ;-)
>
> as I said in a previous post, on most x86 chips the 
> instructions_retired counter also includes any hardware interrupts 
> that occur during the process runtime.  So any clock interrupts, 
> etc, show up as an extra instruction.  So on the "million" 
> benchmark, it's usually +/- 2 extra instructions.

Yeah. But it has nothing to do with the function you are measuring, 
right?

My general point is really that what matters is the statistical 
validity of the end result. I don't think you ever disagreed with 
that point - you just seem to have a lower noise acceptance 
threshold ;-)
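
To put the noise discussed above in proportion, here is a minimal sketch that expresses the counting error as a fraction of the expected count. It assumes the "million" benchmark retires exactly 1,000,000 instructions (the thread implies this but does not state it); relative_error() and the constant name are made up for illustration.

```python
def relative_error(measured: int, expected: int) -> float:
    """Return the counting error as a fraction of the expected count."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return abs(measured - expected) / expected


# Assumption: the "million" benchmark retires exactly 1,000,000 instructions.
EXPECTED = 1_000_000

# Figures quoted in this thread: pfmon reported 1000001, and hardware
# interrupts usually add around 2 extra instructions.
pfmon_error = relative_error(1_000_001, EXPECTED)
irq_error = relative_error(EXPECTED + 2, EXPECTED)

print(f"pfmon run:     {pfmon_error * 1e6:.1f} ppm")
print(f"+2 interrupts: {irq_error * 1e6:.1f} ppm")
```

Under that assumption both errors are a few parts per million, so whether they matter depends on the precision the measurement actually needs.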

> It looks like support might be added to perfcounters to track 
> these hardware interrupt stats per-process, which would be great, 
> as it's been really hard to quantify that currently.

Yeah. There's a patch-set in the works that attempts to do something 
in this area - see these mails on lkml:

    perf_counter: Add Generalized Hardware interrupt support

Right now they are just convenience wrappers around CPU model 
specific hw events - but we could extend the whole thing with 
software counters as well and isolate per IRQ vector events and 
counts, by adding a callback to do_IRQ().

That would give a mixture of hardware and software counter based IRQ 
instrumentation features that looks quite compelling. Any comments 
on what features/capabilities you'd like to see in this area?
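
The callback idea can be modelled in user space as a rough sketch. This is not kernel code and not the patch-set's design: IrqDispatcher stands in for do_IRQ(), and every name and vector number here is hypothetical. It only shows the shape of a software counter that isolates counts per IRQ vector.

```python
from collections import Counter
from typing import Callable, List

# Hypothetical user-space model; none of these names are kernel APIs.
IrqCallback = Callable[[int], None]


class IrqDispatcher:
    """Stands in for do_IRQ(): runs registered callbacks for each interrupt."""

    def __init__(self) -> None:
        self._callbacks: List[IrqCallback] = []

    def register_callback(self, cb: IrqCallback) -> None:
        self._callbacks.append(cb)

    def do_irq(self, vector: int) -> None:
        # Instrumentation callbacks run first; a real handler would then
        # service the device that raised the interrupt.
        for cb in self._callbacks:
            cb(vector)


class PerVectorIrqCounter:
    """Software counter that keeps one count per IRQ vector."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def __call__(self, vector: int) -> None:
        self.counts[vector] += 1


dispatcher = IrqDispatcher()
counter = PerVectorIrqCounter()
dispatcher.register_callback(counter)

# Simulate a few interrupts on arbitrary vector numbers.
for vector in (0, 0, 1, 0, 14):
    dispatcher.do_irq(vector)

print(dict(counter.counts))
```

Keeping the counter behind a callback means the dispatch path only pays for the instrumentation when something has registered for it.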

> In any case, it looks like the changes to make perf have lower 
> overhead have been merged, which makes me happy.  Thank you.

You are welcome :)

Btw., perfcounters still has no support for older Intel CPUs such as 
P3's and P2's - and they have pretty sane PMUs - so if you have such 
a machine (which your perfmon contribution suggests you might 
have/had) and are interested, it would be nice to get support for 
them. P4 support is interesting too but more challenging.

	Ingo