Date:	Wed, 24 Jun 2009 22:12:03 -0400 (EDT)
From:	Vince Weaver <vince@...ter.net>
To:	Ingo Molnar <mingo@...e.hu>
cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Paul Mackerras <paulus@...ba.org>, linux-kernel@...r.kernel.org
Subject: Re: performance counter 20% error finding retired instruction count

On Wed, 24 Jun 2009, Ingo Molnar wrote:
> * Vince Weaver <vince@...ter.net> wrote:
>
> Those ~2100 instructions are executed by your app: as the ELF
> dynamic loader starts up your test-app.
>
> If you have some tool that reports less than that then that tool is
> not being truthful about the true overhead of your application.

I wanted the instruction count of the application, not the loader.
If I had wanted the loader overhead too, I would have said so.  I don't 
think this has anything to do with tools being "less than truthful".  I 
notice that perf doesn't seem to include its own overhead in the count 
either.

> Also note that applications that only execute 1 million instructions
> are very, very rare - a modern CPU can execute billions of
> instructions, per second, per core.

Yes, I know that.

As I hope you know, the chip designers offer no guarantees with any of the 
performance counters.  So before you can use them, you have to validate 
them a bit to make sure they are returning expected results.  Hence the 
need for microbenchmarks, one of which I used as an example.
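
A minimal sketch of the kind of check this implies: compare a counter 
reading against the count a microbenchmark is known to execute.  The 
function names, the 1,000,000-instruction benchmark size, and the reading 
of 1,002,100 are illustrative assumptions (the reading simply adds the 
~2100 loader instructions quoted above to the benchmark size); they are 
not measurements.

```python
def relative_error(expected, measured):
    """Signed relative error of a counter reading against a known count."""
    return (measured - expected) / expected


def within_tolerance(expected, measured, tolerance):
    """True if the reading is within `tolerance` (a fraction) of expected."""
    return abs(relative_error(expected, measured)) <= tolerance


# Hypothetical figures: a benchmark known to retire 1,000,000
# instructions, and a reading that also includes ~2100 loader instructions.
expected = 1_000_000
measured = expected + 2_100

print(f"{relative_error(expected, measured):.4%}")   # 0.2100%
print(within_tolerance(expected, measured, 0.01))    # True
```

Whether a given tolerance is acceptable depends on what is being 
validated; the threshold above is arbitrary.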

You have to be careful with performance counters.  For example, on the 
Pentium 4, the retired instruction counter can be off by as much as 2% on 
some of the SPEC CPU2000 (spec2k) benchmarks because the "fldcw" 
instruction counts as two instructions instead of one.
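
To make the arithmetic concrete, here is a small sketch of how a 
double-counted instruction turns into a relative error.  The function 
names and the workload figures are made up for illustration; only the 
"counts as two" behavior and the 2% magnitude come from the text above.

```python
def measured_count(true_count, double_counted):
    """Counter reading when `double_counted` instructions each count twice.

    Each such instruction contributes exactly one extra event.
    """
    return true_count + double_counted


def overcount_error(true_count, double_counted):
    """Relative error of the reading against the true retired count."""
    reading = measured_count(true_count, double_counted)
    return (reading - true_count) / true_count


# Hypothetical workload: 2% of retired instructions are fldcw.
print(overcount_error(1_000_000_000, 20_000_000))   # 0.02

# Scaling the program up with the same instruction mix leaves the
# error unchanged.
print(overcount_error(10_000_000_000, 200_000_000))  # 0.02
```

The error here depends on the fraction of affected instructions, not on 
how long the program runs.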

This kind of difference is important when doing validation work, and can't 
just be swept under the rug with "if you use bigger programs it doesn't 
matter".

It's also nice to be able to skip the loader overhead, as the loader can 
change from system to system, which makes it hard to compare counter 
results across machines.  Though it sounds like the perf utility isn't 
going to support this anytime soon.

Vince