Message-ID: <Pine.LNX.4.64.0906240937120.10620@pianoman.cluster.toy>
Date:	Wed, 24 Jun 2009 09:59:54 -0400 (EDT)
From:	Vince Weaver <vince@...ter.net>
To:	linux-kernel@...r.kernel.org
Subject: performance counter 20% error finding retired instruction count
Hello
As an aside, is it time to set up a dedicated Performance Counters
for Linux mailing list?   (Hereafter referred to as p10c7l to avoid
confusion with the other implementations that have already taken
all the good abbreviated forms of the concept).  If/when the 
infrastructure appears in a released kernel, there's going to be a lot of
chatter from people who use performance counters and suddenly find
themselves stuck with a huge step backwards in functionality.  And asking Fortran
programmers to provide kernel patches probably won't be a productive 
response.  But I digress.
I was trying to get an exact retired instruction count from p10c7l.
I am using the test million.s, available here
  ( http://www.csl.cornell.edu/~vince/projects/perf_counter/million.s )
It should count exactly one million instructions.
Tests with valgrind and qemu show that it does.
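For reference, the idea behind the test can be sketched in C with x86
inline assembly (a sketch only, not the actual million.s: the compiler's
prologue/epilogue and especially libc startup retire extra instructions,
which is why the real test is a bare static assembly program):

    /* loop.c: retire a number of instructions known by construction.
     * Each iteration retires two instructions (dec + jnz), so the
     * loop alone contributes 2 * 500000 = 1000000 retired
     * instructions; everything outside the loop is overhead.
     */
    int main(void)
    {
            unsigned long i = 500000;

            __asm__ volatile(
                    "1:\n\t"
                    "dec %0\n\t"        /* one instruction */
                    "jnz 1b\n\t"        /* one instruction */
                    : "+r" (i));

            return 0;
    }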
With perfmon2, Pentium Pro, PII, PIII, P4, Athlon32, and Phenom
machines all give the proper result:
tobler:~% pfmon -e retired_instructions ./million
1000002 RETIRED_INSTRUCTIONS
    ( it is 1,000,002 +/-2 because on most x86 architectures the retired
      instruction count includes any hardware interrupts that happen to
      arrive during the run.  It would be a great feature if p10c7l
      could add some way of gathering a per-process hardware
      interrupt count statistic to help quantify that).
Yet perf on the same Athlon32 machine (running
kernel 2.6.30-03984-g45e3e19) gives:
tobler:~% perf stat ./million
  Performance counter stats for './million':
        1.519366  task-clock-ticks     #       0.835 CPU utilization factor
               3  context-switches     #       0.002 M/sec
               0  CPU-migrations       #       0.000 M/sec
              53  page-faults          #       0.035 M/sec
         2483822  cycles               #    1634.775 M/sec
         1240849  instructions         #     816.689 M/sec # 0.500 per cycle
          612685  cache-references     #     403.250 M/sec
            3564  cache-misses         #       2.346 M/sec
  Wall-clock time elapsed:     1.819226 msecs
Running it multiple times gives:
    1240849
    1257312
    1242313
That is an error of at least 20%, and it isn't even consistent
from run to run.  Is this because of sampling?  The documentation
doesn't really warn about this, as far as I can tell.
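In case it helps anyone reproduce this outside the perf tool, here is a
minimal self-contained counter reader.  It is sketched against the
perf_event_open(2) interface of later kernels; the syscall in this
patchset is still named perf_counter_open, so the attr layout and
constant names may differ:

    /* count.c: open a hardware instruction counter on the current
     * process and read it around a region of interest.
     */
    #include <linux/perf_event.h>
    #include <sys/ioctl.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
            struct perf_event_attr attr;
            uint64_t count;
            int fd;

            memset(&attr, 0, sizeof(attr));
            attr.type = PERF_TYPE_HARDWARE;
            attr.size = sizeof(attr);
            attr.config = PERF_COUNT_HW_INSTRUCTIONS;
            attr.disabled = 1;
            attr.exclude_kernel = 1;   /* user-space instructions only */
            attr.exclude_hv = 1;

            /* pid 0, cpu -1: this process, on any CPU */
            fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
            if (fd < 0) {
                    perror("perf_event_open");
                    return 1;
            }

            ioctl(fd, PERF_EVENT_IOC_RESET, 0);
            ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

            /* ... code under test goes here ... */

            ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
            if (read(fd, &count, sizeof(count)) == sizeof(count))
                    printf("%llu instructions\n",
                           (unsigned long long)count);

            close(fd);
            return 0;
    }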
Thanks for any help resolving this problem.
Vince