Message-ID: <1314880817.11566.19.camel@twins>
Date:	Thu, 01 Sep 2011 14:40:17 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Mike Hommey <mh@...ndium.org>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: Problem with perf hardware counters grouping

On Thu, 2011-09-01 at 13:59 +0200, Mike Hommey wrote:

> > I'm guessing you're running on something x86, either AMD-Fam10-12 or
> > Intel-NHM+.
> 
> Core2Duo

Ah, ok, then you're also using the fixed purpose thingies.

> > What happens with your >3 case is that while the group is valid and
> > could fit on the PMU, it won't fit at runtime because the NMI watchdog
> > is taking one and won't budge (cpu-pinned counters have precedence over
> > any other kind), effectively starving your group of pmu runtime.
> 
> That makes sense. But how exactly is not using groups different, then?
> perf, for instance, doesn't use groups, and can get all the hardware
> counters.

The purpose of groups is to co-schedule events on the PMU; that is, we
mandate that all members of the group are on the PMU at the same time.
Note that this does not imply the group is scheduled at all times
(although you could request that by setting perf_event_attr::pinned on
the leader).

By not using groups but individual counters we do not have this
restriction and perf will schedule them individually.
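In perf_event_open() terms the two cases differ only in the group_fd
argument. A minimal sketch, not code from this thread (error handling
elided, event choices made up for illustration):

#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

static int open_hw_event(uint64_t config, int group_fd, int pinned)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = config;
	attr.pinned = pinned;	/* only meaningful on a group leader */

	/* pid = 0 (this task), cpu = -1 (any cpu), flags = 0 */
	return syscall(__NR_perf_event_open, &attr, 0, -1, group_fd, 0);
}

int main(void)
{
	/* Grouped: the member is co-scheduled with the (pinned) leader. */
	int leader = open_hw_event(PERF_COUNT_HW_CPU_CYCLES, -1, 1);
	int member = open_hw_event(PERF_COUNT_HW_INSTRUCTIONS, leader, 0);

	/* Ungrouped: each event gets group_fd = -1 and is scheduled
	 * (and, if need be, rotated) individually. */
	int a = open_hw_event(PERF_COUNT_HW_INSTRUCTIONS, -1, 0);
	int b = open_hw_event(PERF_COUNT_HW_INSTRUCTIONS, -1, 0);

	(void)member; (void)a; (void)b;
	/* ... run workload, read() the counts ... */
	return 0;
}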

Now perf will rotate events when there are more than can physically fit
on the PMU at any one time, including groups. This can create the
appearance that all 4 are in fact working.

# perf stat -e instructions  ~/loop_ld

 Performance counter stats for '/root/loop_ld':

       400,765,771 instructions              #    0.00  insns per cycle        

       0.085995705 seconds time elapsed

# perf stat -e instructions -e instructions -e instructions -e instructions -e instructions -e instructions ~/loop_1b_ld

 Performance counter stats for '/root/loop_1b_ld':

       398,136,503 instructions              #    0.00  insns per cycle         [83.45%]
       400,387,443 instructions              #    0.00  insns per cycle         [83.62%]
       400,076,744 instructions              #    0.00  insns per cycle         [83.60%]
       400,221,739 instructions              #    0.00  insns per cycle         [83.62%]
       400,038,563 instructions              #    0.00  insns per cycle         [83.60%]
       402,085,668 instructions              #    0.00  insns per cycle         [82.94%]

       0.085712325 seconds time elapsed


This is on a wsm (Westmere: 4 general purpose + 1 fixed purpose counter
capable of counting insns) with the NMI watchdog disabled.

Note the [83%] thing; that indicates these events got over-committed and
we had to rotate the counters. In particular it is the ratio between
PERF_FORMAT_TOTAL_TIME_RUNNING and PERF_FORMAT_TOTAL_TIME_ENABLED and we
use that to scale up the count.
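The read side looks roughly like this; a sketch with an assumed helper
name, not code lifted from perf itself:

#include <linux/perf_event.h>
#include <stdint.h>
#include <unistd.h>

/* The event must have been opened with
 *   attr.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
 *                      PERF_FORMAT_TOTAL_TIME_RUNNING;
 */
static uint64_t read_scaled_count(int fd)
{
	struct {
		uint64_t value;		/* count while actually on the PMU */
		uint64_t time_enabled;	/* PERF_FORMAT_TOTAL_TIME_ENABLED */
		uint64_t time_running;	/* PERF_FORMAT_TOTAL_TIME_RUNNING */
	} rv;

	if (read(fd, &rv, sizeof(rv)) != sizeof(rv))
		return 0;

	if (!rv.time_running)		/* never scheduled, nothing to scale */
		return 0;

	/* The [83.xx%] above is time_running / time_enabled; scale the
	 * raw count back up by the inverse of that ratio. */
	return (uint64_t)((double)rv.value *
			  rv.time_enabled / rv.time_running);
}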


