linux-kernel - Re: Oprofile : need to adjust PC by 16 bytes

Message-ID: <7c86c4470811170702y127c9249m9b86b65a38a3e05c@mail.gmail.com>
Date:	Mon, 17 Nov 2008 16:02:19 +0100
From:	"stephane eranian" <eranian@...glemail.com>
To:	"Andi Kleen" <andi@...stfloor.org>
Cc:	"Eric Dumazet" <dada1@...mosbay.com>,
	"Mikael Pettersson" <mikpe@...uu.se>,
	"Robert Richter" <robert.richter@....com>,
	oprofile-list@...ts.sf.net, "Ingo Molnar" <mingo@...e.hu>,
	"Jiri Kosina" <jkosina@...e.cz>, "Jiri Benc" <jbenc@...e.cz>,
	"Vilem Marsik" <vmarsik@...e.cz>,
	"Pekka Enberg" <penberg@...helsinki.fi>,
	linux-kernel@...r.kernel.org
Subject: Re: Oprofile : need to adjust PC by 16 bytes

Hello,

I have not seen the beginning of this discussion, so my comments may be
slightly off. It seems Eric is having problems with the accuracy of
instruction addresses when sampling with the PMU.

This is an inherent limitation of the PMU. It can be mitigated but not
completely eliminated. The core issue is that several cycles elapse
between the moment a counter overflows and the moment the PMU interrupt
is posted. During that time, the CPU keeps executing instructions. The
interrupt IP you get reflects where the CPU was when the interrupt was
taken, which can be far from where the interrupt was posted and from
where the counter actually overflowed. Of course, if the CPU is stalled,
that distance is usually zero or a small number of instructions. But it
can be very large when the overflow happens inside a kernel critical
section where interrupts are off. There is nothing software can do about
any of this.
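
To make the skid concrete, here is a toy model in Python. It is only a
sketch under stated assumptions, not a description of any real PMU: the
instruction stream is a list of indexes, `skid` is a hypothetical fixed
number of instructions retired before the interrupt is posted, and
`masked` marks where interrupts are off.

```python
def sampled_ip(overflow_idx, skid, masked):
    """Return the instruction index an interrupt-based sample would report.

    overflow_idx: index of the instruction where the counter overflowed
    skid:         assumed number of instructions retired before the
                  interrupt is posted (hypothetical constant)
    masked:       list of bools, True where interrupts are disabled
    """
    last = len(masked) - 1
    # The interrupt is posted a few instructions after the overflow.
    ip = min(overflow_idx + skid, last)
    # While interrupts are off, delivery is deferred and the CPU keeps going.
    while ip < last and masked[ip]:
        ip += 1
    return ip


# 100 instructions, with a critical section covering indexes 20..59.
masked = [20 <= i < 60 for i in range(100)]

outside = sampled_ip(5, 3, masked)    # overflow outside the critical section
inside = sampled_ip(25, 3, masked)    # overflow inside the critical section
print(outside - 5, inside - 25)       # reported distance from the overflow
```

In this model the sample taken outside the critical section is off by the
skid only, while the one taken inside is attributed to the first
instruction after interrupts are re-enabled.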

Andi mentioned PEBS. In case you are not familiar with it, let me
summarize. PEBS is a hardware/microcode feature that implements a
hardware-managed sample buffer. The OS points the CPU to a memory region
where the PMU stores samples, and no PMU interrupt is generated until
the buffer becomes full. That addresses some of the overhead associated
with interrupt-based sampling. Unfortunately, PEBS does not point to the
instruction where the counter overflowed either; it is still a few
instructions off. But this time you get the machine state at the last
retired instruction. Furthermore, PEBS can record samples inside kernel
critical sections. One limitation of PEBS is that it does not work with
all PMU events; only a handful are available.
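
The buffering idea can be sketched as follows. This is a simplified
Python model, not the actual PEBS record layout or kernel interface: the
class name, the `capacity` parameter, and the record contents are all
assumptions made for illustration.

```python
class SampleBuffer:
    """Toy model of a hardware-managed sample buffer.

    Samples are appended without any interrupt; a single interrupt is
    raised only when the buffer becomes full, at which point software
    drains it.
    """

    def __init__(self, capacity):
        self.capacity = capacity   # hypothetical number of record slots
        self.records = []          # samples written by the "hardware"
        self.drained = []          # samples collected by the "OS"
        self.interrupts = 0        # interrupts taken so far

    def record(self, machine_state):
        # The hardware stores the sample on its own, with no interrupt.
        self.records.append(machine_state)
        if len(self.records) == self.capacity:
            # Buffer full: one interrupt, and the OS empties the buffer.
            self.interrupts += 1
            self.drained.extend(self.records)
            self.records.clear()


buf = SampleBuffer(capacity=4)
for n in range(10):
    buf.record({"sample": n})

print(buf.interrupts, len(buf.drained), len(buf.records))
```

With ten samples and four slots, the model takes two interrupts instead
of ten, which is where the overhead reduction comes from.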

As for perfmon, if you pull from the perfmon2 GIT tree, this should
work. I don't know what happened in your case.

Perfmon and the pfmon tool can do simple counting or collect profiles.

$ pfmon date

Counts cycles at the user level only, for the process date.

$ pfmon --system-wide -t10

Counts elapsed cycles at the user level on all CPUs for 10s. Results are per-CPU.

$ pfmon --long-smpl-periods=240000 date

Collects a flat profile of the process date. The period is 240,000 elapsed user cycles.

$ pfmon --system-wide --long-smpl-periods=240000 -t 10

Collects a flat profile on each online CPU for 10s. The period is
240,000 elapsed user cycles. Results are per-CPU.
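
As a rough guide to what a period of 240,000 cycles means, the number of
samples is the number of counted cycles divided by the period. The small
Python sketch below works this out; the 2.4 GHz clock and the fully busy
CPU are assumptions chosen for the arithmetic, not measurements.

```python
def samples_expected(clock_hz, busy_seconds, period):
    """Approximate number of samples when sampling on elapsed cycles.

    clock_hz:     assumed CPU clock frequency in Hz
    busy_seconds: time spent executing at the monitored privilege level
    period:       sampling period in cycles (--long-smpl-periods value)
    """
    cycles = clock_hz * busy_seconds
    return cycles // period


# Assumed: a 2.4 GHz CPU that is busy at user level for the whole 10s run.
n = samples_expected(2_400_000_000, 10, 240_000)
print(n)
```

Under those assumptions, one sample is taken every 100 microseconds of
user-level execution on each CPU.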

You will find many more examples on the perfmon web site, in the
documentation and the pfmon user's guide.

Perfmon/pfmon can use PEBS on Intel Core processors. The first step is
to load the kernel module for it:
   # modprobe perfmon_pebs_core_smpl

Then run pfmon. We use instructions_retired because the elapsed cycles
event does not support PEBS:

   $ pfmon --smpl-module=pebs -einstructions_retired --long-smpl-periods=120000 date

Hope this helps.

On Sat, Nov 15, 2008 at 7:36 PM, Andi Kleen <andi@...stfloor.org> wrote:
> On Sat, Nov 15, 2008 at 05:30:58PM +0100, Eric Dumazet wrote:
>> Andi Kleen wrote:
>> >>>And no, blindly subtracting 16 from IP is not a fix.
>> >>Who mentioned a fix ? I am only giving more fuel to Intel guys so they
>> >>hopefully can give us a working oprofile.
>> >
>> >You would need to implement PEBS support to avoid that problem. But it's a
>> >big task. perfmon2 implements it already.
>> >
>>
>> Thanks for the information.
>>
>> Hum, so I grabbed perfmon2 git tree, installed various tools...
>>
>> I am quite new to pfmon and tried :
>>
>> # pfmon --system-wide
>> sizeof=64 44
>> <press ENTER to stop session>
>>
>> Then started "tbench 8", and got a kernel panic after 6 seconds.
>>
>>
>> I was using oprofile like this
>>
>> opcontrol --vmlinux=/path/vmlinux --start
>> // doing some benchmarking...
>> opreport -l vmlinux | head -n 40
>>
>>
>> What would be a working equivalent for perfmon2 based tools ?
>
> Probably getting a perfmon tree that works. I guess Stephane
> can help (cc'ed).  Or just deal with imprecise events for now.
>
> -Andi
> --
> ak@...ux.intel.com
>