Message-ID: <1327647011.6335.22.camel@marge.simson.net>
Date: Fri, 27 Jan 2012 07:50:11 +0100
From: Mike Galbraith <efault@....de>
To: Lénaïc Huard <lenaic@...ard.fr.eu.org>
Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Paul Mackerras <paulus@...ba.org>, Ingo Molnar <mingo@...e.hu>,
Arnaldo Carvalho de Melo <acme@...stprotocols.net>,
linux-kernel@...r.kernel.org
Subject: Re: Shift by one instruction in the perf annotate output
On Fri, 2012-01-27 at 00:01 +0100, Lénaïc Huard wrote:
> Hello,
>
> Using perf and zooming in to the annotated assembly, I noticed a “shift by
> one” issue between the assembly instructions and the cycle counts reported
> in the left margin.
> Each time a given instruction is expensive, the high event count appears on
> the line right after it.
> To validate that impression, I put an expensive instruction (division)
> inside a block of cheap instructions (no-ops). Here is the output of
> perf annotate:
>
> 0,00 : 40041d: nop
> 0,00 : 40041e: nop
> 5,31 : 40041f: nop
> 0,00 : 400420: divl (%rsp)
> 84,18 : 400423: nop
> 0,00 : 400424: nop
> 0,07 : 400425: nop
>
> I would have expected the 84,18 count to be on the “divl” line and not
> on the “nop” one.
>
> I am running Linux and perf 3.2, but I remember that previous versions
> suffered from the same issue.
Yeah, it's been that way since day 1 here; you have to squint a little.
It's not always off: it gets lock btr right but lock xadd wrong, on the
same CPU as yours. I see the same with cache-misses too, but since the
right instruction is right next door, it's always been easy to spot the
real culprit, so I just squint and ignore it, figuring it's a hw detail.
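
(The usual explanation for that skid is that, without precise sampling,
the IP recorded at the PMI points at an instruction *after* the one that
took the event, so the count lands a line late. On CPUs with PEBS,
asking perf for precise attribution can tighten that up, e.g.:

  perf record -e cycles:pp <workload>

Whether :pp is accepted and how precise it is depends on the hardware,
so treat it as something to try rather than a guaranteed fix.)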
Showing cycles for __schedule
Events Pcnt (>=5%)
Percent | Source code & Disassembly of vmlinux
------------------------------------------------
: next = pick_next_task(rq);
: clear_tsk_need_resched(prev);
:
: if (likely(prev != next)) {
7.68 : ffffffff8139efca: mov -0xf0(%rbp),%rcx
:
: static inline void load_cr3(pgd_t *pgdir)
: {
: write_cr3(__pa(pgdir));
6.87 : ffffffff8139f061: mov 0x48(%r15),%rdi
: : CONST_MASK_ADDR(nr, addr)
: : "iq" ((u8)~CONST_MASK(nr)));
: } else {
: asm volatile(LOCK_PREFIX "btr %1,%0"
38.27 : ffffffff8139f06d: lock btr %r14d,0x300(%r12)
: /*
: * load the LDT, if the LDT is different:
: */
: if (unlikely(prev->context.ldt != next->context.ldt))
12.53 : ffffffff8139f077: mov 0x308(%r15),%r13
0.40 : ffffffff8139f0df: jne ffffffff813a7c60 <ret_from_fork>
0.00 : ffffffff8139f0e5: mov %rbp,%rsi
0.00 : ffffffff8139f0e8: pop %rbp
0.00 : ffffffff8139f0e9: popfq
5.12 : ffffffff8139f0ea: mov %rax,-0xe8(%rbp)
Showing cycles for _raw_spin_lock_irqsave
Events Pcnt (>=5%)
Percent | Source code & Disassembly of vmlinux
------------------------------------------------
: * documented behavior of the "pop" instruction.
: */
: asm volatile("# __raw_save_flags\n\t"
0.00 : ffffffff813a0b1f: pushfq
15.38 : ffffffff813a0b20: pop %r12
: unsigned long flags;
:
: local_irq_save(flags);
: preempt_disable();
16.24 : ffffffff813a0b23: mov $0x1,%edi
: register struct __raw_tickets inc = { .tail = 1 };
:
: inc = xadd(&lock->tickets, inc);
0.00 : ffffffff813a0b32: lock xadd %ax,(%rbx)
46.15 : ffffffff813a0b37: movzbl %ah,%edx
: return __raw_spin_lock_irqsave(lock);
: }
0.85 : ffffffff813a0b49: mov %r12,%rax
0.00 : ffffffff813a0b4c: pop %rbx
17.09 : ffffffff813a0b4d: pop %r12
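
For anyone who wants to poke at this themselves, a minimal reproducer in
the spirit of Lénaïc's test could look like the sketch below. This is a
hypothetical program, not his actual one; the file name, loop bound,
constants, and build flags are all made up:

/*
 * skid.c - one expensive divl in a run of nops, so any shift of the
 * event counts in the annotation is easy to spot.
 *
 *   gcc -O2 -o skid skid.c
 *   perf record -e cycles ./skid
 *   perf annotate --stdio main
 */
#include <stdint.h>

int main(void)
{
	uint32_t q, r;		/* quotient / remainder written by divl */
	uint64_t i;

	for (i = 0; i < (1ULL << 28); i++)
		asm volatile("nop\n\tnop\n\tnop\n\t"
			     "divl %4\n\t"		/* the expensive one */
			     "nop\n\tnop\n\tnop"
			     : "=a" (q), "=d" (r)	/* eax, edx outputs */
			     : "0" (1u), "1" (0u),	/* edx:eax = 0:1 */
			       "r" (3u)			/* divisor */
			     : "cc");
	return 0;
}

If the skid behaves as described above, the big cycle count should land
on the nop following the divl rather than on the divl itself.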