Message-ID: <20090619152029.GA7204@elte.hu>
Date: Fri, 19 Jun 2009 17:20:29 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>, mingo@...hat.com,
hpa@...or.com, paulus@...ba.org, acme@...hat.com,
linux-kernel@...r.kernel.org, a.p.zijlstra@...llo.nl,
penberg@...helsinki.fi, vegard.nossum@...il.com, efault@....de,
jeremy@...p.org, npiggin@...e.de, tglx@...utronix.de,
linux-tip-commits@...r.kernel.org
Subject: Re: [tip:perfcounters/core] perf_counter: x86: Fix call-chain
support to use NMI-safe methods
* Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> On Mon, 15 Jun 2009, Ingo Molnar wrote:
> >
> > See the numbers in the other mail: about 33 million pagefaults
> > happen in a typical kernel build - that's ~400K/sec - and that
> > is not a particularly pagefault-heavy workload.
>
> Did you do any function-level profiles?
>
> Last I looked at it, the real cost of page faults was all in the
> memory copies and page clearing, and while it would be nice to
> speed up the kernel entry and exit, the few tens of cycles we
> might be able to get from there really aren't all that important.
Yeah.
Here's the function-level profile of a typical kernel build on a
Nehalem box:
$ perf report --sort symbol
#
# (14317328 samples)
#
# Overhead Symbol
# ........ ......
#
44.05% 0x000000001a0b80
5.09% 0x0000000001d298
3.56% 0x0000000005742c
2.48% 0x0000000014026d
2.31% 0x00000000007b1a
2.06% 0x00000000115ac9
1.83% [.] _int_malloc
1.71% 0x00000000064680
1.50% [.] memset
1.37% 0x00000000125d88
1.28% 0x000000000b7642
1.17% [k] clear_page_c
0.87% [k] page_fault
0.78% [.] is_defined_config
0.71% [.] _int_free
0.68% [.] __GI_strlen
0.66% 0x000000000699e8
0.54% [.] __GI_memcpy
The profile is dominated by user-space symbols. (There's no proper
ELF+debuginfo on this box, so they are unnamed.) It also shows that
page clearing and pagefault handling dominate the kernel overhead -
but they are dwarfed by other overhead. Any page-fault-entry costs are
a drop in the bucket.
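( For reference: the 'Overhead' column is just each symbol's share of
the total sample count - conceptually the tool buckets every sample by
its resolved symbol, roughly like the toy sketch below. The structures
and sample data are made up for illustration - this is not perf's
actual code. )

/*
 * Toy sketch of what 'perf report --sort symbol' does conceptually:
 * bucket every sample by its resolved symbol and print each bucket's
 * share of the total.  Made-up structures, not perf's real code.
 */
#include <stdio.h>
#include <string.h>

struct bucket { const char *sym; unsigned long hits; };

int main(void)
{
        /* stand-ins for resolved sample IPs out of perf.data: */
        const char *samples[] = {
                "clear_page_c", "memset", "clear_page_c",
                "page_fault", "memset", "clear_page_c",
        };
        unsigned long total = sizeof(samples) / sizeof(samples[0]);
        struct bucket buckets[16] = { { NULL, 0 } };
        unsigned long i, j, nr = 0;

        for (i = 0; i < total; i++) {
                /* find this symbol's bucket, or start a new one: */
                for (j = 0; j < nr && strcmp(buckets[j].sym, samples[i]); j++)
                        ;
                if (j == nr)
                        buckets[nr++].sym = samples[i];
                buckets[j].hits++;
        }

        for (j = 0; j < nr; j++)
                printf("%6.2f%%  %s\n",
                       100.0 * buckets[j].hits / total, buckets[j].sym);
        return 0;
}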
In fact, with call-chain graphs we can get a precise picture, as we
can do a non-linear 'slice' set operation over the samples and keep
only the ones that have the 'page_fault' pattern in one of their
parent functions:
$ perf report --sort symbol --parent page_fault
#
# (14317328 samples)
#
# Overhead Symbol
# ........ ......
#
1.12% [k] clear_page_c
0.87% [k] page_fault
0.43% [k] get_page_from_freelist
0.25% [k] _spin_lock
0.24% [k] do_page_fault
0.23% [k] perf_swcounter_ctx_event
0.16% [k] perf_swcounter_event
0.15% [k] handle_mm_fault
0.15% [k] __alloc_pages_nodemask
0.14% [k] __rmqueue
0.12% [k] find_get_page
0.11% [k] copy_page_c
0.11% [k] find_vma
0.10% [k] _spin_lock_irqsave
0.10% [k] __wake_up_bit
0.09% [k] _spin_unlock_irqrestore
0.09% [k] do_anonymous_page
0.09% [k] __inc_zone_state
This "sub-profile" shows the true summary overhead that 'page_fault'
and all its child functions have. Note that for example clear_page_c
decreased from 1.17% to 1.12%:
1.12% [k] clear_page_c
1.17% [k] clear_page_c
because 0.05% of the clear_page_c() samples come from callers that do
not involve page_fault - those are filtered out by the --parent
matching.
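The mechanics behind --parent are simple: walk each sample's recorded
call-chain and account the sample only if one of its parent frames
matches the given pattern. Here is a toy sketch of that idea - the
structures, the sample data and the plain substring match are
simplifications for illustration, not perf's actual implementation
(which resolves the chain entries to symbols first):

/*
 * Toy sketch of the --parent 'slice': a sample is accounted only if
 * the pattern matches one of the callers in its call-chain.
 */
#include <stdio.h>
#include <string.h>

#define MAX_DEPTH 8

struct sample {
        const char *sym;                        /* sampled (leaf) symbol */
        const char *parents[MAX_DEPTH];         /* caller chain, NULL-terminated */
};

static int parent_matches(const struct sample *s, const char *pattern)
{
        int i;

        for (i = 0; i < MAX_DEPTH && s->parents[i]; i++) {
                if (strstr(s->parents[i], pattern))
                        return 1;
        }
        return 0;
}

int main(void)
{
        struct sample samples[] = {
                /* clear_page_c reached via the pagefault path: */
                { "clear_page_c", { "do_anonymous_page", "handle_mm_fault",
                                    "do_page_fault", "page_fault", NULL } },
                /* clear_page_c reached via some other, hypothetical path: */
                { "clear_page_c", { "some_other_caller", NULL } },
        };
        unsigned long nr = sizeof(samples) / sizeof(samples[0]);
        unsigned long i, matched = 0;

        for (i = 0; i < nr; i++) {
                if (parent_matches(&samples[i], "page_fault"))
                        matched++;
        }
        printf("%lu of %lu clear_page_c samples have page_fault as a parent\n",
               matched, nr);
        return 0;
}

With real data, the bucketing step from the earlier sketch would then
only ever see the matching samples - which is how clear_page_c's 1.17%
shrinks to 1.12% above.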
Ingo