Message-Id: <1426145571-3065-1-git-send-email-namhyung@kernel.org>
Date:	Thu, 12 Mar 2015 16:32:45 +0900
From:	Namhyung Kim <namhyung@...nel.org>
To:	Arnaldo Carvalho de Melo <acme@...nel.org>
Cc:	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Jiri Olsa <jolsa@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>,
	David Ahern <dsahern@...il.com>,
	Minchan Kim <minchan@...nel.org>,
	Joonsoo Kim <js1304@...il.com>
Subject: [RFC/PATCHSET 0/6] perf kmem: Implement page allocation analysis (v1)

Hello,

Currently the perf kmem command only analyzes SLAB memory allocation.
I'd like to introduce page allocation analysis as well.  Users can
select either with the --slab and/or --page options.  If neither
option is given, it does slab allocation analysis for backward
compatibility.
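
The defaulting rule above can be sketched as follows.  This is only an
illustration in Python, not the code in builtin-kmem.c; the function
name select_modes is hypothetical.

```python
def select_modes(slab=False, page=False):
    """Return which analyses to run for the given --slab/--page choices.

    Hypothetical helper: if neither option is given, fall back to slab
    analysis only, which is the backward-compatible default described
    in the text.
    """
    if not slab and not page:
        slab = True
    return {"slab": slab, "page": page}


print(select_modes())                      # neither option given
print(select_modes(page=True))             # --page only
print(select_modes(slab=True, page=True))  # --slab --page
```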

Patches 1-3 are bugfixes and cleanups.  Patch 4 implements basic
support for page allocation analysis, patch 5 deals with the callsite
and finally patch 6 implements sorting.

In this patchset, I used two kmem events for analysis,
kmem:mm_page_alloc and kmem:mm_page_free, as they can track every
memory allocation/free path AFAIK.  However, unlike slab tracepoint
events, those page allocation events don't provide callsite info
directly.  So I recorded callchains and extracted callsites as
described below.

Normal page allocation callchains look like this:

  360a7e __alloc_pages_nodemask
  3a711c alloc_pages_current
  357bc7 __page_cache_alloc   <-- callsite
  357cf6 pagecache_get_page
   48b0a prepare_pages
   494d3 __btrfs_buffered_write
   49cdf btrfs_file_write_iter
  3ceb6e new_sync_write
  3cf447 vfs_write
  3cff99 sys_write
  7556e9 system_call
    f880 __write_nocancel
   33eb9 cmd_record
   4b38e cmd_kmem
   7aa23 run_builtin
   27a9a main
   20800 __libc_start_main

But the first two are internal page allocation functions, so they
should be skipped.  To identify such allocation functions, I used the
following regex:

  ^_?_?(alloc|get_free|get_zeroed)_pages?

This gave me the following list of functions (you can see this with -v):

  alloc func: __get_free_pages
  alloc func: get_zeroed_page
  alloc func: alloc_pages_exact
  alloc func: __alloc_pages_direct_compact
  alloc func: __alloc_pages_nodemask
  alloc func: alloc_page_interleave
  alloc func: alloc_pages_current
  alloc func: alloc_pages_vma
  alloc func: alloc_page_buffers
  alloc func: alloc_pages_exact_nid

After skipping those functions, the callsite becomes '__page_cache_alloc'.
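
The skipping step can be sketched like this.  It is a simplified
Python illustration that matches symbol names directly against the
regex quoted above; the actual C code may work differently (for
example by comparing addresses), and the find_callsite name is
hypothetical.

```python
import re

# Regex quoted in this message for internal page allocation functions.
ALLOC_FUNC_RE = re.compile(r"^_?_?(alloc|get_free|get_zeroed)_pages?")


def find_callsite(callchain):
    """Return the first symbol that is not a page allocation function.

    `callchain` lists symbol names from the innermost frame outwards.
    Returns None if every frame matches the regex.
    """
    for sym in callchain:
        if not ALLOC_FUNC_RE.match(sym):
            return sym
    return None


# Top of the callchain shown earlier in this message.
chain = [
    "__alloc_pages_nodemask",
    "alloc_pages_current",
    "__page_cache_alloc",
    "pagecache_get_page",
    "prepare_pages",
]
print(find_callsite(chain))
```

Note that '__page_cache_alloc' is not skipped: the regex requires
"alloc", "get_free" or "get_zeroed" right after the optional leading
underscores.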

Other information such as allocation order, migration type and gfp
flags is provided by the tracepoint events.

By default the output is sorted by total allocation bytes, but you
can change it with the -s/--sort option.  The following sort keys are
added to support page analysis: page, order, mtype, gfp.  The existing
'callsite', 'bytes' and 'hit' sort keys can also be used.
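
As a rough illustration of multi-key sorting, here is a Python sketch
using a few rows borrowed from the sample output in this message.  The
descending order for every key is an assumption inferred from that
sample (order 2 is listed before order 0, then larger hit counts
first); sort_rows and the dict layout are hypothetical and not how
builtin-kmem.c stores its data.

```python
# A few rows from the sample 'perf kmem stat --page --caller' output,
# deliberately shuffled.
rows = [
    {"callsite": "pte_alloc_one", "order": 0, "hit": 13, "bytes": 53248},
    {"callsite": "new_slab", "order": 2, "hit": 4, "bytes": 65536},
    {"callsite": "__page_cache_alloc", "order": 0, "hit": 12536, "bytes": 51347456},
    {"callsite": "handle_mm_fault", "order": 0, "hit": 10, "bytes": 40960},
]


def sort_rows(rows, keys):
    """Sort rows by the given keys, first key most significant.

    Assumption: every key sorts in descending order.
    """
    return sorted(rows, key=lambda r: tuple(r[k] for k in keys), reverse=True)


# Equivalent of '-s order,hit' under the assumption above.
for row in sort_rows(rows, ["order", "hit"]):
    print(row["order"], row["hit"], row["callsite"])
```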

An example follows:

  # perf kmem record --slab --page sleep 1
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Captured and wrote 49.277 MB perf.data (191027 samples) ]

  # perf kmem stat --page --caller -l 10 -s order,hit

  --------------------------------------------------------------------------------------------
   Total_alloc/Per | Hit      | Order | Migrate type | GFP flag | Callsite
  --------------------------------------------------------------------------------------------
       65536/16384 |        4 |     2 |  RECLAIMABLE | 00285250 | new_slab
    51347456/4096  |    12536 |     0 |      MOVABLE | 0102005a | __page_cache_alloc
       53248/4096  |       13 |     0 |    UNMOVABLE | 002084d0 | pte_alloc_one
       40960/4096  |       10 |     0 |      MOVABLE | 000280da | handle_mm_fault
       28672/4096  |        7 |     0 |    UNMOVABLE | 000000d0 | __pollwait
       20480/4096  |        5 |     0 |      MOVABLE | 000200da | do_wp_page
       20480/4096  |        5 |     0 |      MOVABLE | 000200da | do_cow_fault
       16384/4096  |        4 |     0 |    UNMOVABLE | 00000200 | __tlb_remove_page
       16384/4096  |        4 |     0 |    UNMOVABLE | 000084d0 | __pmd_alloc
        8192/4096  |        2 |     0 |    UNMOVABLE | 000084d0 | __pud_alloc
   ...             | ...      | ...   | ...          | ...      | ...
  --------------------------------------------------------------------------------------------

  SUMMARY (page allocator)
  ========================
  Total alloc requested: 12593
  Total alloc failure  : 0
  Total bytes allocated: 51630080
  Total free  requested: 115
  Total free  unmatched: 67
  Total bytes freed    : 471040
  
  Order     UNMOVABLE   RECLAIMABLE       MOVABLE      RESERVED   CMA/ISOLATE
  -----  ------------  ------------  ------------  ------------  ------------
      0            32             0         12557             0             0
      1             0             0             0             0             0
      2             0             4             0             0             0
      3             0             0             0             0             0
      4             0             0             0             0             0
      5             0             0             0             0             0
      6             0             0             0             0             0
      7             0             0             0             0             0
      8             0             0             0             0             0
      9             0             0             0             0             0
     10             0             0             0             0             0
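
The summary numbers can be cross-checked against the per-order table
above.  The Python sketch below assumes 4096-byte pages, which is what
the "Per" column shows for order 0; the variable names are
illustrative only.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as the order-0 "Per" column suggests

# Allocations per order, summed over migrate types, from the table above.
allocs_per_order = {
    0: 32 + 12557,  # UNMOVABLE + MOVABLE
    2: 4,           # RECLAIMABLE
}

total_allocs = sum(allocs_per_order.values())
# An order-N allocation covers PAGE_SIZE << N bytes.
total_bytes = sum(n * (PAGE_SIZE << order) for order, n in allocs_per_order.items())

print(total_allocs)  # matches "Total alloc requested: 12593"
print(total_bytes)   # matches "Total bytes allocated: 51630080"
```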


I have some ideas on how to improve it, but I'd also like to hear
other ideas, suggestions, feedback and so on.

This is available on the perf/kmem-page-v1 branch of my tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung


Namhyung Kim (6):
  perf kmem: Fix segfault when invalid sort key is given
  perf kmem: Allow -v option
  perf kmem: Fix alignment of slab result table
  perf kmem: Analyze page allocator events also
  perf kmem: Implement stat --page --caller
  perf kmem: Support sort keys on page analysis

 tools/perf/Documentation/perf-kmem.txt |  18 +-
 tools/perf/builtin-kmem.c              | 961 ++++++++++++++++++++++++++++++---
 2 files changed, 915 insertions(+), 64 deletions(-)

-- 
2.3.1

