[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090809024235.GA10146@nowhere>
Date: Sun, 9 Aug 2009 04:42:37 +0200
From: Frederic Weisbecker <fweisbec@...il.com>
To: Ingo Molnar <mingo@...e.hu>
Cc: mingo@...hat.com, hpa@...or.com, acme@...hat.com,
linux-kernel@...r.kernel.org, peterz@...radead.org, efault@....de,
tglx@...utronix.de, linux-tip-commits@...r.kernel.org
Subject: Re: [tip:perfcounters/core] perf tools: callchain: Fix sum of
percentages to be 100% by displaying amount of ignored chains in
fractal mode
On Sat, Aug 08, 2009 at 02:29:23PM +0200, Ingo Molnar wrote:
>
> * tip-bot for Frederic Weisbecker <fweisbec@...il.com> wrote:
>
> > Commit-ID: 84fce09c6b21ef675de079a724c13a205c89520b
> > Gitweb: http://git.kernel.org/tip/84fce09c6b21ef675de079a724c13a205c89520b
> > Author: Frederic Weisbecker <fweisbec@...il.com>
> > AuthorDate: Sat, 8 Aug 2009 02:16:25 +0200
> > Committer: Ingo Molnar <mingo@...e.hu>
> > CommitDate: Sat, 8 Aug 2009 13:49:21 +0200
> >
> > perf tools: callchain: Fix sum of percentages to be 100% by displaying amount of ignored chains in fractal mode
> >
> > When we filter the callchains below a given percentage, we
> > ignore them and the end result only shows entries that have an
> > upper percentage than the filter threshold.
> >
> > It seems to users then that we have an imbalance in the
> > percentage, as if the sum inside a profiled branch doesn't
> > reach 100%.
> >
> > Since in the past there have been real perf report bugs that
> > showed the same sypmtom, it would be nice to assure the user
> > that the data is perfect and trustable and it all sums up to
> > 100.00%.
> >
> > So fix this by displaying the remaining hits that have been
> > filtered but without more detail than their amount in each
> > branches. Example while filtering below 50%:
> >
> > 7.73% [k] delay_tsc
> > |
> > |--98.22%-- __const_udelay
> > | |
> > | |--86.37%-- ath5k_hw_register_timeout
> > | | ath5k_hw_noise_floor_calibration
> > | | ath5k_hw_reset
> > | | ath5k_reset
> > | | ath5k_config
> > | | ieee80211_hw_config
> > | | |
> > | | |--88.53%-- ieee80211_scan_work
> > | | | worker_thread
> > | | | kthread
> > | | | child_rip
> > | | --11.47%-- [...]
> > | --13.63%-- [...]
> > --1.78%-- [...]
>
> i dont seem to be able to get the bracketed numbers. I only get:
>
> titan:~> perf record -f -g -a ./hackbench 10
> titan:~> perf report -g fractal,10.0
>
> # Samples: 26832
> #
> # Overhead Command Shared Object Symbol
> # ........ .............. ........................ ......
> #
> 8.69% hackbench [kernel] [k] copy_user_generic_string
>
> 6.84% hackbench [kernel] [k] unix_stream_recvmsg
>
> 6.07% hackbench [kernel] [k] sock_alloc_send_pskb
>
> 4.33% hackbench [kernel] [k] _spin_lock
>
> 3.77% hackbench [kernel] [k] __kmalloc_node_track_caller
>
> 3.74% hackbench [kernel] [k] __alloc_skb
> |
> |--80.00%-- sock_alloc_send_pskb
> | sock_alloc_send_skb
> | unix_stream_sendmsg
> | __sock_sendmsg
> | sock_aio_write
> | do_sync_write
> | vfs_write
> | sys_write
> | sysenter_dispatch
> | 0xf7ffa430
> | 0xffadeb0000000014
> |
> --20.00%-- sock_alloc_send_skb
> unix_stream_sendmsg
> __sock_sendmsg
> sock_aio_write
> do_sync_write
> vfs_write
> sys_write
> sysenter_dispatch
> 0xf7ffa430
> 0xffadeb0000000014
>
> 3.53% hackbench [kernel] [k] ia32_sysenter_target
>
> 2.92% hackbench [vdso] [.] 0x000000f7ffa430
> |
> |--66.67%-- 0xf7ffa430
> | 0xffadeb0000000014
> |
> --33.33%-- 0xf7ffa431
> 0xffadeb0000000014
>
> 2.56% hackbench [kernel] [k] virt_to_head_page
>
> 2.37% hackbench [kernel] [k] kmem_cache_alloc_node
>
> 2.33% hackbench [kernel] [k] _spin_lock_irqsave
> |
> |--78.57%-- remove_wait_queue
> | poll_freewait
> | do_sys_poll
> | sys_poll
> | sysenter_dispatch
> | 0xf7ffa430
> | 0x1ffadea3c
> |
> |--7.14%-- __up_read
> | up_read
> | do_page_fault
> | page_fault
> | 0xf7ffa430
> | 0xa0df710000000a
> |
> |--7.14%-- unix_stream_sendmsg
> | __sock_sendmsg
> | sock_aio_write
> | do_sync_write
> | vfs_write
> | sys_write
> | sysenter_dispatch
> | 0xf7ffa430
> | 0xffadeb0000000014
> |
> --7.14%-- skb_queue_tail
> unix_stream_sendmsg
> __sock_sendmsg
> sock_aio_write
> do_sync_write
> vfs_write
> sys_write
> sysenter_dispatch
> 0xf7ffa430
> 0xffadeb0000000014
>
>
> I dont see the brackets anymore, nor does the 10% limit seem to have
> been listened to for the 7.14% result above, right?
>
> Ingo
Oh damn. I've been investigating for several hours in the wrong direction
just because this bug triggered only when I wasn't using -s s
Actually, it's just that the other kind of sorting make this bug occur
much more quickly: in the first results. It was then more obvious in these
cases, but the issue was present in every kind of sorting.
Please find the fix below, thanks!
Frederic.
---
>From 52c89279744db58a68fa2339b3c8038e67113fca Mon Sep 17 00:00:00 2001
From: Frederic Weisbecker <fweisbec@...il.com>
Date: Sun, 9 Aug 2009 04:19:15 +0200
Subject: [PATCH] perf tools: callchain: Fix bad rounding of minimum rate
Sometimes we get callchain branches that have a rate under the limit
given by the user.
Say you launched:
perf record -f -g -a ./hackbench 10
perf report -g fractal,10.0
And you get:
2.33% hackbench [kernel] [k] _spin_lock_irqsave
|
|--78.57%-- remove_wait_queue
| poll_freewait
| do_sys_poll
| sys_poll
| sysenter_dispatch
| 0xf7ffa430
| 0x1ffadea3c
|
|--7.14%-- __up_read
| up_read
| do_page_fault
| page_fault
| 0xf7ffa430
| 0xa0df710000000a
...
It is abnormal to get a 7.14% branch whereas we passed a 10% filter.
The problem is that we round down the minimum threshold.
This happens mostly when we have very low number of events.
If the total amount of your branch is 4 and you have a subranch of 3
events, filtering to 90% will be computed like follows:
limit = 4 * 0.9;
The result is about 3.6, but the cast to integer will round down to 3.
It means that our filter is actually of 75%
We must then explicitly round up the minimum threshold.
Reported-by: Ingo Molnar <mingo@...e.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@...il.com>
---
tools/perf/util/callchain.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index a8e67aa..0114734 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -13,6 +13,7 @@
#include <stdio.h>
#include <stdbool.h>
#include <errno.h>
+#include <math.h>
#include "callchain.h"
@@ -112,7 +113,7 @@ static void __sort_chain_graph_rel(struct callchain_node *node,
u64 min_hit;
node->rb_root = RB_ROOT;
- min_hit = node->children_hit * min_percent / 100.0;
+ min_hit = ceil(node->children_hit * min_percent);
chain_for_each_child(child, node) {
__sort_chain_graph_rel(child, min_percent);
@@ -126,7 +127,7 @@ static void
sort_chain_graph_rel(struct rb_root *rb_root, struct callchain_node *chain_root,
u64 min_hit __used, struct callchain_param *param)
{
- __sort_chain_graph_rel(chain_root, param->min_percent);
+ __sort_chain_graph_rel(chain_root, param->min_percent / 100.0);
rb_root->rb_node = chain_root->rb_root.rb_node;
}
--
1.6.2.3
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists