linux-kernel - Re: [PATCH v1 5/9] perf util: Remove a set of shadow stats static variables

Message-ID: <20171121180350.GJ28112@tassilo.jf.intel.com>
Date:   Tue, 21 Nov 2017 10:03:50 -0800
From:   Andi Kleen <ak@...ux.intel.com>
To:     Jiri Olsa <jolsa@...hat.com>
Cc:     Jin Yao <yao.jin@...ux.intel.com>, acme@...nel.org,
        jolsa@...nel.org, peterz@...radead.org, mingo@...hat.com,
        alexander.shishkin@...ux.intel.com, Linux-kernel@...r.kernel.org,
        kan.liang@...el.com, yao.jin@...el.com
Subject: Re: [PATCH v1 5/9] perf util: Remove a set of shadow stats static
 variables

> all this is about switching from array to rb_list for the --per-thread case,
> which can be considered as a special use case.. how much do we suffer in
> performance with new code? how about the "perf stat -I 100", would it scale
> ok for extreme cases (many events in -e or -dddd..)

rbtrees scale by log N, with N being the entries in the tree.

Even in extreme cases, let's say 10000 events and 1000 cpus (10^7
entries if every event/cpu pair gets one), it would need only about
24 memory accesses and comparisons for each lookup, since log2(10^7)
is roughly 23.
Even if we assume cache misses for all of the memory accesses,
at ~200ns per cache miss it's still only about 5us per lookup, which
is negligible.
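
For reference, the depth and the all-miss cost can be worked out directly.
The following is only a rough sketch under stated assumptions: the tree is
treated as perfectly balanced (a real red-black tree can be up to about
twice as deep), there is one entry per event/cpu pair, and the helper names
balanced_depth() and worst_case_lookup_ns() are made up for illustration;
they are not part of perf.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Levels in a perfectly balanced binary tree holding n entries:
 * the smallest d with 2^d >= n + 1, which equals the bit length of n.
 * Hypothetical helper, not a perf or kernel function.
 */
static unsigned int balanced_depth(uint64_t n)
{
	unsigned int depth = 0;

	assert(n > 0);
	while (n > 0) {
		n >>= 1;
		depth++;
	}
	return depth;
}

/*
 * Pessimistic lookup cost in nanoseconds, assuming every level
 * visited is a cache miss costing miss_ns.
 */
static uint64_t worst_case_lookup_ns(uint64_t entries, uint64_t miss_ns)
{
	return (uint64_t)balanced_depth(entries) * miss_ns;
}
```

With 10000 * 1000 = 10^7 entries and a 200ns miss cost, balanced_depth()
gives 24 levels and worst_case_lookup_ns() gives 4800ns per lookup.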

In practice not all memory accesses will be misses because
the upper levels of the tree are almost certainly cached
from earlier accesses.

-Andi