lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 15 Jan 2021 16:13:24 -0800
From:   Nick Desaulniers <nick.desaulniers@...il.com>
To:     ndesaulniers@...gle.com
Cc:     akpm@...ux-foundation.org, clang-built-linux@...glegroups.com,
        corbet@....net, linux-doc@...r.kernel.org,
        linux-kbuild@...r.kernel.org, linux-kernel@...r.kernel.org,
        masahiroy@...nel.org, morbo@...gle.com, natechancellor@...il.com,
        samitolvanen@...gle.com, torvalds@...ux-foundation.org
Subject: Re: [PATCH v4] pgo: add clang's Profile Guided Optimization infrastructure

> On Wed, Jan 13, 2021 at 8:07 PM Nick Desaulniers
> <ndesaulniers@...gle.com> wrote:
> >
> > On Wed, Jan 13, 2021 at 12:55 PM Nathan Chancellor
> > <natechancellor@...il.com> wrote:
> > >
> > > However, I see an issue with actually using the data:
> > >
> > > $ sudo -s
> > > # mount -t debugfs none /sys/kernel/debug
> > > # cp -a /sys/kernel/debug/pgo/profraw vmlinux.profraw
> > > # chown nathan:nathan vmlinux.profraw
> > > # exit
> > > $ tc-build/build/llvm/stage1/bin/llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> > > warning: vmlinux.profraw: Invalid instrumentation profile data (bad magic)
> > > error: No profiles could be merged.
> > >
> > > Am I holding it wrong? :) Note, this is virtualized, I do not have any
> > > "real" x86 hardware that I can afford to test on right now.
> >
> > Same.
> >
> > I think the magic calculation in this patch may differ from upstream
> > llvm: https://github.com/llvm/llvm-project/blob/49142991a685bd427d7e877c29c77371dfb7634c/llvm/include/llvm/ProfileData/SampleProf.h#L96-L101
> 
> Err...it looks like it was the padding calculation.  With that fixed
> up, we can query the profile data to get insights on the most heavily
> called functions.  Here's what my top 20 are (reset, then watch 10
> minutes worth of cat videos on youtube while running `find /` and
> sleeping at my desk).  Anything curious stand out to anyone?

Hello world from my personal laptop whose kernel was rebuilt with
profiling data!  Wow, I can run `find /` and watch cat videos on youtube
so fast!  Users will love this! /s

Checking the sections sizes of .text.hot. and .text.unlikely. looks
good!

> 
> $ llvm-profdata show -topn=20 /tmp/vmlinux.profraw
> Instrumentation level: IR  entry_first = 0
> Total functions: 48970
> Maximum function count: 62070879
> Maximum internal block count: 83221158
> Top 20 functions with the largest internal block counts:
>   drivers/tty/n_tty.c:n_tty_write, max count = 83221158
>   rcu_read_unlock_strict, max count = 62070879
>   _cond_resched, max count = 25486882
>   rcu_all_qs, max count = 25451477
>   drivers/cpuidle/poll_state.c:poll_idle, max count = 23618576
>   _raw_spin_unlock_irqrestore, max count = 18874121
>   drivers/cpuidle/governors/menu.c:menu_select, max count = 18721624
>   _raw_spin_lock_irqsave, max count = 18509161
>   memchr, max count = 15525452
>   _raw_spin_lock, max count = 15484254
>   __mod_memcg_state, max count = 14604619
>   __mod_memcg_lruvec_state, max count = 14602783
>   fs/ext4/hash.c:str2hashbuf_signed, max count = 14098424
>   __mod_lruvec_state, max count = 12527154
>   __mod_node_page_state, max count = 12525172
>   native_sched_clock, max count = 8904692
>   sched_clock_cpu, max count = 8895832
>   sched_clock, max count = 8894627
>   kernel/entry/common.c:exit_to_user_mode_prepare, max count = 8289031
>   fpregs_assert_state_consistent, max count = 8287198
> 
> -- 
> Thanks,
> ~Nick Desaulniers
> 

Powered by blists - more mailing lists