Date:   Wed, 23 May 2018 16:35:46 -0300
From:   Arnaldo Carvalho de Melo <acme@...nel.org>
To:     Adrian Hunter <adrian.hunter@...el.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Andy Lutomirski <luto@...nel.org>,
        "H. Peter Anvin" <hpa@...or.com>, Andi Kleen <ak@...ux.intel.com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Joerg Roedel <joro@...tes.org>, Jiri Olsa <jolsa@...hat.com>,
        linux-kernel@...r.kernel.org, x86@...nel.org
Subject: Re: [PATCH V3 00/17] perf tools and x86 PTI entry trampolines

On Tue, May 22, 2018 at 01:54:28PM +0300, Adrian Hunter wrote:
> Original Cover email:
> 
> Perf tools do not know about x86 PTI entry trampolines - see example
> below.  These patches add a workaround, namely "perf tools: Workaround
> missing maps for x86 PTI entry trampolines", which has the limitation
> that it hard codes the addresses.  Note that the workaround will work for
> old kernels and old perf.data files, but not for future kernels if the
> trampoline addresses are ever changed.
> 
> At present, perf tools use /proc/kallsyms to construct a memory map for
> the kernel.  Recording such a map in the perf.data file is necessary to
> deal with kernel relocation and KASLR.
> 
> While adding symbols for the trampolines to /proc/kallsyms is reasonable
> on its own terms, the motivation here is to have perf tools use them to
> create memory maps in the same fashion as is done for the kernel text.
> 
> So the first 2 patches add symbols to /proc/kallsyms for the trampolines:
> 
>       kallsyms: Simplify update_iter_mod()
>       kallsyms, x86: Export addresses of syscall trampolines
> 
> perf tools have the ability to use /proc/kcore (in conjunction with
> /proc/kallsyms) as the kernel image. So the next 2 patches add program
> headers for the trampolines to the kcore ELF:
> 
>       x86: Add entry trampolines to kcore
>       x86: kcore: Give entry trampolines all the same offset in kcore
> 
> It is worth noting that, with the kcore changes alone, perf tools require
> no changes to recognise the trampolines when using /proc/kcore.
> 
> Similarly, if perf tools are used with a matching kallsyms only (by denying
> access to /proc/kcore or a vmlinux image), then the kallsyms patches are
> sufficient to recognise the trampolines with no changes needed to the
> tools.
> 
> However, in the general case, when using vmlinux or dealing with
> relocations, perf tools need memory maps for the trampolines.  Because the
> kernel text map is constructed as a special case, using the same approach
> for the trampolines means treating them as a special case also, which
> requires a number of changes to perf tools, and the remaining patches deal
> with that.
> 
> 
> Example: make a program that does lots of small syscalls e.g.
> 
> 	$ cat uname_x_n.c
> 
> 	#include <sys/utsname.h>
> 	#include <stdlib.h>
> 
> 	int main(int argc, char *argv[])
> 	{
> 		long n = argc > 1 ? strtol(argv[1], NULL, 0) : 0;
> 		struct utsname u;
> 
> 		while (n--)
> 			uname(&u);
> 
> 		return 0;
> 	}
> 
> and then:
> 
> 	sudo perf record uname_x_n 100000
> 	sudo perf report --stdio
> 
> Before the changes, there are unknown symbols:
> 
>  # Overhead  Command    Shared Object     Symbol
>  # ........  .........  ................  ..................................
>  #
>     41.91%  uname_x_n  [kernel.vmlinux]  [k] syscall_return_via_sysret
>     19.22%  uname_x_n  [kernel.vmlinux]  [k] copy_user_enhanced_fast_string
>     18.70%  uname_x_n  [unknown]         [k] 0xfffffe00000e201b
>      4.09%  uname_x_n  libc-2.19.so      [.] __GI___uname
>      3.08%  uname_x_n  [kernel.vmlinux]  [k] do_syscall_64
>      3.02%  uname_x_n  [unknown]         [k] 0xfffffe00000e2025
>      2.32%  uname_x_n  [kernel.vmlinux]  [k] down_read
>      2.27%  uname_x_n  ld-2.19.so        [.] _dl_start
>      1.97%  uname_x_n  [unknown]         [k] 0xfffffe00000e201e
>      1.25%  uname_x_n  [kernel.vmlinux]  [k] up_read
>      1.02%  uname_x_n  [unknown]         [k] 0xfffffe00000e200c
>      0.99%  uname_x_n  [kernel.vmlinux]  [k] entry_SYSCALL_64
>      0.16%  uname_x_n  [kernel.vmlinux]  [k] flush_signal_handlers
>      0.01%  perf       [kernel.vmlinux]  [k] native_sched_clock
>      0.00%  perf       [kernel.vmlinux]  [k] native_write_msr
> 
> After the changes there are not:
> 
>  # Overhead  Command    Shared Object     Symbol
>  # ........  .........  ................  ..................................
>  #
>     41.91%  uname_x_n  [kernel.vmlinux]  [k] syscall_return_via_sysret
>     24.70%  uname_x_n  [kernel.vmlinux]  [k] entry_SYSCALL_64_trampoline
>     19.22%  uname_x_n  [kernel.vmlinux]  [k] copy_user_enhanced_fast_string
>      4.09%  uname_x_n  libc-2.19.so      [.] __GI___uname
>      3.08%  uname_x_n  [kernel.vmlinux]  [k] do_syscall_64
>      2.32%  uname_x_n  [kernel.vmlinux]  [k] down_read
>      2.27%  uname_x_n  ld-2.19.so        [.] _dl_start
>      1.25%  uname_x_n  [kernel.vmlinux]  [k] up_read
>      0.99%  uname_x_n  [kernel.vmlinux]  [k] entry_SYSCALL_64
>      0.16%  uname_x_n  [kernel.vmlinux]  [k] flush_signal_handlers
>      0.01%  perf       [kernel.vmlinux]  [k] native_sched_clock
>      0.00%  perf       [kernel.vmlinux]  [k] native_write_msr

So, with just the userspace patches applied, recording with the new tool
and then reporting with both the old and the new tools, I get:

Before:

[root@...enth c]# perf-4.17.rc6.ga048a0-torvalds.master report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 83  of event 'cycles:ppp'
# Event count (approx.): 86724689
#
# Overhead  Command    Shared Object     Symbol                            
# ........  .........  ................  ..................................
#
    35.12%  uname_x_n  [kernel.vmlinux]  [k] syscall_return_via_sysret
    20.86%  uname_x_n  [unknown]         [k] 0xfffffe000005e01b
    11.09%  uname_x_n  [kernel.vmlinux]  [k] copy_user_enhanced_fast_string
     8.58%  uname_x_n  [kernel.vmlinux]  [k] __x64_sys_newuname
     4.93%  uname_x_n  libc-2.26.so      [.] __GI___uname
     2.92%  uname_x_n  ld-2.26.so        [.] dl_main
     2.66%  uname_x_n  [kernel.vmlinux]  [k] __x86_indirect_thunk_rax
     2.46%  uname_x_n  [kernel.vmlinux]  [k] do_syscall_64
     2.18%  uname_x_n  [unknown]         [k] 0xfffffe000005e01e
     2.17%  uname_x_n  uname_x_n         [.] main
     2.14%  uname_x_n  [unknown]         [k] 0xfffffe000005e00c
     1.98%  uname_x_n  [unknown]         [k] 0xfffffe000005e025
     1.37%  uname_x_n  [kernel.vmlinux]  [k] down_read
     1.27%  uname_x_n  [kernel.vmlinux]  [k] entry_SYSCALL_64
     0.23%  uname_x_n  [kernel.vmlinux]  [k] get_random_u64
     0.01%  perf       [kernel.vmlinux]  [k] end_repeat_nmi
     0.00%  perf       [kernel.vmlinux]  [k] native_write_msr


#
# (Tip: Use --symfs <dir> if your symbol files are in non-standard locations)
#

After:

[root@...enth c]# perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 83  of event 'cycles:ppp'
# Event count (approx.): 86724689
#
# Overhead  Command    Shared Object     Symbol                            
# ........  .........  ................  ..................................
#
    35.12%  uname_x_n  [kernel.vmlinux]  [k] syscall_return_via_sysret
    27.18%  uname_x_n  [kernel.vmlinux]  [k] entry_SYSCALL_64_trampoline
    11.09%  uname_x_n  [kernel.vmlinux]  [k] copy_user_enhanced_fast_string
     8.58%  uname_x_n  [kernel.vmlinux]  [k] __x64_sys_newuname
     4.93%  uname_x_n  libc-2.26.so      [.] __GI___uname
     2.92%  uname_x_n  ld-2.26.so        [.] dl_main
     2.66%  uname_x_n  [kernel.vmlinux]  [k] __x86_indirect_thunk_rax
     2.46%  uname_x_n  [kernel.vmlinux]  [k] do_syscall_64
     2.17%  uname_x_n  uname_x_n         [.] main
     1.37%  uname_x_n  [kernel.vmlinux]  [k] down_read
     1.27%  uname_x_n  [kernel.vmlinux]  [k] entry_SYSCALL_64
     0.23%  uname_x_n  [kernel.vmlinux]  [k] get_random_u64
     0.01%  perf       [kernel.vmlinux]  [k] end_repeat_nmi
     0.00%  perf       [kernel.vmlinux]  [k] native_write_msr


#
# (Tip: Generate a script for your data: perf script -g <lang>)
#
[root@...enth c]# 
[root@...enth c]# 

What am I missing while testing this?

- Arnaldo
