Message-ID: <CAP-5=fXFoawZnRD22iSev5FQnx3oyFOhrPf=gZbk84qGtr9NFA@mail.gmail.com>
Date: Mon, 3 Mar 2025 23:04:59 -0800
From: Ian Rogers <irogers@...gle.com>
To: Ian Rogers <irogers@...gle.com>, Peter Zijlstra <peterz@...radead.org>, 
	Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>, 
	Mark Rutland <mark.rutland@....com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, Kan Liang <kan.liang@...ux.intel.com>, 
	John Garry <john.g.garry@...cle.com>, Will Deacon <will@...nel.org>, 
	James Clark <james.clark@...aro.org>, Mike Leach <mike.leach@...aro.org>, 
	Leo Yan <leo.yan@...ux.dev>, guoren <guoren@...nel.org>, 
	Paul Walmsley <paul.walmsley@...ive.com>, Palmer Dabbelt <palmer@...belt.com>, 
	Albert Ou <aou@...s.berkeley.edu>, Charlie Jenkins <charlie@...osinc.com>, 
	Bibo Mao <maobibo@...ngson.cn>, Huacai Chen <chenhuacai@...nel.org>, 
	Catalin Marinas <catalin.marinas@....com>, Jiri Slaby <jirislaby@...nel.org>, 
	Björn Töpel <bjorn@...osinc.com>, 
	Howard Chu <howardchu95@...il.com>, linux-kernel@...r.kernel.org, 
	linux-perf-users@...r.kernel.org, linux-arm-kernel@...ts.infradead.org, 
	"linux-csky@...r.kernel.org" <linux-csky@...r.kernel.org>, linux-riscv@...ts.infradead.org, 
	Arnd Bergmann <arnd@...db.de>
Subject: Re: [PATCH v4 00/11] perf: Support multiple system call tables in the build

On Mon, Mar 3, 2025 at 9:04 PM Ian Rogers <irogers@...gle.com> wrote:
>
> This work builds on the clean up of system call tables and removal of
> libaudit by Charlie Jenkins <charlie@...osinc.com>.
>
> The system call table in perf trace is used to map system call numbers
> to names and vice versa. Prior to these changes, a single table
> matching the perf binary's build was present. The table would be
> incorrect if tracing, say, a 32-bit binary from a 64-bit version of
> perf: the names and numbers wouldn't match.
>
> Change the build so that a single system call file is built and the
> potentially multiple tables are identifiable from the ELF machine type
> of the process being examined. To determine the ELF machine type, the
> executable's maps are searched and the associated DSOs' ELF headers are
> read. When this fails and when tracing live, /proc/pid/exe's ELF header
> is read. Fall back to the perf binary's own type when unknown.
>
> Remove some runtime types used by the system call tables and make
> equivalents generated at build time.
>
> v4: Add reading the e_machine from the thread's maps dsos, only read
>     from /proc/pid/exe on failure and when live as requested by
>     Namhyung. Add patches to add dso comments and remove dso_data
>     variables that are unused without libunwind.

This has allowed `perf trace record` (not just `perf trace`) to work
with binaries whose e_machine doesn't match that of the perf binary.
An example:

Before:
```
$ file ./a.out
a.out: ELF 32-bit LSB pie executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, BuildID[sha1]=3fcd28f85a27a3108941661a91dbc675c06868f9, for GNU/Linux 3.2.0, not stripped
$ perf trace record -- ./a.out
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.059 MB perf.data (60 samples) ]
$ perf trace -i perf.data
         ? (         ): a.out/914959  ... [continued]: munmap())                                    = 0
    0.019 ( 0.001 ms): a.out/914959 recvfrom(ubuf: 0x2, size: 4160602092, flags: DONTROUTE|CTRUNC|TRUNC|DONTWAIT|EOR|>
    0.034 ( 0.002 ms): a.out/914959 lgetxattr(name: 0x2000, value: 0x3, size: 34)                         = 4160352256
    0.043 ( 0.002 ms): a.out/914959 dup2(oldfd: -134405940, newfd: 4)                                     = -1 ENOENT>
    0.047 ( 0.009 ms): a.out/914959 preadv(fd: 4294967196, vec: 0xf7fce47f, vlen: 557056, pos_h: 4160602092) = 3
    0.058 ( 0.004 ms): a.out/914959 lgetxattr(name: 0x1b5c2, value: 0x1, size: 2)                         = 4160237568
    0.063 ( 0.000 ms): a.out/914959 lstat(filename: 0x3, statbuf: 0x1b5c2)                                = 0
    0.071 ( 0.006 ms): a.out/914959 preadv(fd: 4294967196, vec: 0xf7f9f3e0, vlen: 557056, pos_h: 4160602092) = 3
    0.078 ( 0.001 ms): a.out/914959 close(fd: 3)                                                          = 512
    0.082 ( 0.002 ms): a.out/914959 lgetxattr(name: 0x23f8d0, value: 0x1, size: 2050)                     = 4157878272
    0.084 ( 0.006 ms): a.out/914959 lgetxattr(pathname: 0xf7d66000, name: 0x18b000, value: 0x5, size: 2066) = 4158021>
    0.091 ( 0.002 ms): a.out/914959 lgetxattr(pathname: 0xf7ef1000, name: 0x85000, value: 0x1, size: 2066) = 41596395>
    0.093 ( 0.003 ms): a.out/914959 lgetxattr(pathname: 0xf7f76000, name: 0x3000, value: 0x3, size: 2066) = 4160184320
    0.099 ( 0.002 ms): a.out/914959 lgetxattr(pathname: 0xf7f79000, name: 0x98d0, value: 0x3, size: 50)   = 4160196608
    0.106 ( 0.000 ms): a.out/914959 lstat(filename: 0x3)                                                  = 0
    0.112 ( 0.001 ms): a.out/914959 mq_timedreceive(mqdes: 4287979520, u_msg_ptr: 0xf7f9fbb0, u_msg_prio: 0xf7fdbfec,>
    0.113 ( 0.000 ms): a.out/914959 mkdirat(dfd: -134609624, pathname: 0xf7fdc910, mode: IFSOCK|ISUID|IRUSR|IWGRP|0xf>
    0.114 ( 0.000 ms): a.out/914959 process_vm_writev(pid: -134609620, lvec: 0xc, liovcnt: 4160604432, rvec: 0xf7fa04>
    0.154 ( 0.003 ms): a.out/914959 capget(header: 4160184320, dataptr: 8192)                             = 0
    0.158 ( 0.002 ms): a.out/914959 capget(header: 1448792064, dataptr: 4096)                             = 0
    0.163 ( 0.002 ms): a.out/914959 capget(header: 4160593920, dataptr: 8192)                             = 0
    0.171 ( 0.001 ms): a.out/914959 getxattr(pathname: 0x3, name: 0xff955fe4, value: 0xf7f77e14, size: 1) = 0
    0.179 ( 0.005 ms): a.out/914959 fchmod(fd: -134729728, mode: IFLNK|IFIFO|ISGID|IRWXU|IWOTH|0x10000)   = 0
    0.193 ( 0.008 ms): a.out/914959 preadv(fd: 4294967196, vec: 0x565ac008, pos_h: 4160192020)            = 3
    0.202 ( 0.007 ms): a.out/914959 close(fd: 3)                                                          = 1436
    0.209 ( 0.017 ms): a.out/914959 stat(filename: 0x1, statbuf: 0xff9552fc)                              = 1436
    0.234 (1000.083 ms): a.out/914959 readlinkat(buf: 0xff955224, bufsiz: 4287975964)                       = 0
```
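
The wrong names in the "Before" output are what you get when i386
system call numbers are looked up in the x86-64 table. As a rough
illustration only (a Python sketch, not perf code: the table is a small
hand-picked excerpt of the two ABIs and `syscall_name` is a made-up
helper), keying the lookup on the ELF machine type as well as the
number resolves the mismatch:

```python
# ELF e_machine constants as defined in <elf.h>.
EM_386 = 3
EM_X86_64 = 62

# Excerpt of the i386 and x86-64 system call tables (number -> name).
# Only a handful of entries are listed; a real table has hundreds.
SYSCALL_NAMES = {
    EM_386: {
        3: "read", 4: "write", 6: "close", 11: "execve", 33: "access",
        45: "brk", 91: "munmap", 125: "mprotect", 295: "openat",
    },
    EM_X86_64: {
        3: "close", 4: "stat", 6: "lstat", 11: "munmap", 33: "dup2",
        45: "recvfrom", 91: "fchmod", 125: "capget", 295: "preadv",
    },
}


def syscall_name(e_machine, nr):
    """Map a syscall number to a name using the table for e_machine."""
    table = SYSCALL_NAMES.get(e_machine, {})
    return table.get(nr, f"syscall_{nr}")


if __name__ == "__main__":
    # The same numbers, as issued by a 32-bit process, decoded both ways.
    for nr in (11, 45, 33, 295, 3, 6):
        print(nr, syscall_name(EM_X86_64, nr), "->", syscall_name(EM_386, nr))
```

With the host (x86-64) table, number 45 decodes as recvfrom and 295 as
preadv; with the i386 table they decode as brk and openat.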

After:
```
$ file ./a.out
a.out: ELF 32-bit LSB pie executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, BuildID[sha1]=3fcd28f85a27a3108941661a91dbc675c06868f9, for GNU/Linux 3.2.0, not stripped
$ perf trace record -- ./a.out
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.059 MB perf.data (60 samples) ]
$ perf trace -i perf.data
        ? (         ): a.out/908002  ... [continued]: execve())                                   = 0
    0.019 ( 0.001 ms): a.out/908002 brk()                                                                 = 0x57680000
    0.041 ( 0.003 ms): a.out/908002 access(filename: 0xf7f0b0cc, mode: R)                                 = -1 ENOENT>
    0.046 ( 0.008 ms): a.out/908002 openat(dfd: CWD, filename: 0xf7f0747f, flags: RDONLY|CLOEXEC|LARGEFILE) = 3
    0.055 ( 0.001 ms): a.out/908002 statx(dfd: 3, filename: 0xf7f080f6, flags: NO_AUTOMOUNT|EMPTY_PATH, mask: TYPE|MO>
    0.061 ( 0.000 ms): a.out/908002 close(fd: 3)                                                          = 0
    0.070 ( 0.006 ms): a.out/908002 openat(dfd: CWD, filename: 0xf7ed83e0, flags: RDONLY|CLOEXEC|LARGEFILE) = 3
    0.077 ( 0.001 ms): a.out/908002 read(fd: 3, buf: 0xff80ea50, count: 512)                              = 512
    0.079 ( 0.001 ms): a.out/908002 statx(dfd: 3, filename: 0xf7f080f6, flags: NO_AUTOMOUNT|EMPTY_PATH, mask: TYPE|MO>
    0.104 ( 0.000 ms): a.out/908002 close(fd: 3)                                                          = 0
    0.112 ( 0.000 ms): a.out/908002 set_tid_address(tidptr: 0xf7ed9528)                                   = 908002 (a>
    0.113 ( 0.000 ms): a.out/908002 set_robust_list(head: 0xf7ed952c, len: 12)                            = 0 (swappe>
    0.114 ( 0.001 ms): a.out/908002 rseq(rseq: 0xf7ed9960, rseq_len: 32, sig: 1392848979)                 = 0 (swappe>
    0.153 ( 0.003 ms): a.out/908002 mprotect(start: 0xf7eaf000, len: 8192, prot: READ)                    = 0
    0.158 ( 0.002 ms): a.out/908002 mprotect(start: 0x565ef000, len: 4096, prot: READ)                    = 0
    0.163 ( 0.002 ms): a.out/908002 mprotect(start: 0xf7f13000, len: 8192, prot: READ)                    = 0
    0.177 ( 0.005 ms): a.out/908002 munmap(addr: 0xf7ebc000, len: 112066)                                 = 0
    0.189 ( 0.009 ms): a.out/908002 openat(dfd: CWD, filename: 0x565ee008)                                = 3
    0.198 ( 0.006 ms): a.out/908002 read(fd: 3, buf: 0xff80e56c, count: 4096)                             = 1436
    0.205 ( 0.017 ms): a.out/908002 write(fd: 1, buf: , count: 1436)                                      = 1436
    0.229 (1000.201 ms): a.out/908002 clock_nanosleep(rqtp: 0xff80e494, rmtp: 0xff80e48c)                   = 0
 1000.486 (         ): a.out/908002 exit_group()
```
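
For readers unfamiliar with e_machine: it is the 16-bit field at byte
offset 18 of the ELF header, at the same offset for ELF32 and ELF64.
The lookup order described in the cover letter (mapped DSOs first,
then /proc/pid/exe when live, then the host type) can be sketched
roughly as below. This is an illustrative Python sketch under my own
assumptions, not the code in the series, which is C under
tools/perf/util; the function names and the `map_paths` and `live`
parameters are hypothetical.

```python
import struct

EM_NONE = 0  # "no machine" value from <elf.h>, used here to mean unknown


def e_machine_from_header(header):
    """Return e_machine from the first 20 bytes of an ELF file, else EM_NONE."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return EM_NONE
    # e_ident[EI_DATA] (byte 5) gives the byte order: 1 = LSB, 2 = MSB.
    if header[5] == 1:
        fmt = "<H"
    elif header[5] == 2:
        fmt = ">H"
    else:
        return EM_NONE
    # e_ident is 16 bytes, e_type is 2 bytes, so e_machine is at offset 18.
    return struct.unpack_from(fmt, header, 18)[0]


def e_machine_for_pid(pid, map_paths, live, host_e_machine):
    """Try the mapped files, then /proc/<pid>/exe if live, then the host."""
    candidates = list(map_paths)
    if live:
        candidates.append(f"/proc/{pid}/exe")
    for path in candidates:
        try:
            with open(path, "rb") as f:
                e_machine = e_machine_from_header(f.read(20))
        except OSError:
            continue  # unreadable or missing file: try the next candidate
        if e_machine != EM_NONE:
            return e_machine
    return host_e_machine
```

The returned value would then select which system call table to use
when decoding that thread's events.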

Thanks,
Ian

> v3: Add Charlie's reviewed-by tags. Incorporate feedback from Arnd
>     Bergmann <arnd@...db.de> on additional optional column and MIPS
>     system call numbering. Rebase past Namhyung's global system call
>     statistics and add comments that they don't yet support an
>     e_machine other than EM_HOST.
>
> v2: Change the 1 element cache for the last table as suggested by
>     Howard Chu, add Howard's reviewed-by tags.
>     Add a comment and apology to Charlie for not doing better in
>     guiding:
>     https://lore.kernel.org/all/20250114-perf_syscall_arch_runtime-v1-1-5b304e408e11@rivosinc.com/
>     After discussion on v1, he agreed this patch series would be
>     the better direction.
>
> Ian Rogers (11):
>   perf dso: Move libunwind dso_data variables into ifdef
>   perf dso: kernel-doc for enum dso_binary_type
>   perf syscalltbl: Remove syscall_table.h
>   perf trace: Reorganize syscalls
>   perf syscalltbl: Remove struct syscalltbl
>   perf dso: Add support for reading the e_machine type for a dso
>   perf thread: Add support for reading the e_machine type for a thread
>   perf trace beauty: Add syscalltbl.sh generating all system call tables
>   perf syscalltbl: Use lookup table containing multiple architectures
>   perf build: Remove Makefile.syscalls
>   perf syscalltbl: Mask off ABI type for MIPS system calls
>
>  tools/perf/Makefile.perf                      |  10 +-
>  tools/perf/arch/alpha/entry/syscalls/Kbuild   |   2 -
>  .../alpha/entry/syscalls/Makefile.syscalls    |   5 -
>  tools/perf/arch/alpha/include/syscall_table.h |   2 -
>  tools/perf/arch/arc/entry/syscalls/Kbuild     |   2 -
>  .../arch/arc/entry/syscalls/Makefile.syscalls |   3 -
>  tools/perf/arch/arc/include/syscall_table.h   |   2 -
>  tools/perf/arch/arm/entry/syscalls/Kbuild     |   4 -
>  .../arch/arm/entry/syscalls/Makefile.syscalls |   2 -
>  tools/perf/arch/arm/include/syscall_table.h   |   2 -
>  tools/perf/arch/arm64/entry/syscalls/Kbuild   |   3 -
>  .../arm64/entry/syscalls/Makefile.syscalls    |   6 -
>  tools/perf/arch/arm64/include/syscall_table.h |   8 -
>  tools/perf/arch/csky/entry/syscalls/Kbuild    |   2 -
>  .../csky/entry/syscalls/Makefile.syscalls     |   3 -
>  tools/perf/arch/csky/include/syscall_table.h  |   2 -
>  .../perf/arch/loongarch/entry/syscalls/Kbuild |   2 -
>  .../entry/syscalls/Makefile.syscalls          |   3 -
>  .../arch/loongarch/include/syscall_table.h    |   2 -
>  tools/perf/arch/mips/entry/syscalls/Kbuild    |   2 -
>  .../mips/entry/syscalls/Makefile.syscalls     |   5 -
>  tools/perf/arch/mips/include/syscall_table.h  |   2 -
>  tools/perf/arch/parisc/entry/syscalls/Kbuild  |   3 -
>  .../parisc/entry/syscalls/Makefile.syscalls   |   6 -
>  .../perf/arch/parisc/include/syscall_table.h  |   8 -
>  tools/perf/arch/powerpc/entry/syscalls/Kbuild |   3 -
>  .../powerpc/entry/syscalls/Makefile.syscalls  |   6 -
>  .../perf/arch/powerpc/include/syscall_table.h |   8 -
>  tools/perf/arch/riscv/entry/syscalls/Kbuild   |   2 -
>  .../riscv/entry/syscalls/Makefile.syscalls    |   4 -
>  tools/perf/arch/riscv/include/syscall_table.h |   8 -
>  tools/perf/arch/s390/entry/syscalls/Kbuild    |   2 -
>  .../s390/entry/syscalls/Makefile.syscalls     |   5 -
>  tools/perf/arch/s390/include/syscall_table.h  |   2 -
>  tools/perf/arch/sh/entry/syscalls/Kbuild      |   2 -
>  .../arch/sh/entry/syscalls/Makefile.syscalls  |   4 -
>  tools/perf/arch/sh/include/syscall_table.h    |   2 -
>  tools/perf/arch/sparc/entry/syscalls/Kbuild   |   3 -
>  .../sparc/entry/syscalls/Makefile.syscalls    |   5 -
>  tools/perf/arch/sparc/include/syscall_table.h |   8 -
>  tools/perf/arch/x86/entry/syscalls/Kbuild     |   3 -
>  .../arch/x86/entry/syscalls/Makefile.syscalls |   6 -
>  tools/perf/arch/x86/include/syscall_table.h   |   8 -
>  tools/perf/arch/xtensa/entry/syscalls/Kbuild  |   2 -
>  .../xtensa/entry/syscalls/Makefile.syscalls   |   4 -
>  .../perf/arch/xtensa/include/syscall_table.h  |   2 -
>  tools/perf/builtin-trace.c                    | 290 +++++++++++-------
>  tools/perf/scripts/Makefile.syscalls          |  61 ----
>  tools/perf/scripts/syscalltbl.sh              |  86 ------
>  tools/perf/trace/beauty/syscalltbl.sh         | 274 +++++++++++++++++
>  tools/perf/util/dso.c                         |  54 ++++
>  tools/perf/util/dso.h                         |  56 ++++
>  tools/perf/util/syscalltbl.c                  | 148 ++++-----
>  tools/perf/util/syscalltbl.h                  |  22 +-
>  tools/perf/util/thread.c                      |  80 +++++
>  tools/perf/util/thread.h                      |  14 +-
>  56 files changed, 756 insertions(+), 509 deletions(-)
>  delete mode 100644 tools/perf/arch/alpha/entry/syscalls/Kbuild
>  delete mode 100644 tools/perf/arch/alpha/entry/syscalls/Makefile.syscalls
>  delete mode 100644 tools/perf/arch/alpha/include/syscall_table.h
>  delete mode 100644 tools/perf/arch/arc/entry/syscalls/Kbuild
>  delete mode 100644 tools/perf/arch/arc/entry/syscalls/Makefile.syscalls
>  delete mode 100644 tools/perf/arch/arc/include/syscall_table.h
>  delete mode 100644 tools/perf/arch/arm/entry/syscalls/Kbuild
>  delete mode 100644 tools/perf/arch/arm/entry/syscalls/Makefile.syscalls
>  delete mode 100644 tools/perf/arch/arm/include/syscall_table.h
>  delete mode 100644 tools/perf/arch/arm64/entry/syscalls/Kbuild
>  delete mode 100644 tools/perf/arch/arm64/entry/syscalls/Makefile.syscalls
>  delete mode 100644 tools/perf/arch/arm64/include/syscall_table.h
>  delete mode 100644 tools/perf/arch/csky/entry/syscalls/Kbuild
>  delete mode 100644 tools/perf/arch/csky/entry/syscalls/Makefile.syscalls
>  delete mode 100644 tools/perf/arch/csky/include/syscall_table.h
>  delete mode 100644 tools/perf/arch/loongarch/entry/syscalls/Kbuild
>  delete mode 100644 tools/perf/arch/loongarch/entry/syscalls/Makefile.syscalls
>  delete mode 100644 tools/perf/arch/loongarch/include/syscall_table.h
>  delete mode 100644 tools/perf/arch/mips/entry/syscalls/Kbuild
>  delete mode 100644 tools/perf/arch/mips/entry/syscalls/Makefile.syscalls
>  delete mode 100644 tools/perf/arch/mips/include/syscall_table.h
>  delete mode 100644 tools/perf/arch/parisc/entry/syscalls/Kbuild
>  delete mode 100644 tools/perf/arch/parisc/entry/syscalls/Makefile.syscalls
>  delete mode 100644 tools/perf/arch/parisc/include/syscall_table.h
>  delete mode 100644 tools/perf/arch/powerpc/entry/syscalls/Kbuild
>  delete mode 100644 tools/perf/arch/powerpc/entry/syscalls/Makefile.syscalls
>  delete mode 100644 tools/perf/arch/powerpc/include/syscall_table.h
>  delete mode 100644 tools/perf/arch/riscv/entry/syscalls/Kbuild
>  delete mode 100644 tools/perf/arch/riscv/entry/syscalls/Makefile.syscalls
>  delete mode 100644 tools/perf/arch/riscv/include/syscall_table.h
>  delete mode 100644 tools/perf/arch/s390/entry/syscalls/Kbuild
>  delete mode 100644 tools/perf/arch/s390/entry/syscalls/Makefile.syscalls
>  delete mode 100644 tools/perf/arch/s390/include/syscall_table.h
>  delete mode 100644 tools/perf/arch/sh/entry/syscalls/Kbuild
>  delete mode 100644 tools/perf/arch/sh/entry/syscalls/Makefile.syscalls
>  delete mode 100644 tools/perf/arch/sh/include/syscall_table.h
>  delete mode 100644 tools/perf/arch/sparc/entry/syscalls/Kbuild
>  delete mode 100644 tools/perf/arch/sparc/entry/syscalls/Makefile.syscalls
>  delete mode 100644 tools/perf/arch/sparc/include/syscall_table.h
>  delete mode 100644 tools/perf/arch/x86/entry/syscalls/Kbuild
>  delete mode 100644 tools/perf/arch/x86/entry/syscalls/Makefile.syscalls
>  delete mode 100644 tools/perf/arch/x86/include/syscall_table.h
>  delete mode 100644 tools/perf/arch/xtensa/entry/syscalls/Kbuild
>  delete mode 100644 tools/perf/arch/xtensa/entry/syscalls/Makefile.syscalls
>  delete mode 100644 tools/perf/arch/xtensa/include/syscall_table.h
>  delete mode 100644 tools/perf/scripts/Makefile.syscalls
>  delete mode 100755 tools/perf/scripts/syscalltbl.sh
>  create mode 100755 tools/perf/trace/beauty/syscalltbl.sh
>
> --
> 2.48.1.711.g2feabab25a-goog
>
